In a previous lesson, we explained how to create a sample Python application that executes a few basic CQL statements with a ScyllaDB cluster using the Python driver. However, that code was too static and not scalable. In this lesson, we’ll see how to optimize the existing Python code with prepared statements. This will allow us to build better performing applications that are more scalable.
What Are Prepared Statements?
A prepared statement is a query that is parsed by ScyllaDB and then saved for later use. One of the valuable benefits is that you can continue to reuse that query and modify variables in the query to match variables such as names, addresses, and locations. Let’s dive a little deeper to see how it works.
When asked to prepare a CQL statement, a client library will send a CQL statement to ScyllaDB. ScyllaDB will then create a unique fingerprint for that CQL statement by MD5 hashing the CQL statement. ScyllaDB will use this hash to check its query cache to see if it has already cached that CQL statement. If ScyllaDB had seen that CQL statement, it will send back a reference to that cached CQL statement. If ScyllaDB does not have that unique query hash in its cache, it will then proceed to parse the query and insert the parsed output into its cache.
The client will then be able to send an execute request specifying the statement id and providing the (bound) variables, as we will see next.
Now, let’s go over a Python application that uses prepared statements.
The Python Code
We will use the file prepared_statement_app.py, which you can find in the directory: scylla-code-samples/mms/python/python-app-ps
We will first go over each section of the code before running it.
To get started, we will create two prepared statements in the App constructor:
The first prepared statement is named insert_ps. This statement adds data programmatically for first_name, last_name, address, and picture_location based on input from the application. The second prepared statement is called delete_ps, and it deletes entries in the table, based on input for first_name and last_name. These statements are reused later on to add and delete data in the mutant_data table.
The add_mutant function is defined as follows:
This method takes input for first_name, last_name, address, and picture_location and then executes the prepared statement named insert_ps using these parameters. By using prepared statements, we can rerun this method to add data to the catalog table. This significantly improves performance.
The delete_mutant function is defined as follows:
In this method, we take as inputs first_name and last_name and use them as parameters for the delete_ps prepared statement. Again, by reusing the prepared statement, we improve performance.
Finally, the main function passes input to the functions when the application starts. It is defined as follows:
First, the contents of the catalog table are displayed. Next, we add two additional mutants using the add_mutant function. We then show the contents of the table again to see that the mutants were indeed added. Finally, each mutant that was added is deleted, and the contents of the table are shown once more.
After understanding the code, let’s run the sample application in the Docker container we previously set up.
Conclusion
In this lesson, we explained prepared statements and how they work with the Python driver to run applications more efficiently.
Stay tuned for more content!