CPP Driver – Part 1

What Is the C/C++ Driver and How to Get It

The C/C++ driver, sometimes referred to as a “connector” or “client library”, is a piece of software that allows your C/C++ applications to talk to ScyllaDB clusters.

In this lesson, we’ll go over the driver, which is compatible with both Scylla and Cassandra, and see an example of how to use it to connect to a Scylla cluster and perform basic operations.

The driver, although written in C++, exposes a C interface via functions and structures wrapped in extern "C". The C language is not only fast, mature, and available on most platforms; it can also be called from other programming languages via C bindings. This means that once you master the driver’s API, you will be able to harness Scylla’s high performance from a language of your choice, on virtually any OS.
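To make the idea of a C interface over a C++ implementation concrete, here is a minimal sketch of the general pattern. It is not taken from the driver's source: the names Greeter, GreeterHandle, greeter_new, greeter_name_length, and greeter_free are all hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical C++ implementation detail; C callers never see this class.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    std::size_t name_length() const { return name_.size(); }
private:
    std::string name_;
};

// Everything inside this block gets C linkage (no C++ name mangling),
// so it can be called from C or from any language with C bindings.
extern "C" {

// Opaque handle: callers only ever hold a pointer to it.
typedef struct GreeterHandle GreeterHandle;

GreeterHandle* greeter_new(const char* name) {
    assert(name != nullptr);
    return reinterpret_cast<GreeterHandle*>(new Greeter(name));
}

std::size_t greeter_name_length(const GreeterHandle* handle) {
    return reinterpret_cast<const Greeter*>(handle)->name_length();
}

// Every *_new call must be paired with a *_free call by the caller.
void greeter_free(GreeterHandle* handle) {
    delete reinterpret_cast<Greeter*>(handle);
}

}  // extern "C"
```

A caller would create a handle with greeter_new("scylla"), pass it to greeter_name_length, and release it with greeter_free. The driver's own objects are used in the same create, use, free style.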

The C++ driver can either be installed system-wide from packages (*.deb, *.rpm) or built from the source code. In this lesson, we will briefly go through both possibilities. The code samples are marked as C++, but the only C++-specific functionality used is <iostream>, so you should have no problem “porting” them to C.

Note: at the time of writing, the latest release of the cpp-driver is 2.15. We’ll use this version throughout this lesson.

Setting Up the Scylla Cluster

In the setup, we will create a three-node Scylla cluster with one keyspace and one table. We’ll then populate it with some data. Follow this procedure to remove previous clusters and set up a new Scylla cluster.

The first task is to create the keyspace:

docker exec -it mms_scylla-node1_1 cqlsh
CREATE KEYSPACE ks WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy','DC1' : 3};

Now that the keyspace is created, it is time to create the table.

use ks;

CREATE TABLE IF NOT EXISTS ks.mutant_data (
   first_name text,
   last_name text,
   address text,
   picture_location text,
   PRIMARY KEY((first_name, last_name)));

Now let’s populate the table with some data using the following statements:

INSERT INTO ks.mutant_data ("first_name","last_name","address","picture_location") VALUES ('Bob','Loblaw','1313 Mockingbird Lane', 'http://www.facebook.com/bobloblaw');
INSERT INTO ks.mutant_data ("first_name","last_name","address","picture_location") VALUES ('Bob','Zemuda','1202 Coffman Lane', 'http://www.facebook.com/bzemuda');
INSERT INTO ks.mutant_data ("first_name","last_name","address","picture_location") VALUES ('Jim','Jeffries','1211 Hollywood Lane', 'http://www.facebook.com/jeffries');

Exit cqlsh:

exit

Installing the C/C++ Driver 

First Option: Installing from Packages (Linux)

This is the easiest method. It requires some knowledge of package management on your system. To install the latest version of the driver along with the dependencies, follow this link. To install the version we use in this lesson (2.15), use this link.
For example, on Fedora 30:

# Example: installing C/C++ driver 2.15.0 on Fedora

wget https://downloads.datastax.com/cpp-driver/centos/7/cassandra/v2.15.0/cassandra-cpp-driver-2.15.0-1.el7.x86_64.rpm
wget https://downloads.datastax.com/cpp-driver/centos/7/cassandra/v2.15.0/cassandra-cpp-driver-devel-2.15.0-1.el7.x86_64.rpm
sudo yum --nogpgcheck localinstall cassandra-cpp-driver-2.15.0-1.el7.x86_64.rpm cassandra-cpp-driver-devel-2.15.0-1.el7.x86_64.rpm

Installation on Ubuntu 20.04 is even easier, as all the dependencies are already there by default. We’ll silence an error about the missing multiarch-support by adding the --force-all option to dpkg (which is generally not recommended):

# Example: installing C/C++ driver 2.15.0 on Ubuntu 20.04

wget https://downloads.datastax.com/cpp-driver/ubuntu/18.04/cassandra/v2.15.0/cassandra-cpp-driver_2.15.0-1_amd64.deb
wget https://downloads.datastax.com/cpp-driver/ubuntu/18.04/cassandra/v2.15.0/cassandra-cpp-driver-dev_2.15.0-1_amd64.deb
sudo dpkg --force-all -i ./cassandra-cpp-driver_2.15.0-1_amd64.deb ./cassandra-cpp-driver-dev_2.15.0-1_amd64.deb

Second Option: Installing from Source Code (Linux/UNIX)

This method requires a bit more effort but allows for fine-tuning, for example, using new features and bug fixes not included in the latest releases. It’s useful if you want to experiment with multiple versions of the driver or when you don’t have permission to install custom packages. Another use case is if you’re familiar with C++ toolchains but not with yum/apt, rpm/dpkg, dependencies, repositories, etc.
The process is described in depth here, so we will not repeat all of it. Once you have compiled the driver (let’s assume that <SOMEPATH>/cpp-driver/build/libcassandra.so was created), download this example and build it. When invoking gcc, you will have to specify:

1) location of libcassandra.so

2) location of cpp-driver’s header files

3) where to search for libcassandra.so when the binary is run

# Compiling sample program (simple.c) and linking it with a custom build of C/C++ driver

gcc simple.c <SOMEPATH>/cpp-driver/build/libcassandra.so -Wl,-rpath,<SOMEPATH>/cpp-driver/build/ -I <SOMEPATH>/cpp-driver/include/ -o simple

The command should create a binary, which you can try to run after spinning up Scylla first. If the compilation succeeds but you are unable to run the binary, this is likely because the binary cannot find libcassandra.so at runtime. In that case, check that -rpath points to the right location, or set the environment variable LD_LIBRARY_PATH to point to that location.

Note for Windows hackers: C/C++ driver “packaged” for Windows is just a ZIP archive with compiled libraries and the header file. We will not go into details, but it’s enough to install the dependencies, drop the contents of the archive into your project and point Visual Studio to the library/include paths.

Connecting to the Cluster

Let’s go over the code used to connect to the cluster we created. You can find the code in /scylla-code-samples/cpp/part1/connect.cpp.

As you can see, the C/C++ driver’s API often returns a pointer to CassFuture, which, roughly speaking, represents “a piece of information that doesn’t exist yet.” If you are not familiar with the concept of futures, it’s a very powerful programming model that allows for efficient use of CPU, especially when I/O operations are involved.
However, to keep things simple, throughout this lesson we will use futures only in a blocking manner; that is, we will give up futures’ strongest point.
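As an analogy for the blocking style described above, the sketch below uses std::future from the C++ standard library rather than the driver's CassFuture. The functions slow_answer and fetch_blocking are hypothetical stand-ins; with the driver, the blocking step is typically a call such as cass_future_wait or cass_future_error_code, which waits until the future is set.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a slow I/O operation such as a network round trip.
int slow_answer() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 42;
}

// Starts the operation, then immediately blocks until its result exists.
// This is the simple style used in this lesson: easy to read, but the
// calling thread does no other work while it waits.
int fetch_blocking() {
    std::future<int> pending = std::async(std::launch::async, slow_answer);
    assert(pending.valid());
    return pending.get();  // blocks until slow_answer() has returned
}
```

A non-blocking design would instead keep the future around, do other work, and collect the result later. Depending on the toolchain, building code that uses std::async may require the -pthread flag.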

Edit the file connect.cpp and change the IP according to the setup of your cluster. Now compile and run the code:

g++ connect.cpp -lcassandra -o connect
./connect

The execution may produce some output on your console due to the negotiation of the protocol version. Unless the connection failed, it’s nothing to worry about.

It’s easy to forget to free the allocated objects or to leak them if an exception is thrown. In C++, it’s recommended that you automate their deletion with some implementation of a “scope guard,” e.g., BOOST_SCOPE_EXIT or a unique_ptr with a custom deleter. You can see this in /scylla-code-samples/cpp/part1/connect_unique.cpp.
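The following is a minimal sketch of the unique_ptr-with-custom-deleter idea, not the contents of connect_unique.cpp. To stay self-contained it uses a hypothetical C-style handle (Handle, handle_new, handle_free, and the live_handles counter are all invented for illustration). With the driver, the deleter would instead call the matching free function, for example cass_cluster_free, cass_session_free, or cass_future_free.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style resource standing in for a driver object.
struct Handle { int id; };

int live_handles = 0;  // counts handles that have not been freed yet

Handle* handle_new(int id) { ++live_handles; return new Handle{id}; }
void handle_free(Handle* h) { --live_handles; delete h; }

// Custom deleter: tells unique_ptr to release the handle via handle_free
// instead of calling delete on it directly.
struct HandleDeleter {
    void operator()(Handle* h) const { handle_free(h); }
};

using HandlePtr = std::unique_ptr<Handle, HandleDeleter>;

int use_handle() {
    HandlePtr handle(handle_new(7));
    assert(handle != nullptr);
    return handle->id;
}  // handle_free runs here, including on early return or a thrown exception
```

After use_handle() returns, live_handles is back to zero without any explicit call to handle_free in the function body.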

Querying

In the example above, you might have noticed a “session” object. This will be the central object in our everyday work with the C/C++ driver. Under the hood, CassSession maintains per-node connections and a tunable pool of I/O threads to query according to the load-balancing policy. Because CassSession is thread-safe, it is generally recommended that you create one session per keyspace and share it among your application threads.

Let’s see how to read the mutant dataset we previously wrote. You can find this code in /scylla-code-samples/cpp/part1/query.cpp.

As you can see, CassResult consists of CassRows, and each CassRow consists of a number of CassValues. Rows and Values are just “views” into the cells of Result, and therefore their destruction is handled alongside the destruction of Result. In other words, we don’t have to free them in our code.

Compile and run the code (again, make sure you change the IP according to the setup of your cluster):

g++ query.cpp -lcassandra -o query
./query

In the API reference, you will find all the functions that retrieve other data types from CassValue, such as int, float, uuid, collections, UDTs, etc. We can also run non-select queries in a blocking manner from C/C++: add tables, insert or delete data, create users, alter keyspaces, drop indexes, and so on.

Iterators

Our query “SELECT […] FROM ks.mutant_data;” should have returned three rows, and we read only the first one. To iterate through all the rows, we will need, you guessed it, an iterator. Instead of calling cass_result_first_row(result) and retrieving a cell from it, we will traverse all the rows and access the first_name column in each of them. This code is from /scylla-code-samples/cpp/part1/iterator.cpp.

Compile and run the code (make sure you change the IP according to the setup of your cluster):

g++ iterator.cpp -lcassandra -o iterator
./iterator
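The driver's iterators follow an "advance, then read" protocol that differs from standard C++ iterators. The sketch below shows only the shape of that loop and is not iterator.cpp: FakeResult, FakeIterator, iterator_from_result, iterator_next, iterator_get, and collect_first_names are hypothetical stand-ins. In real driver code, the corresponding calls are cass_iterator_from_result, cass_iterator_next, cass_iterator_get_row, and finally cass_iterator_free.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a result set; not a driver type.
struct FakeResult {
    std::vector<std::string> first_names;
};

// Hypothetical stand-in for an iterator; it starts *before* the first row.
struct FakeIterator {
    const FakeResult* result;
    std::size_t rows_visited;  // number of rows stepped onto so far
};

FakeIterator iterator_from_result(const FakeResult& result) {
    return FakeIterator{&result, 0};
}

// Moves to the next row; returns false once no rows are left.
bool iterator_next(FakeIterator& it) {
    if (it.rows_visited >= it.result->first_names.size()) return false;
    ++it.rows_visited;
    return true;
}

// Reads the current row; only valid after iterator_next() returned true.
const std::string& iterator_get(const FakeIterator& it) {
    assert(it.rows_visited > 0);
    return it.result->first_names[it.rows_visited - 1];
}

// The loop shape: call next() first, then read the current row.
std::vector<std::string> collect_first_names(const FakeResult& result) {
    std::vector<std::string> names;
    FakeIterator it = iterator_from_result(result);
    while (iterator_next(it)) {
        names.push_back(iterator_get(it));
    }
    return names;
}
```

Because the iterator starts before the first row, the same while loop handles an empty result without a special case.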

Summary

In this lesson, we learned how to install the C++ driver and, through an example, use it to interact with a Scylla cluster.

The driver exposes a C interface. It’s fast, mature, and available on most platforms. It can be used from other programming languages using C bindings, which means that you can use it with a language of your choice, on virtually any OS.
