Project Alternator Basics – Lab

12 min to complete

Alternator is an open-source project that gives ScyllaDB compatibility with Amazon DynamoDB.

In this lesson, we’ll start by introducing the project. Afterward, we’ll see a hands-on example of creating a one-node ScyllaDB cluster and performing some basic operations on it. 

You’ll start ScyllaDB by running a single instance using Docker.

Please ensure that your environment meets the following prerequisites:

  1. Docker for Linux, Mac, or Windows. Please note that running ScyllaDB in Docker is recommended only for evaluating and trying out ScyllaDB. For best performance, a regular OS install is recommended.
  2. 3 GB of RAM or more for Docker.

Note: In addition to the instructions provided here, which allow you to run the lab on a machine with Docker, you can find this lab in the Killercoda learning environment here. The Killercoda environment provides an interactive virtual machine where you can execute all the commands directly from your browser without the need to configure anything.

The goal of this project is to deliver an open-source alternative to DynamoDB, deployable wherever a user wants: on-premises, on other public clouds like Microsoft Azure or Google Cloud Platform, or still on AWS (for users who wish to take advantage of other aspects of Amazon’s market-leading cloud ecosystem, such as the high-density i3en instances). DynamoDB users can keep their client code unchanged. Alternator is written in C++ and is part of ScyllaDB.

You can read more about it in this blog post and in the documentation.

The three main benefits ScyllaDB Alternator provides to DynamoDB users are:

  1. Cost: DynamoDB charges for read and write transactions (RCUs and WCUs). A free, open-source solution doesn’t.
  2. Performance: ScyllaDB was implemented in modern C++. It supports advanced features that enable it to improve latency and throughput significantly.
  3. Openness: ScyllaDB is open-source. It can run on any suitable server cluster regardless of location or deployment method. 

Setting up a ScyllaDB Cluster 

Before starting the cluster, make sure the aio-max-nr value is high enough (1048576 or more). 

This parameter sets the maximum number of concurrent asynchronous I/O (AIO) requests that the Linux kernel allows. A high value helps ScyllaDB perform well under heavy I/O workloads.

Check the value: 

cat /proc/sys/fs/aio-max-nr

If it needs to be changed:

echo "fs.aio-max-nr = 1048576" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p /etc/sysctl.conf
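The same check can be scripted. The following is a minimal sketch, assuming a Linux host where /proc/sys/fs/aio-max-nr exists; the helper names are hypothetical and are not part of the lab's repository.

```python
from pathlib import Path
from typing import Optional

REQUIRED_AIO_MAX_NR = 1048576  # minimum value named in this lesson
AIO_MAX_NR_PATH = Path("/proc/sys/fs/aio-max-nr")  # exists on Linux only


def read_aio_max_nr(path: Path = AIO_MAX_NR_PATH) -> Optional[int]:
    """Return the current aio-max-nr value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing (for example on macOS or Windows) or not a number.
        return None


def is_high_enough(value: Optional[int], required: int = REQUIRED_AIO_MAX_NR) -> bool:
    """True only when the value is known and meets the required minimum."""
    return value is not None and value >= required
```

Calling is_high_enough(read_aio_max_nr()) returns False when the value is too low or when the file cannot be read, in which case the value should be checked and raised manually.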

If you haven’t done so yet, download the example from git:

git clone https://github.com/scylladb/scylla-code-samples.git

Go to the directory of the alternator example:

cd scylla-code-samples/alternator/getting-started

Next, we’ll start a one-node cluster with Alternator enabled. 

By default, ScyllaDB does not listen for DynamoDB API requests. To enable them, we set the alternator-port configuration option to the port on which ScyllaDB should listen for DynamoDB API requests (8000 in this example).

docker run --name some-scylla --hostname some-scylla -p 8000:8000 -d scylladb/scylla:5.2.0 --smp 1 --memory=750M --overprovisioned 1 --alternator-port=8000 --alternator-write-isolation=always

Wait a few seconds and make sure the cluster is up and running:

docker exec -it some-scylla nodetool status

In this example, we will use Python to interact with ScyllaDB through the Boto 3 SDK for Python. It’s also possible to use the CLI or other languages such as Java, C#, Perl, PHP, Ruby, Erlang, and JavaScript.

Next, if you don’t already have it set up, install the boto3 Python library, which includes a client for DynamoDB:

sudo pip install --upgrade boto3

In the three scripts, create.py, read.py, and write.py, change the value of “endpoint_url” to the node’s IP address.

Create a Table

We’ll use the create.py script to create a table in our newly created cluster, using Alternator.

Authorization is not in the scope of this lesson, so we’ll use ‘None’ and revisit this in a future lesson. 
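A connection of the kind these scripts make could look like the sketch below. It is an illustration under assumptions, not the contents of the lab's scripts: the helper names are hypothetical, and passing the placeholder string 'None' for the region and both credential fields is an assumption about how ‘None’ is used here.

```python
def alternator_resource_kwargs(host: str, port: int = 8000) -> dict:
    """Build keyword arguments for boto3.resource('dynamodb', ...)."""
    return {
        # Point the SDK at the ScyllaDB node instead of the AWS endpoint.
        "endpoint_url": f"http://{host}:{port}",
        # boto3 needs a region and credentials to sign requests. Authorization
        # is out of scope here, so placeholder strings are used (assumption).
        "region_name": "None",
        "aws_access_key_id": "None",
        "aws_secret_access_key": "None",
    }


def connect(host: str, port: int = 8000):
    """Return a boto3 DynamoDB resource that talks to Alternator."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.resource("dynamodb", **alternator_resource_kwargs(host, port))
```

For example, connect("172.17.0.2") would target a node at that address on port 8000; the address is a placeholder and must be replaced with your node's IP.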

We define a table called ‘mutant_data’ with the required properties, such as the primary key “last_name,” which is of the String data type. You can read about Boto 3 data types here.
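As a rough sketch of such a table definition, the parameters could be built as follows. The function names are hypothetical, and the BillingMode value is an assumption (the DynamoDB API expects either a billing mode or provisioned throughput); the actual create.py may differ.

```python
def mutant_data_table_spec() -> dict:
    """Parameters for a CreateTable request defining 'mutant_data'."""
    return {
        "TableName": "mutant_data",
        "BillingMode": "PAY_PER_REQUEST",  # assumption, not taken from the lesson
        # last_name is the partition ("hash") key.
        "KeySchema": [{"AttributeName": "last_name", "KeyType": "HASH"}],
        # Only key attributes are declared; "S" is the String type.
        "AttributeDefinitions": [
            {"AttributeName": "last_name", "AttributeType": "S"}
        ],
    }


def create_mutant_data_table(dynamodb):
    """Create the table; dynamodb is a boto3 DynamoDB service resource."""
    return dynamodb.create_table(**mutant_data_table_spec())
```

Only the key attribute appears in AttributeDefinitions; all other attributes are supplied when items are written.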

The DynamoDB data model is similar to ScyllaDB’s. Both databases have a partition key (also called “hash key” in DynamoDB) and an optional clustering key (called “sort key” or “range key” in DynamoDB), and the same notions of rows (which DynamoDB calls “items”) inside partitions. There are some differences in the data model. One of them is that in DynamoDB, columns (called “attributes”), other than the hash key and sort key, can be of any type and can be different in each row. That means they don’t have to be defined in advance. You can learn more about data modeling in Alternator in more advanced lessons. 
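To make the schema flexibility concrete, here is a small sketch with two hypothetical items. Both carry the key attribute, but their other attributes differ in name and type; the item values and helper names are made up for illustration.

```python
KEY_ATTRIBUTE = "last_name"  # the table's partition (hash) key

# Hypothetical items: the non-key attributes are not declared anywhere
# in advance and differ from one item to the next.
items = [
    {"last_name": "Loblaw", "first_name": "Bob", "address": "1313 Mockingbird Lane"},
    {"last_name": "Jeffries", "first_name": "Jim", "age": 34, "powers": ["flight"]},
]


def has_key(item: dict, key: str = KEY_ATTRIBUTE) -> bool:
    """The only requirement on an item: it carries the key as a string."""
    return isinstance(item.get(key), str)


def non_key_attributes(item: dict, key: str = KEY_ATTRIBUTE) -> list:
    """Names of the attributes that were never defined in a schema."""
    return sorted(name for name in item if name != key)
```

Here non_key_attributes(items[0]) and non_key_attributes(items[1]) return different lists, yet both items are valid for the same table.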

In this simple example, we use a one-node ScyllaDB cluster. In a production environment, it’s recommended to run a cluster of at least three nodes. 

Also, in this example, we’ll send the queries directly to our single node. In a production environment, you should use a mechanism to distribute different DynamoDB requests to different ScyllaDB nodes, to balance the load. More about that in future lessons. 

Run the script: 

python create.py

Each Alternator table is stored in its own keyspace, which ScyllaDB creates automatically. Table xyz will be in keyspace alternator_xyz. This keyspace is initialized when the first Alternator table is created (with a CreateTable request). The replication factor (RF) for this keyspace and all Alternator tables is chosen at that point, depending on the size of the cluster: RF=3 is used on clusters with three or more live nodes, and RF=1 is used if the cluster is smaller, as in our case. Using a ScyllaDB cluster of fewer than three nodes is not recommended for production.
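The naming and replication rules described above can be written down as a short sketch. The function names are hypothetical; the code only restates the rules from the text and does not query ScyllaDB.

```python
def alternator_keyspace(table_name: str) -> str:
    """Keyspace that holds the given Alternator table."""
    return f"alternator_{table_name}"


def chosen_replication_factor(live_nodes: int) -> int:
    """RF picked at table creation: 3 with three or more live nodes, else 1."""
    if live_nodes < 1:
        raise ValueError("a cluster needs at least one live node")
    return 3 if live_nodes >= 3 else 1
```

For the one-node cluster in this lab, the table mutant_data lands in keyspace alternator_mutant_data with RF=1.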

Performing Basic Queries

Next, we will write and read some data from the newly created table. 

In this script, we use the batch_write_item operation to write data to the table “mutant_data.” This allows us to write multiple items in one operation. Here we write two items using a PutRequest, a request to perform the PutItem operation on an item. 
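A sketch of how such a batch write request can be assembled is shown below. The item values and function names are hypothetical, and the actual write.py may be organized differently.

```python
def batch_write_payload(table_name: str, items: list) -> dict:
    """Build the RequestItems argument: one PutRequest per item."""
    return {table_name: [{"PutRequest": {"Item": item}} for item in items]}


def write_items(dynamodb, table_name: str, items: list):
    """Send the batch; dynamodb is a boto3 DynamoDB service resource."""
    return dynamodb.batch_write_item(
        RequestItems=batch_write_payload(table_name, items)
    )


# Hypothetical sample data with the key attribute "last_name".
sample_items = [
    {"last_name": "Loblaw", "first_name": "Bob", "address": "1313 Mockingbird Lane"},
    {"last_name": "Jeffries", "first_name": "Jim", "address": "1211 Hollywood Lane"},
]
```

Calling write_items(dynamodb, "mutant_data", sample_items) would send both items in a single BatchWriteItem request.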

Notice that in DynamoDB, unlike in ScyllaDB (and Cassandra, for that matter), writes do not have a configurable consistency level. They use CL=QUORUM.

Execute the script to write the two items to the table:

python write.py

Next, we’ll read the data we just wrote, again using a batch operation, batch_get_item. 

The response is a dictionary with the result, the two entries we previously wrote. 
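The read side can be sketched in the same way. The function names and key values are hypothetical, and the sketch assumes the response is shaped as the DynamoDB BatchGetItem API describes, with results grouped per table under "Responses".

```python
def batch_get_payload(table_name: str, last_names: list) -> dict:
    """Build the RequestItems argument: one key per item to fetch."""
    return {table_name: {"Keys": [{"last_name": name} for name in last_names]}}


def items_from_response(response: dict, table_name: str) -> list:
    """Pull the returned items for one table out of the response dictionary."""
    return response.get("Responses", {}).get(table_name, [])


def read_items(dynamodb, table_name: str, last_names: list) -> list:
    """Fetch items by key; dynamodb is a boto3 DynamoDB service resource."""
    response = dynamodb.batch_get_item(
        RequestItems=batch_get_payload(table_name, last_names)
    )
    return items_from_response(response, table_name)
```

A batch read does not guarantee that items come back in the order the keys were requested, so code that consumes the result should look items up by key.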

Execute the read to see the results:

python read.py

DynamoDB supports two consistency levels for reads, “eventual consistency” and “strong consistency.” You can learn more about ScyllaDB consistency levels here and here. Under the hood, ScyllaDB implements strongly consistent reads with LOCAL_QUORUM, while eventually consistent reads are performed with LOCAL_ONE.
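In the DynamoDB API, a read asks for strong consistency through the ConsistentRead parameter, which defaults to eventual consistency. The sketch below restates the mapping described above and builds the parameters for a single-item read; the function names are hypothetical.

```python
def scylla_read_cl(consistent_read: bool = False) -> str:
    """Consistency level ScyllaDB uses for a DynamoDB API read."""
    return "LOCAL_QUORUM" if consistent_read else "LOCAL_ONE"


def get_item_params(last_name: str, consistent_read: bool = False) -> dict:
    """Parameters for a GetItem call on a table keyed by last_name."""
    return {
        "Key": {"last_name": last_name},
        # True requests a strongly consistent read; False an eventual one.
        "ConsistentRead": consistent_read,
    }
```

With a boto3 Table object, table.get_item(**get_item_params("Loblaw", consistent_read=True)) would request a strongly consistent read of that item.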

Conclusion

In this lesson, we learned the basics of Alternator, ScyllaDB’s open-source DynamoDB-compatible API. We saw how to create a cluster, connect to it, write data, and read data. Future lessons will cover more advanced topics and more interesting examples, including data modeling, backup and restore, single region vs. multi-region, streams (CDC), encryption at rest, and more.

 
