This lesson goes over how the ScyllaDB Alternator project, the Open Source Amazon DynamoDB-compatible API, works under the hood. It covers differences between ScyllaDB and DynamoDB, Data Modeling, implementation details, how to migrate from DynamoDB to ScyllaDB Alternator, and a summary.
Ok so now let’s spend a few
minutes just on how Alternator works, what it does. So I want to talk about how it works
and where it still differs from DynamoDB and we still need to fix that. Also,
if you get the slides, you can see a link here: we have a much more detailed document
about the design of Alternator and all the different things we needed to do and how it differed
from ScyllaDB. Ok, a bit about the DynamoDB API: the DynamoDB API is basically JSON requests
and responses over HTTP or HTTPS, so for example a CreateTable has this header that says it’s a
CreateTable and then all sorts of parameters like the name of the table, the keys and
things like that, and you have response and all the DynamoDB APIs look like that.
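As a rough illustration of the request shape just described, the Python sketch below builds (but does not send) the headers and JSON body of a CreateTable request. The `X-Amz-Target` header and the JSON field names follow the public DynamoDB API; the table name and key names are placeholders made up for this example, and request signing, which a real client library would add, is left out.

```python
import json

def build_create_table_request(table_name, hash_key, sort_key):
    """Return (headers, body_bytes) for an unsigned CreateTable call (illustrative only)."""
    body = {
        "TableName": table_name,
        # The key schema is the only part of a table that has a fixed schema.
        "KeySchema": [
            {"AttributeName": hash_key, "KeyType": "HASH"},
            {"AttributeName": sort_key, "KeyType": "RANGE"},
        ],
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"},
            {"AttributeName": sort_key, "AttributeType": "N"},
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }
    headers = {
        # This header is what says "this is a CreateTable".
        "X-Amz-Target": "DynamoDB_20120810.CreateTable",
        "Content-Type": "application/x-amz-json-1.0",
    }
    return headers, json.dumps(body).encode("utf-8")

# Hypothetical table and key names, chosen only for the example.
headers, payload = build_create_table_request("events", "device_id", "ts")
print(headers["X-Amz-Target"])
print(payload.decode("utf-8"))
```

The other operations use the same pattern: a different operation name in the header and a different JSON body sent with an HTTP POST.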
As I said before, Alternator is not a proxy layer and it’s not a separate cluster. It’s part of the ScyllaDB cluster,
so you don’t need to size it separately or anything like that,
and each of the ScyllaDB nodes also answers DynamoDB API requests. The DynamoDB API is implemented
using calls to internal functions in our code and our internal RPC. It’s not a translation into CQL,
which is both important for efficiency and also will allow us to do things which perhaps
don’t have an immediate translation into CQL. Because we have a lot of different
ScyllaDB nodes, the client actually needs to send the request to one of them so you need some
sort of HTTP load balancer or DNS to point the client to the different nodes and do
the load-balancing. Now a bit about the DynamoDB data model, so DynamoDB has tables
and each table has partitions. Each partition has a hash key and then a lot of items, and each
item has a Sort key and attributes. If it looks familiar, it’s because it’s very similar
to what ScyllaDB has, just with different names: partitions are also called partitions, hash keys are partition keys and the
Sort key is a clustering key, so it’s very similar to what ScyllaDB has. The one difference
is that DynamoDB doesn’t have a schema, it has a schema for the hash key and the
Sort key but not for the attributes. The attributes can be anything, unlike in ScyllaDB,
and you can basically think about an item like a JSON document, because
the attributes can also be nested: you can have a list or a document inside one attribute.
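To make the "schema only for the keys" point concrete, here is a small Python sketch that converts plain Python values into the typed attribute-value form the DynamoDB JSON API uses (`S`, `N`, `BOOL`, `NULL`, `L`, `M`). The helper name `to_attribute_value` and the sample items are invented for this example, and only a subset of the DynamoDB types is covered.

```python
def to_attribute_value(value):
    """Encode a Python value as a DynamoDB-style typed attribute value."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}         # numbers travel as strings on the wire
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")

def to_item(plain):
    """Encode every top-level attribute of an item."""
    return {name: to_attribute_value(v) for name, v in plain.items()}

# Two items of the same hypothetical table: same key attributes,
# completely different non-key attributes, one of them nested.
item_1 = to_item({"device_id": "d1", "ts": 1, "temperature": 21.5})
item_2 = to_item({"device_id": "d1", "ts": 2,
                  "location": {"city": "Paris", "tags": ["roof", "north"]}})
print(item_1)
print(item_2)
```

Both items share the hash key and Sort key attributes, while the remaining top-level attributes differ freely between items.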
So what we had to do, we basically emulate these attributes as a map, as a ScyllaDB map and
this allows us to do concurrent updates to different top-level attributes
like you can do also in DynamoDB. Another thing to consider
in the DynamoDB data model is that DynamoDB natively supports, and it is very
common to use them, Read-Modify-Write updates. You have conditional updates, like those mentioned
in the LWT talk: let’s say I want to set A to 2 but only if A is now equal to 1. It also gives you counters, and you can
copy attributes, set A to value of B and in DynamoDB this was very easy for them to implement because all writes
anyway do a read because of the way their implementation works. The problem is that ScyllaDB actually works
differently: ScyllaDB was designed to be efficient for doing writes without needing to do reads,
as Kostja also mentioned. So ScyllaDB’s writes are very efficient, but they make Read-Modify-Write
a lot more complicated. So our current and temporary implementation actually does
read and then write, and of course this is not a great idea because it’s not safe for
concurrent operations. I mean, if one client does a = 2 if a = 1 and then another one
does a = 3 if a = 1, only one of these should succeed. You can’t do a read and then have
two of these changes succeed at the same time. So what we will have to do, and we’re
planning to do, is to use LWT, which we will now have, to do these Read-Modify-Write operations,
but currently if you want to use Alternator you have to be aware that these Read-Modify-Write
operations are not safe if you do them concurrently. You can see in the source code we
have a file Alternator.md, which has a detailed status of what is supported and what is not currently
and we also have a bug tracker, you can see as we are progressing exactly what is
supported in the current open source versions, and as I said, several DynamoDB applications
already work unmodified on Alternator. Some of the things we have to address soon for the GA:
first of all, some operations and parameters of existing operations which we still haven’t finished
supporting, things which we didn’t need for the initial applications and just postponed,
but we have to do those too. We need to do the safe concurrent Read-Modify-Write operations.
We don’t yet support the on-demand backup feature of DynamoDB, and we can easily
do this using ScyllaDB’s existing backup feature. And we don’t yet support DynamoDB Streams,
but now that we’re also working on CDC, like we saw in a different talk, we will use that also for the
DynamoDB API support. Ok so if you want to try to migrate your application
from DynamoDB to Alternator, first of all either install ScyllaDB and a load balancer, like I mentioned,
or use ScyllaDB Cloud. Then, for your application, which already uses DynamoDB,
you just have to tell it the endpoint address of the load balancer and it will send all the requests
to ScyllaDB. Remember this is a preview release, so watch out for unsupported features
and the unsafe Read-Modify-Write operations. And if you already have existing data on DynamoDB,
we also have a migrator, running within Spark, that can help you migrate the data: take it out
of DynamoDB and put it into ScyllaDB, using the DynamoDB API on both reads and writes.
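The warning about unsafe Read-Modify-Write operations can be illustrated with a toy Python model. This is not Alternator code: the `Table` class and both functions are invented for the example, and generators are used only to force a specific interleaving of two clients. It shows why a separate read followed by a write lets both "set a to 2 if a is 1" and "set a to 3 if a is 1" succeed, while an atomic check-and-set lets only one through.

```python
class Table:
    """Toy in-memory stand-in for the attributes of a single item."""
    def __init__(self, **attrs):
        self.attrs = dict(attrs)

def unsafe_conditional_set(table, name, expected, new_value):
    """Read, pause, then write: another client can run during the pause."""
    observed = table.attrs.get(name)      # step 1: read
    yield                                 # pause: other clients may interleave here
    if observed == expected:              # step 2: decide using a possibly stale value
        table.attrs[name] = new_value
        yield True
    else:
        yield False

def atomic_conditional_set(table, name, expected, new_value):
    """Check and write as one indivisible step (what a safe implementation must provide)."""
    if table.attrs.get(name) == expected:
        table.attrs[name] = new_value
        return True
    return False

# Unsafe: both clients read a == 1 before either one writes.
unsafe_table = Table(a=1)
client_a = unsafe_conditional_set(unsafe_table, "a", 1, 2)
client_b = unsafe_conditional_set(unsafe_table, "a", 1, 3)
next(client_a)
next(client_b)
unsafe_results = (next(client_a), next(client_b))
print(unsafe_results, unsafe_table.attrs)   # (True, True) {'a': 3}

# Atomic: the second condition sees the first client's write and fails.
safe_table = Table(a=1)
safe_results = (atomic_conditional_set(safe_table, "a", 1, 2),
                atomic_conditional_set(safe_table, "a", 1, 3))
print(safe_results, safe_table.attrs)       # (True, False) {'a': 2}
```

In the unsafe run both clients are told their condition held, even though only the last write survives, which is exactly the outcome a conditional update is supposed to rule out.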
To summarize, ScyllaDB is a very efficient, reliable and low latency NoSQL data store that
began with Cassandra compatibility, and of course this is still our main supported API,
but the idea of the Alternator project is to also add DynamoDB API compatibility.
It can already run existing applications designed for DynamoDB, and of course we’ll improve this
compatibility as we continue to develop it. You can run this, like ScyllaDB, on any cloud
or data center, not just on AWS. It is open source and you can already get it now; it has been available
for more than a month, and it’s now also available as a Managed Database
Service on ScyllaDB Cloud.