Migrating from Cassandra to ScyllaDB is a common use case. This lesson covers basic considerations such as dual-writes, out-of-order writes, TTL and WRITETIME, and Lightweight Transactions (LWT).
Now that we have learned about the migrator, let’s understand what a migration
from Apache Cassandra looks like, as well as some aspects
that you should be aware of before you actually start.
Because even though it’s very easy for you to use the migrator,
there are some hiccups that you must be aware of.
So, one of the most asked questions when it comes to an Apache
Cassandra migration is: why is “dual-writing” enough,
and what would happen if we had “out-of-order writes”?
Some other questions are: what if my data has a TTL, and how will
the migrator preserve that TTL’d data; and finally,
what if I am using “Lightweight Transactions” in my application?
So let’s start with the first two questions.
But before we do, let’s set up the scenario
so we understand what we mean by those questions.
So now imagine that you have configured
your application to do what we call a “dual-write”,
which means that for every incoming
insert, update, or delete statement that you run,
you will also propagate that write to your ScyllaDB cluster.
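To make that concrete, here is a minimal sketch of what dual-writing could look like in application code, assuming a Python application using the cassandra-driver; the contact points, keyspace, and table are made up purely for illustration.

```python
from cassandra.cluster import Cluster

# Illustrative contact points; replace with your real clusters.
cassandra_session = Cluster(["10.0.0.1"]).connect("my_keyspace")  # existing Cassandra
scylla_session = Cluster(["10.0.1.1"]).connect("my_keyspace")     # new ScyllaDB

UPDATE_CQL = "UPDATE users SET email = %s WHERE user_id = %s"

def dual_write(user_id, email):
    # Every write the application performs goes to both clusters.
    cassandra_session.execute(UPDATE_CQL, (email, user_id))
    scylla_session.execute(UPDATE_CQL, (email, user_id))
```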
All right, then, as we have seen, we are going to hook in the migrator,
which will start reading
from your source database, in this case Apache Cassandra,
and it will start writing to the ScyllaDB cluster.
Now, here’s a question for you,
to see if you guys are paying attention
and to wake you up. Consider the not-so-unlikely scenario
where the migrator reads a piece of data from the source database
and, right before it has a chance to write that data to ScyllaDB,
your application receives an update and updates
that same piece of data
before the migrator has a chance to write it.
Then, after that, the migrator goes in and writes the data.
What’s going to be the end result in the target database?
Good.
Very good.
Leandro said that the migrator overwrites
the row with the previously saved data.
That’s a very good way of thinking, Leandro.
But you are wrong.
And that’s essentially the “gotcha”,
which I pretty much prepared just for you.
So the purpose of this question
was essentially to make you guys
think about all the things, you know,
the hiccups that may happen during
a migration and everything that you need to consider.
So, to answer this question, and Lucas
got the explanation right, we need to understand a bit of how CQL,
the Cassandra Query Language protocol, works under the hood.
Every write carries a timestamp indicating when the query actually happened,
and that timestamp is recorded on the server.
When timestamps conflict,
the database will always keep the write
which has the higher timestamp.
This is a concept known as “Last-Write-Wins”,
which can also be explained as: the write
with the highest timestamp is the one which will prevail.
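As a small, hedged illustration of Last-Write-Wins, the snippet below uses the Python cassandra-driver to issue two writes with explicit timestamps; the keyspace, table, and addresses are illustrative, and the same rule applies whether the timestamp comes from the driver or from an explicit USING TIMESTAMP clause.

```python
from cassandra.cluster import Cluster

# Illustrative contact point and keyspace.
session = Cluster(["10.0.0.1"]).connect("my_keyspace")

# A write carrying a higher timestamp...
session.execute(
    "INSERT INTO users (user_id, email) VALUES (1, 'new@example.com') "
    "USING TIMESTAMP 2000")
# ...followed by a write that arrives later but carries a lower timestamp.
session.execute(
    "INSERT INTO users (user_id, email) VALUES (1, 'old@example.com') "
    "USING TIMESTAMP 1000")

row = session.execute(
    "SELECT email, WRITETIME(email) AS wt FROM users WHERE user_id = 1").one()
# Last-Write-Wins: email is still 'new@example.com' and wt is 2000, because
# the later-arriving write lost on timestamp, not on arrival order.
print(row.email, row.wt)
```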
Under normal circumstances, Leandro’s answer
would be the correct one,
because in the flow that we saw over here,
the migrator was the last one to write the data to the cluster.
Right?
But as I explained earlier,
the migrator has several features.
And one of these features is that it actually preserves
the timestamp, also known as the “writetime”,
when it’s reading data from the source table.
And on top of
preserving the writetime, it can also preserve
the TTL value of the source database columns.
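Conceptually, and this is only a sketch in Python rather than the migrator’s actual implementation, preserving the writetime and TTL boils down to reading them from the source and replaying the write on the target with a USING TIMESTAMP ... AND TTL ... clause; the names and addresses below are illustrative.

```python
from cassandra.cluster import Cluster

source = Cluster(["10.0.0.1"]).connect("my_keyspace")   # Apache Cassandra (illustrative)
target = Cluster(["10.0.1.1"]).connect("my_keyspace")   # ScyllaDB (illustrative)

# Read each value together with its writetime and remaining TTL.
row = source.execute(
    "SELECT user_id, email, WRITETIME(email) AS email_wt, TTL(email) AS email_ttl "
    "FROM users WHERE user_id = 1").one()

# Replay it on the target, keeping the original timestamp (and TTL, when set).
if row.email_ttl is not None:
    target.execute(
        "INSERT INTO users (user_id, email) VALUES (%s, %s) "
        "USING TIMESTAMP %s AND TTL %s",
        (row.user_id, row.email, row.email_wt, row.email_ttl))
else:
    target.execute(
        "INSERT INTO users (user_id, email) VALUES (%s, %s) USING TIMESTAMP %s",
        (row.user_id, row.email, row.email_wt))
```

Because the replayed write keeps the original, older timestamp, any newer write coming from the application’s dual-write still wins under Last-Write-Wins, which is why the race from the question above resolves correctly.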
One important thing for you to pay attention to here
is that if you are working with complex
data types, such as collections or non-frozen UDTs,
by default the migrator won’t be able to infer the timestamp
or the TTL of the individual cells
inside those complex types.
And again,
this is not a migrator limitation,
this is actually a CQL protocol limitation.
But in order to overcome it, if you are working with complex
data types, you can specify a hardcoded
time to live and a hardcoded timestamp
which the migrator will use when it’s writing your data.
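What “hardcoded” means in practice could look like the hypothetical sketch below: every write of the affected columns is issued with a fixed timestamp and TTL chosen by you. A sensible choice could be a timestamp older than any dual-write, so your application’s newer writes always win; the values, table, and session setup are assumptions for illustration only.

```python
from cassandra.cluster import Cluster

target = Cluster(["10.0.1.1"]).connect("my_keyspace")  # ScyllaDB, illustrative

# A fixed timestamp (microseconds since epoch, as CQL expects), chosen to be
# older than any dual-write, plus an illustrative TTL of one day.
HARDCODED_TIMESTAMP = 1_700_000_000_000_000
HARDCODED_TTL = 86400

# 'tags' is a non-frozen set column; its per-element writetime/TTL cannot be
# read back over CQL, so the migrated copy gets the hardcoded values instead.
target.execute(
    "INSERT INTO users (user_id, tags) VALUES (%s, %s) "
    "USING TIMESTAMP %s AND TTL %s",
    (1, {"premium", "beta"}, HARDCODED_TIMESTAMP, HARDCODED_TTL))
```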
All right.
So this explains why
dual-writing works for handling an actual migration
from the application perspective,
and it also explains how the migrator works under the hood
to ensure that your data
is actually handled correctly
when you are migrating from Apache Cassandra.
All right.
We then get to the last concern
when you are migrating from Apache Cassandra,
which is when your application
essentially relies on Lightweight Transactions.
So if you are working with Lightweight
Transactions, things change a little bit.
Let’s consider the following situation: you just learned from Felipe
that it’s safe
to enable dual-write from the application,
and you decided to trust Felipe blindly.
So you enabled dual-write from your application,
and then let’s suppose the following happens:
a transaction comes in and it fails on the source system,
but it successfully gets applied on the target system.
Well, that’s great,
you may think: at least our record got persisted
to the target database as we initially wanted.
Right? Well,
not really.
Not only did the transaction fail in your existing source
of truth, which is terrible, so you’d need to fix it,
but now let’s consider the opposite happens.
That is, the transaction succeeds on the source system,
but fails on the target system.
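To see why this is so easy to hit, here is a hypothetical sketch, again in Python with the cassandra-driver, of what a dual-written Lightweight Transaction looks like: each cluster runs its own conditional check, so the outcomes are evaluated independently and can legitimately differ. All names and addresses are illustrative.

```python
from cassandra.cluster import Cluster

cassandra = Cluster(["10.0.0.1"]).connect("my_keyspace")  # existing source of truth
scylla = Cluster(["10.0.1.1"]).connect("my_keyspace")     # migration target

LWT_CQL = "INSERT INTO accounts (account_id, owner) VALUES (%s, %s) IF NOT EXISTS"

# The same conditional write is sent to both clusters...
src_applied = cassandra.execute(LWT_CQL, (42, "alice")).was_applied
dst_applied = scylla.execute(LWT_CQL, (42, "alice")).was_applied

# ...but each cluster evaluates the IF NOT EXISTS condition against its own
# state, so one side can apply the write while the other rejects it.
if src_applied != dst_applied:
    print(f"LWT diverged: source={src_applied}, target={dst_applied}")
```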
Guys, any ideas on how we can overcome this?
I see birds.
I hear birds singing in the background.
But yeah, let’s move on.
Otherwise we won’t have time for the demo.
Sorry.
So, the answer to
this question, Asmund, that’s a very good thought.
Yeah.
So the answer to this question is very simple,
and Asmund actually put us on the right track.
If you are using Lightweight Transactions,
you may end up in a situation where you simply cannot do
dual-write from the application perspective, okay?
So, as a result,
keep your application writing to a single source of truth
at all times, and during the migration steps
you can either do it in several rounds or consider
publishing your successful transactions through,
say, a Kafka topic or etcd,
as Asmund suggested.
Or, as we saw in the migration overview,
you can also rely on CDC
to ensure that your transactions stay in sync.
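For instance, one hedged way the “publish successful transactions” idea could look in a Python application, assuming the kafka-python client; the topic name, serialization, and table are assumptions for illustration, and a separate consumer would replay these records against ScyllaDB.

```python
import json
from kafka import KafkaProducer
from cassandra.cluster import Cluster

session = Cluster(["10.0.0.1"]).connect("my_keyspace")      # single source of truth
producer = KafkaProducer(bootstrap_servers=["kafka:9092"])  # illustrative broker

def create_account(account_id, owner):
    result = session.execute(
        "INSERT INTO accounts (account_id, owner) VALUES (%s, %s) IF NOT EXISTS",
        (account_id, owner))
    if result.was_applied:
        # Only transactions that actually won on the source of truth get
        # published; a consumer replays them against the ScyllaDB cluster.
        producer.send(
            "applied-transactions",
            json.dumps({"account_id": account_id, "owner": owner}).encode())
    return result.was_applied
```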
Okay.
So yeah, that’s an edge case when you are
using Lightweight Transactions; in that situation
you cannot do dual-write.