Note that some of the features covered in this lesson are available to ScyllaDB Enterprise users only.
This lesson covers OLTP and OLAP workloads, the conflict between them, how they can work together, and how Workload Prioritization in ScyllaDB solves the problem.
Transcript
Let's start with basic concepts. Workloads can be divided roughly
into two main classes. The first is online transaction processing (OLTP), characterized by small work
items that each span a narrow portion of the data. Think of operations generated by a web
application: lots of small queries. This is why OLTP is latency sensitive;
it is client-facing, so you don't want the latency to be high. On the other side we have
online analytic processing (OLAP), which is roughly the opposite: it involves either large
work items or batches of work items, and it is throughput oriented. Since it's hard to know when
an OLAP job will finish, you just want the server to do as much work as possible at any given time.
OLAP usually means things like scans and aggregates, so it spans a lot of the data and will likely
take a long time and a lot of resources. Those differences have consequences. The main one is
that it's really hard to run both well in the same data center. What happens if you just try
is what you see here: when OLTP lives alone on the server, it enjoys the great
latencies that ScyllaDB has to offer. But as soon as the OLAP workload kicks in,
the two become indistinguishable: latency spikes, and both workloads converge
to the same latency, so both suffer. OLAP suffers a little less, because it is throughput oriented,
and ScyllaDB naturally provides a lot of throughput, which OLTP doesn't need.
Let's use some animations to emphasize the problem.
This one shows an OLTP workload. What we see here is
that the jobs arrive sporadically and are very small,
so they don't fill the queue. As you can see on the counters, each job
waits very little time in the
queue, if at all, and this is why you can get low latencies. Of course, for this to happen
your cluster has to be underutilized; otherwise you get a lot of throughput but with high latency,
as we can see for OLAP. OLAP generates a lot of work items, so what happens here is that you get
very good throughput, because the server is performing work at any given time, but each individual
work item that gets queued has to wait for the whole queue ahead of it to be drained. This
naturally results in high per-work-item latency.
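To make the queueing effect concrete, here is a tiny back-of-the-envelope sketch. The 10 ms service time and 100-item batch are made-up numbers for illustration, not ScyllaDB measurements; the point is that with FIFO processing, throughput stays high because the server is never idle, while per-item wait time grows linearly with queue depth.

```python
ITEM_COST_MS = 10   # assumed per-item service time (made-up number)
BATCH_SIZE = 100    # OLAP batch that fills the queue (made-up number)

# FIFO processing: item i waits for the i items queued ahead of it.
waits_ms = [i * ITEM_COST_MS for i in range(BATCH_SIZE)]

# Throughput is high -- the server is never idle -- but per-item
# latency grows linearly with the queue depth.
throughput_per_s = 1000 / ITEM_COST_MS
avg_wait_ms = sum(waits_ms) / len(waits_ms)

print(f"throughput: {throughput_per_s:.0f} items/s")   # 100 items/s
print(f"average queue wait: {avg_wait_ms:.0f} ms")     # 495 ms
print(f"last item's queue wait: {waits_ms[-1]} ms")    # 990 ms
```

So the very same queue that keeps throughput at its maximum is what pushes each individual item's latency up.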
So this is what happens when we mix them:
OLAP has a lot of work items, so the queue is always full of them. When
the sporadic OLTP work items come in, each one has to wait for the whole queue before it
gets processed, so OLTP actually observes the same latencies that OLAP sees,
but it bothers you more, because OLTP is client-facing and latency sensitive.
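This is the intuition behind the Workload Prioritization feature mentioned at the start of the lesson. The sketch below is a toy strict-priority scheduler, not ScyllaDB's actual mechanism (ScyllaDB schedules by proportional shares rather than strict priorities, and the item names, priorities, and 10 ms cost are all made up), but it shows why distinguishing the workloads helps: the latency-sensitive request no longer waits behind the whole OLAP backlog.

```python
import heapq

ITEM_COST_MS = 10  # assumed fixed service time per work item (made-up)

# Queue state at the moment one OLTP request arrives: 100 OLAP items
# are already waiting. Entries are (priority, sequence, name); the heap
# pops the lowest priority number first.
pending = [(1, i, f"olap-{i}") for i in range(100)]
pending.append((0, 100, "oltp-0"))  # OLTP gets the more urgent priority
heapq.heapify(pending)

# Drain the queue in priority order, recording each item's wait time.
wait_ms = {}
elapsed = 0
while pending:
    _, _, name = heapq.heappop(pending)
    wait_ms[name] = elapsed
    elapsed += ITEM_COST_MS

print(wait_ms["oltp-0"])   # 0 ms: served ahead of the OLAP backlog
print(wait_ms["olap-99"])  # 1000 ms: OLAP still completes, just after OLTP
```

Note that the OLAP batch finishes almost as quickly as before; only the small, latency-sensitive request jumps the queue, which is exactly the trade-off we want.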