This is where our enterprise feature, called workload prioritization, kicks in.
The open source service levels exist, but they don't isolate workloads. If, for example, you have two sessions with different configurations, it will not necessarily work as expected, because requests from both sessions will be queued on the same queue. Batch workloads will typically enlarge that queue, because they have a high tolerance for timeouts, and eventually they will cause a kind of priority inversion: the interactive requests squeezed between the batch requests will start to time out. So effectively, batch workloads gain more implied importance, so to speak.
Even multiple interactive workloads will not be isolated from each other if they have slightly different timeout demands. If one of those real-time workloads starts to get very big, or to queue a lot of requests (because, as I said, the concurrency is sometimes effectively unbounded), it will cause the other workload to start failing or timing out as well.
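To make this concrete, here is a sketch of what two such open-source service levels might look like in CQL. The names and values are hypothetical, and the exact options may vary between ScyllaDB versions; the point is that both levels still feed the same queues:

    -- Hypothetical open-source service levels: both kinds of requests
    -- still share the same queues, so this alone does not isolate them.
    CREATE SERVICE LEVEL IF NOT EXISTS interactive_sl
        WITH timeout = 50ms AND workload_type = 'interactive';
    CREATE SERVICE LEVEL IF NOT EXISTS batch_sl
        WITH timeout = 30s AND workload_type = 'batch';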
Why is it still good, then? Why is it still useful in open source? Because the service levels can be used alternately, and they will still help you moderate the query traffic. If you've defined some of the workloads as interactive, they will be treated differently, which increases the chance that you won't get into a livelock situation. So you can use them alternately: define several of them, but make sure that at any given time only one workload runs, as in the sketch below.
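Assuming each workload connects with its own role (the role names here are hypothetical), the alternating setup might look like this, with jobs scheduled so the two roles are never active at the same time:

    -- Each workload authenticates with its own role; run them alternately.
    ATTACH SERVICE LEVEL interactive_sl TO app_user;
    ATTACH SERVICE LEVEL batch_sl TO etl_user;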
How do we isolate workloads with workload prioritization? We actually divide the queues on the resources, so each workload gets its own. How do we implement it? We just had to support another session property, called "shares", which represents the relative piece, or rights, that a specific workload has on a resource in the presence of a conflict.
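In ScyllaDB Enterprise this is exposed through service levels; a minimal sketch, with hypothetical names and share values:

    -- At points of contention, oltp_sl gets roughly a 4:1 (800:200)
    -- slice of the contended resource relative to olap_sl.
    CREATE SERVICE LEVEL oltp_sl WITH SHARES = 800;
    CREATE SERVICE LEVEL olap_sl WITH SHARES = 200;
    ATTACH SERVICE LEVEL oltp_sl TO app_user;
    ATTACH SERVICE LEVEL olap_sl TO analytics_user;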
I will talk about it a little bit later, but we do the isolation in four different areas. One is the CPU, that is, processing time. The second is internal concurrency: we do have concurrency moderation mechanisms in Scylla, even in open source, but they are employed on a specific queue, so one workload can still starve another by exhausting this internal concurrency; here they are separated. We also do it on the IO level, so a workload that has more shares will get more IO as well. And we do it on the memory level, so you cannot execute a query that would take enough memory to block a query from a different workload.
Isolation and system utilization.
What I've just said is a little bit disturbing, because let's say we set one of the workloads to have 20% of the rights on the CPU. You don't want this workload to use only 20% of the CPU even when no other workload is running. Imagine a large computation or OLAP workload that will simply consume as much CPU time as you give it; you don't want the rest of the CPU or disk to stay idle.
So for concurrency, CPU, and IO, we do full utilization of the system, which means that if only some of the workloads are active, they will share the resources according to their shares. The shares only apply to active workloads: if there is only a single workload running, even one defined with the lowest priority possible, it will get all of the resources it needs from the vacant ones. This holds for CPU and IO.
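As a worked example (the numbers are mine, not from the talk): with the hypothetical levels above, when both workloads are active, oltp_sl gets about 800 / (800 + 200) = 80% of CPU time and IO bandwidth and olap_sl about 20%; if olap_sl is the only active workload, it gets close to 100% despite its low share count.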
For memory, it's not true, at least right now, and the partitioning is static. From our experience, memory, at least when you give a reasonable amount of it to each process, is not really a limiting factor. It might be, but most of the time there is more than enough to serve at least several queries. So unless you do something extreme, like defining one workload with one share and the other with 1000, as long as the ratios are reasonable there will be enough memory to serve queries. Still, we would probably like memory to be fully utilized as well in the future. Some examples of when you would want to use these isolation capabilities:
If you have a critical workload that can be starved by others, you would want to give it a very high share number, so that it is relatively isolated from the rest. Or if you have different classes of users that should get different SLAs (service level agreements), say one user can tolerate a timeout of 2 seconds and another only several milliseconds, you would want to isolate them so that each gets its serving time, to the best of the system's ability.
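A sketch of how that might look, with hypothetical roles and values, and assuming your ScyllaDB Enterprise version allows combining shares and timeouts on one service level:

    -- Premium users: short timeout and a large share of the resources.
    -- Standard users: long timeout and a small share.
    CREATE SERVICE LEVEL premium_sl WITH SHARES = 800 AND timeout = 100ms;
    CREATE SERVICE LEVEL standard_sl WITH SHARES = 200 AND timeout = 2s;
    ATTACH SERVICE LEVEL premium_sl TO premium_role;
    ATTACH SERVICE LEVEL standard_sl TO standard_role;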
One scenario that we actually encountered in the field is isolating writes to tables which have materialized views. Materialized views add more pressure to the write path: every write to the base table actually implies several writes going on at the same time, so what appears to be an innocent write can actually become a very large number of eventual writes. Imagine deleting a partition that feeds a materialized view. The deletion itself is very fast on the base table, but since the materialized view has a different partition key, it can turn into a lot of sparse deletion queries to the view. So a write can actually be multiplied by a lot.
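To illustrate the fan-out (this schema is a made-up example, not one from the talk), consider a base table partitioned by device and a view partitioned by user:

    CREATE TABLE events (
        device_id int,
        ts timestamp,
        user_id int,
        payload text,
        PRIMARY KEY (device_id, ts)
    );
    CREATE MATERIALIZED VIEW events_by_user AS
        SELECT * FROM events
        WHERE user_id IS NOT NULL AND device_id IS NOT NULL AND ts IS NOT NULL
        PRIMARY KEY (user_id, device_id, ts);
    -- Deleting one base partition is a single cheap operation:
    --   DELETE FROM events WHERE device_id = 17;
    -- but it implies an individual view delete for every user_id partition
    -- whose rows lived under device_id = 17.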
So if you put writes to the base table in their own workload, you will effectively isolate them, and they will affect the rest of the system a lot less.