What should we do if a replica becomes unbalanced?
Covers the latencies dashboard, common issues that can cause high latencies on a node, the cache replica dashboard, and the MV and memory replica views.
We checked that everything was green; next we go to replica imbalance, and here life is a bit more complex. We're going to talk about three or four things that you can do, and then we'll fall back to "everything else", but hopefully we'll be able to provide additional input in the future.
The first item I go to and look at is the latencies. Those are exposed, and if I look at the chart view, especially the historical view, and the system is a bit slower and the latencies are a bit higher, then I can pinpoint the time, pinpoint the node that started having those latencies, and so forth, and then look at what's happening.

When we talk about the overview of latencies, again we need to look at both the instance view and the shard view. Higher read latencies on a replication factor's worth of nodes or shards can indicate an imbalance in data access. We said before that if you have a hot shard or a hot partition, you're sending requests to a specific shard in the system; how does that translate to latencies? If you're using a shard-aware driver (if not, it's a bit more complex, but if you are, it works perfectly for this), then the latencies of those shards on those specific nodes, three of them with replication factor three, will be higher than everything else in the system, and that's a very good indication that you have a hot shard.

Another option is that you have an imbalance because at times you are accessing larger partitions, and then you need to go and check that. If you have spikes that are temporarily very high in your latencies, usually on a single shard, not on all the shards and not on all the nodes, then you need to go to the single node check. That can be a stall (I'll talk about it a bit later), but it's a good indication of something happening on that specific node, and we need to check that node. And last, if you have higher read latencies, you may simply be sending more read requests to those specific shards; they may be doing the CQL processing of those requests in the foreground before the requests split up, and that's why they are a bit slower, so you can also look at the traffic itself.
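As an illustration, here is a minimal sketch of how you could compare per-shard read latencies programmatically against the Prometheus instance bundled with the ScyllaDB Monitoring Stack. The Prometheus address and the metric name are assumptions (placeholders); check the exact latency metric your ScyllaDB version exposes before relying on this.

```python
# Sketch: rank shards by average read latency over the last 5 minutes, so a hot
# shard (much higher than the rest) stands out. The metric name below is an
# assumption; substitute the latency metric used by your dashboards.
import requests

PROMETHEUS = "http://localhost:9090"   # assumption: monitoring-stack default port
QUERY = (
    'rate(scylla_storage_proxy_coordinator_read_latency_sum[5m]) / '
    'rate(scylla_storage_proxy_coordinator_read_latency_count[5m])'
)

resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": QUERY})
resp.raise_for_status()

results = resp.json()["data"]["result"]
rows = sorted(
    ((r["metric"].get("instance", "?"), r["metric"].get("shard", "?"),
      float(r["value"][1]))
     for r in results),
    key=lambda x: x[2], reverse=True)

for instance, shard, latency in rows[:10]:
    print(f"{instance} shard {shard}: {latency:.6f}s")
```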
Next, there is the detailed view in ScyllaDB Monitoring, and one of its sections is the replica view. If we look at the replica view we can find specific items related to replica imbalance, and I'll talk about the items you can find there.

We have active sstable reads and queued sstable reads. Active sstable reads means how many sstables we are currently reading; in many cases it will be 0, in some systems it may be constant, but it should not be going up and down or spiking on a specific node. If it is spiking up on a specific node, it may mean that that node is not keeping up with compactions, so we need to go and check the single node that has a higher number of active sstable reads.

Queued sstable reads translate directly to latencies. ScyllaDB will start queuing after 100 sstable reads going down to the disk, simply so that we don't add additional requests without providing responses for previous ones and don't max out the capacity. If reads are queuing, it is an indication that we may be overloading the system, or that a specific node has an issue: it may have a stall, or something else, or the disk is slower for some reason on that node. So we need to do the single node check.
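One quick piece of the single node check mentioned above is looking for reactor stalls in that node's log. A minimal sketch, assuming a systemd installation with a scylla-server unit (adjust the unit name and time window for your environment):

```python
# Sketch: count "Reactor stalled" messages on this node over the last hour.
# Assumes ScyllaDB runs under systemd as the "scylla-server" unit.
import subprocess

log = subprocess.run(
    ["journalctl", "-u", "scylla-server", "--since", "1 hour ago", "--no-pager"],
    capture_output=True, text=True, check=True,
).stdout

stalls = [line for line in log.splitlines() if "Reactor stalled" in line]
print(f"{len(stalls)} reactor stall messages in the last hour")
for line in stalls[-5:]:   # show the most recent few
    print(line)
```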
"Currently blocked" and "dirty" on the commitlog have to do with the fact that we have inbound writes and we're writing them to the disk. If those counters are not zero (and usually they will be zero), we need to check what's going on: do the single node check again and see why that node is not able to flush the commitlog or the memtables to the disk. Read failed and write failed: again, go to the single node check. Write timeouts: if they're imbalanced, we need to go to the single node check again.
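To spot the offending node quickly, you can compare these error counters per instance with the same Prometheus API pattern as before. The metric name below is a placeholder chosen for illustration; substitute the actual failed-write or blocked-on-commitlog counters your dashboards use.

```python
# Sketch: rank instances by how much an error counter grew in the last 10 minutes.
# "scylla_storage_proxy_coordinator_write_timeouts" is a placeholder metric name.
import requests

PROMETHEUS = "http://localhost:9090"   # assumption: monitoring-stack default
query = 'sum by (instance) (increase(scylla_storage_proxy_coordinator_write_timeouts[10m]))'

resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": query})
resp.raise_for_status()

for r in sorted(resp.json()["data"]["result"],
                key=lambda r: float(r["value"][1]), reverse=True):
    print(r["metric"].get("instance", "?"), float(r["value"][1]))
```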
Next is the detailed cache replica view. The cache view on the detailed monitoring dashboard has a lot of information; I'm not going to talk about every specific item, just the first items that you need to look at. Those are the metrics for reads with no misses, partition hits, and row hits. If they're imbalanced (and I've seen systems that are imbalanced), you'll have a single shard that is much higher than everything else, and if that is the case, in 99% of cases it's a hot partition. Please note it may be a hot partition due to a background process or queries internal to ScyllaDB, or it can even be due to the drivers polling the nodes, so it may not be your application at all; it can be a driver query that is causing it.

Another interesting item is the total bytes: check whether they are the same for partitions and rows across the shards. If they're not, it means that we don't have the same number of partitions but we do have the same amount of memory in the cache, and that means we have a partition or a row that is larger. In that case we should go to the large partition / large row / large cell check.
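ScyllaDB records oversized partitions, rows, and cells in node-local system tables, so one way to do that check is to query them directly. Below is a minimal sketch using the Python driver; the table and column names (system.large_partitions and friends) are to the best of my knowledge but may vary between versions, and the thresholds that populate them are configurable.

```python
# Sketch: list the largest partitions a node has recorded in system.large_partitions.
# Assumes the scylla-driver (or cassandra-driver) package and a node on localhost;
# table/column names may differ across ScyllaDB versions.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])       # assumption: a node reachable on localhost
session = cluster.connect()

rows = session.execute(
    "SELECT keyspace_name, table_name, partition_key, partition_size "
    "FROM system.large_partitions"
)
for row in sorted(rows, key=lambda r: r.partition_size, reverse=True)[:10]:
    print(row.keyspace_name, row.table_name, row.partition_key, row.partition_size)

cluster.shutdown()
```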
Last, there are the materialized view and memory replica views; I'm going to talk about the memory one. In case of imbalance you may see that the LSA memory is dropping (some of you may have seen this), and it usually means that ScyllaDB had to evacuate, or free, LSA memory. LSA memory is what we use for memtables and cache, and we free that memory in order to build up queues, usually queues of ongoing items. If we're building up a queue, that is usually bad; a system in a stable state should not have a lot of queues building up. Normally the majority of the memory will be in LSA: if we're talking about eight gigabytes per shard, less than one gigabyte will be non-LSA memory, and it will stay roughly constant. If there is a sudden drop or a large drop, it means something is wrong with that node and we need to go and check why. Usually it means that we're building up a queue somewhere, and that will translate to the system being slower or something not working.
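To watch for that kind of drop programmatically, you could query the monitoring stack's Prometheus for the change in LSA space per shard. As before, the metric name used here (scylla_lsa_total_space_bytes) is my assumption for illustration; take the real expression from the memory panels of your ScyllaDB Monitoring dashboards.

```python
# Sketch: flag shards whose LSA space dropped sharply in the last 10 minutes.
# "scylla_lsa_total_space_bytes" is an assumed/placeholder metric name.
import requests

PROMETHEUS = "http://localhost:9090"   # assumption: monitoring-stack default
query = 'delta(scylla_lsa_total_space_bytes[10m])'

resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": query})
resp.raise_for_status()

DROP_BYTES = 1 << 30                   # arbitrary threshold: a 1 GiB drop
for r in resp.json()["data"]["result"]:
    change = float(r["value"][1])
    if change < -DROP_BYTES:
        print(r["metric"].get("instance", "?"),
              "shard", r["metric"].get("shard", "?"),
              f"LSA dropped by {-change / 2**30:.1f} GiB")
```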