The second part of the hands-on demo. How to deploy ScyllaDB with Kubernetes, including some troubleshooting and general questions. This is followed by running an example app that uses cassandra-stress, and checking in the monitoring dashboards how the ScyllaDB cluster handles the traffic.
okay, so the first node of this cluster is booting up, let’s see whether the io setup I
tried to disable has really been disabled, so let’s check the logs from the ScyllaDB container
and ScyllaDB is shutting down because of a bad I/O scheduler configuration
– somebody mentioned that io.conf looks like a wrong copy, did you copy io.conf from the wrong file?
okay that’s possible
let me double-check
yes, I saved the io_properties content there, okay, so we can fix it really fast
so let’s remove the ScyllaDB cluster, again
let’s release the PVCs, then let’s remove the config map
called io.conf
but I lost the content of the config map, so let me save it somewhere first
yeah, this one is wrong
okay, so let’s recreate this config map
from a file
okay and now we can apply the cluster again
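For reference, recreating that config map “from file” is equivalent to a manifest along these lines (a sketch only: the namespace is an assumption, and the data content is a placeholder, not the real tuning values used in the demo):

```yaml
# Hypothetical ConfigMap equivalent to
# `kubectl create configmap io.conf --from-file=io.conf`;
# the data content below is illustrative, not the demo's actual file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: io.conf          # config map name used in the demo
  namespace: scylla      # assumed namespace
data:
  io.conf: |
    --io-properties-file=/etc/scylla.d/io_properties.yaml
```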
so because the image is already downloaded it should start immediately
and it did, so let’s check whether
for some reason it’s failing to boot again
okay, so maybe it’s still using the wrong one – yeah, looks like it is the wrong one – so now
we have time for troubleshooting – it’s a bonus – yeah, it’s a bonus, let’s remove everything again
let’s check whether our io_properties and io.conf are
good, they are, okay, so now let’s delete the configmaps
and let’s recreate them
– another question, what does removing the PVC do? – so this basically tells Kubernetes
to release the disk which was pre-allocated to the previous pod
so the disk is going to be cleared and it will go back to the pool of available disks
– yeah, so there are a bunch of PVCs which we
pre-created and kept available, and they just get attached to different pods, right? – yes
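The disk pool described here is typically backed by PersistentVolumes whose reclaim policy decides what happens when a claim is released. A minimal sketch, assuming local SSDs and made-up names for the volume, storage class, and node:

```yaml
# Illustrative local PersistentVolume; when the bound PVC is deleted,
# persistentVolumeReclaimPolicy decides whether the disk is cleared and
# returned to the pool (Delete) or kept for manual handling (Retain).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-ssd-0            # assumed name
spec:
  capacity:
    storage: 375Gi             # assumed size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-ssd  # assumed storage class
  local:
    path: /mnt/disks/ssd0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-0       # assumed node name
```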
okay, I hope this is fine now, let me double-check: the cluster is tuned,
the version is correct, we disabled the io setup, io_properties is mounted in the
right location, io.conf is also right, so let’s try again
let’s check the logs
and it’s again shutting down, okay, so this trick should work, I’m not sure why
it’s not working, so let’s quickly disable this trick and forget about it
in order to proceed with the presentation
so most likely I did something wrong in the scripts, okay, I will fix it later and publish
the entire scripts, so let’s again delete the ScyllaDB cluster and then the PVCs
let’s recreate them
and they should boot up
so the image is downloading that’s fine
– okay, are there any questions? – yeah, so there’s one more
could you expand on the comment about making changes when you have existing data? what
happens when you have existing data and you’re changing, I guess, the topology
or the instance types, what happens? – so when you change the number of members
the ScyllaDB Operator will act accordingly, so if you decrease the number the ScyllaDB Operator will
first, decommission the node in order to properly leave the cluster and then it will delete the pod
but if you increase this number, the ScyllaDB Operator will basically create more pods until it reaches the desired
number of nodes, you can also add racks to the cluster and the ScyllaDB Operator will
initiate their creation, currently changing the resources is not yet supported but it’s going
to be in one of the next releases, if you want to change the storage, it is also not supported
yeah, so basically the scale operations are supported and of
course, you can change some config files, and after a restart these will be reloaded
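The scaling behavior described above amounts to editing the rack spec in the ScyllaCluster manifest. A sketch, assuming the Scylla Operator’s ScyllaCluster resource; the version, names, and sizes are assumptions:

```yaml
# Scaling sketch: raising members makes the operator add pods one by
# one; lowering it makes the operator decommission a node first, then
# delete its pod. All names and values here are illustrative.
apiVersion: scylla.scylladb.com/v1
kind: ScyllaCluster
metadata:
  name: scylla-cluster
  namespace: scylla
spec:
  version: 4.2.0            # assumed ScyllaDB version
  datacenter:
    name: us-east-1
    racks:
      - name: us-east-1a
        members: 4          # scale up or down by editing this number
        storage:
          capacity: 100Gi   # changing storage is not yet supported
        resources:
          requests:
            cpu: 4          # changing resources is not yet supported
            memory: 16Gi
```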
– got it – okay, let’s see how the pods are looking, it’s still spinning up, okay but let’s move on
yeah, so I’m going to generate some traffic to this cluster, and to spawn cassandra-stress
I’m going to use something called a Kubernetes Job, so this is basically a resource where you can
specify a one-time job, or a CronJob to have some application run on a schedule
so I’m going to use a job which is going to execute the Cassandra stress
with the write workload, and I’m passing a single node from the cluster
and the rest of them will be discovered using the protocol, so we can use the DNS name of the pod
the name of the namespace and the cluster will be discovered based on this name
so let’s double-check whether our pod name is correct, it’s correct and I’m going to allocate
six CPUs and 18 gigabytes of memory just for Cassandra stress and I’m going to use the pool dedicated for
application traffic, so this tells Kubernetes which node to select for the pod scheduling
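Putting those pieces together, such a Job could look roughly like this (a sketch only: the image tag, node-pool label, service DNS name, and cassandra-stress flags are assumptions, not the demo’s exact values):

```yaml
# Illustrative Kubernetes Job running cassandra-stress against the
# cluster; discovery starts from a single node's DNS name and the
# driver finds the rest. Names and values below are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: cassandra-stress
  namespace: scylla
spec:
  template:
    spec:
      nodeSelector:
        pool: application          # assumed label of the app node pool
      containers:
        - name: cassandra-stress
          image: scylladb/scylla:4.2.0   # assumed image and tag
          command:
            - /bin/sh
            - -c
            - >-
              cassandra-stress write cl=ONE
              -mode native cql3
              -node scylla-cluster-us-east-1-us-east-1a-0.scylla.svc
              -rate threads=300
          resources:
            limits:
              cpu: "6"             # six CPUs, as in the demo
              memory: 18Gi         # 18 gigabytes, as in the demo
      restartPolicy: Never
```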
but of course, we need the cluster to be running, and because this trick with disabling
the io setup didn’t work, it needed to benchmark the disk from the beginning, and that’s why
it took three minutes to bring up a node, but because the first one is already ready,
we can start the traffic and ScyllaDB should survive – you’re writing it with what
consistency level, I don’t know if you’re doing write traffic or read traffic
– write traffic with consistency level ONE – yeah, should be good, let’s try it out
– yeah
okay, so the job was created, we can check whether it’s running
by checking the logs of the pod
it is still creating because the image needs to be downloaded, so let’s wait a couple of seconds
there is a question about the RAM usage
– yeah, it might not be directly related to Kubernetes or the Operator, but
in general, about how ScyllaDB uses its RAM: 50% of the RAM is allocated for memtables
and the other 50% is allocated for the key cache, row cache,
and row index cache, and this is out of the 93 percent of the total RAM on the instance
so what happens if RAM is running out? typically it’s an LRU cache,
so whatever was cached recently is kept and the rest is evicted
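As a back-of-the-envelope illustration of that split, here is the arithmetic for an assumed 64 GiB instance (the instance size is an assumption; the 93% and 50/50 fractions are the ones stated above):

```python
# Rough memory split described above: ScyllaDB reserves ~93% of the
# instance RAM and divides it roughly 50/50 between memtables and the
# caches. The 64 GiB instance size is an assumption for illustration.
instance_ram_gib = 64
scylla_ram = instance_ram_gib * 0.93   # RAM ScyllaDB reserves
memtables = scylla_ram * 0.50          # memtable share
caches = scylla_ram * 0.50             # key/row/row-index cache share

print(f"scylla: {scylla_ram:.2f} GiB, "
      f"memtables: {memtables:.2f} GiB, caches: {caches:.2f} GiB")
```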
– okay, going back to the presentation
I initiated the traffic and, as you can see, the metrics are coming in; with just a single
available node, we are able to process 90,000 requests with sub-1-millisecond 99th percentile latency
yeah, so this is pretty good and meanwhile one of the nodes is still joining
okay, so this is how you deploy ScyllaDB and this is how you deploy the application on
top of Kubernetes