In this hands-on demo, you will deploy ScyllaDB Monitoring to gain insights into what’s happening in the cluster.
Next, you will deploy the ScyllaDB cluster using the Operator.
Now we are ready to deploy ScyllaDB, but first let's deploy the monitoring, because we want to have some observability. Since it takes around five minutes to deploy the monitoring, I did it before this presentation, but all the commands you have to execute can be found in the ScyllaDB Operator documentation, under the setting up monitoring tutorial. What I did was simply copy and paste all of these commands, and Prometheus and Grafana were installed; I also imported the ScyllaDB monitoring dashboards from the main repository. So just copy and paste those commands and it should deploy Prometheus for you.
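The exact commands live in that tutorial; as a rough, hedged sketch of the shape they take (the chart, release, and namespace names below are illustrative assumptions, not the documentation's commands), a Prometheus and Grafana stack can be installed with Helm like this:

```bash
# Illustrative only: follow the "setting up monitoring" tutorial in the
# ScyllaDB Operator docs for the exact steps. This installs the community
# kube-prometheus-stack chart (Prometheus + Grafana) into a "monitoring"
# namespace; the release and namespace names are assumptions.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --create-namespace --namespace monitoring
```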
Okay, so now I'm going to forward the traffic to the Grafana pod using the port-forward command, and it should be accessible via localhost.
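For example (the namespace and pod name are placeholders and depend on how the monitoring was deployed; Grafana listens on port 3000 by default):

```bash
# Forward local port 3000 to the Grafana pod; adjust the namespace and pod
# name to match your monitoring deployment.
kubectl -n monitoring port-forward pod/<grafana-pod-name> 3000:3000
# Grafana should then be reachable at http://localhost:3000
```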
Okay, now I just need to log in. All of the dashboards are empty, of course, because we haven't deployed any cluster yet, so let's proceed and deploy the ScyllaDB cluster.
Okay, so I'm going to apply the cluster definition first and then describe what this file contains, just to speed things up.
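Applying it is a single command; the file name here is just a placeholder for the manifest used in the demo:

```bash
# Apply the manifest containing the namespace, the RBAC objects and the
# ScyllaDB cluster definition; "cluster.yaml" is a placeholder file name.
kubectl apply -f cluster.yaml
```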
Okay, so now let's see what the cluster definition looks like. This file contains the namespace we are going to use, plus a service account, a role and a role binding; these are needed to give the Sidecar the permissions it requires against the Kube API server. Let's focus on the most interesting part, which is the ScyllaDB cluster definition itself.
This resource has a name and will be deployed in the ScyllaDB namespace, and here we have some parameters for the cluster, like the ScyllaDB version and the Sidecar version. There is cpuset, which basically tells ScyllaDB to pin the CPUs it is going to use; this gives a performance boost. You can also specify some sysctls to tweak the kernel; here I'm increasing the maximum number of asynchronous I/O requests. We also have host networking enabled, which gives another performance boost, so I enabled it.
Then we have the topology: our data center is going to be called us-west-1, and we are going to deploy a single rack. Right now we have only a single node; we will change this value later. Here are the storage requirements: we are going to use the local NVMe drives attached to the instance, with a capacity of 700 gigabytes. For resources, we are going to allocate 3 CPUs and 12 gigabytes of memory.
The resource requests are the same as the limits. On production environments it is important to have both limits and requests set to equal values, because this allows the kubelet to set the Quality of Service class to Guaranteed. That basically means these CPUs won't be used by other pods; the three CPUs will be dedicated exclusively to ScyllaDB. On development environments it's not that important, so you can mix and match these values, but make sure to set them to the same value on production environments.
There is also a placement definition. This field basically tells the scheduler where these pods should be placed, using node labels that my cloud provider, which is GKE, understands, and it is going to schedule the pods only on the nodes located in the us-west-1b zone. The other placement setting we have is tolerations, which basically allows these pods to be scheduled on the nodes I dedicated to ScyllaDB.
There are also the ScyllaDB config and ScyllaDB agent config fields, so users may override some default values using ConfigMaps if they want to; the provided config file is basically merged into the actual config file used by ScyllaDB.
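To make this concrete, here is a sketch of what such a manifest can look like. The field names follow my reading of the ScyllaDB Operator's ScyllaCluster CRD, and every concrete value (versions, zone label, taint key, ConfigMap names) is an illustrative assumption rather than the exact file used in the demo:

```yaml
# A hedged sketch of the cluster definition described above; the values and
# some field names are approximations, not the exact demo file.
apiVersion: scylla.scylladb.com/v1
kind: ScyllaCluster
metadata:
  name: scylla-cluster
  namespace: scylla
spec:
  version: 4.3.0                # ScyllaDB version
  agentVersion: 2.2.0           # Sidecar/agent version
  cpuset: true                  # pin ScyllaDB to its CPUs for better performance
  sysctls:
    - "fs.aio-max-nr=2097152"   # raise the async I/O limit (illustrative value)
  network:
    hostNetworking: true        # another performance boost
  datacenter:
    name: us-west-1
    racks:
      - name: us-west-1b
        members: 1              # a single node for now; scaled up later
        storage:
          capacity: 700G
          storageClassName: local-raid-disks   # local NVMe drives
        resources:
          limits:
            cpu: 3
            memory: 12G
          requests:             # requests == limits -> Guaranteed QoS class
            cpu: 3
            memory: 12G
        placement:
          nodeAffinity:         # schedule only on nodes in the us-west-1b zone
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: topology.kubernetes.io/zone
                      operator: In
                      values:
                        - us-west-1b
          tolerations:          # allow scheduling on nodes dedicated to ScyllaDB
            - key: role
              operator: Equal
              value: scylla-clusters
              effect: NoSchedule
        scyllaConfig: scylla-config              # optional config override (ConfigMap)
        scyllaAgentConfig: scylla-agent-config   # optional agent config override
```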
Okay, so this file is already applied. We can check whether the pods inside the ScyllaDB namespace are ready, and two of the containers are, so ScyllaDB and the Sidecar are ready to go. Let's check the logs from the pod to see some interesting stuff.
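For example (the namespace, pod and container names are placeholders):

```bash
# Check pod readiness in the ScyllaDB namespace, then read the logs of the
# ScyllaDB container in the first node's pod; names are placeholders.
kubectl -n scylla get pods
kubectl -n scylla logs <scylla-pod-name> -c scylla | less
```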
If you scroll to the beginning, which will take some time at this font size, you can see that the HTTP checks are set up, along with a config file and the topology definition. All of this is done by the Sidecar, which discovers the Kubernetes resources and sets up the correct configuration and the correct topology settings. There is also one log entry saying that the evaluation is starting and may take a while, and it actually does: it runs a benchmark against the disk, and that takes around one or even two minutes on every pod. So if your cluster is very big, say thousands or even ten thousand nodes, waiting twenty thousand minutes just for disk benchmarking is pretty much a waste of time.
Because we are using the same machine type and the same disks for every ScyllaDB node, I can save the benchmark results this pod produced and mount them as files on the other ScyllaDB nodes, which use the same kind of disk, to skip the benchmark there and save some time. I'm going to show you how to do that.
Yeah, so these benchmark results are saved in two files, so let's save their content. To get the content we need to execute a command inside the pod, so we are going to use kubectl exec in the ScyllaDB container and check the content of /etc/scylla.d/io_properties.yaml. This file describes how many read IOPS the disk is able to handle, how many write IOPS, and what the read and write bandwidth is. There is also a second file, called io.conf, which basically tells ScyllaDB where these benchmark results are located. So let's save both files.
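A hedged example of dumping both files (the namespace, pod and container names are placeholders):

```bash
# Save the benchmark results from a running node to local files; the
# namespace, pod and container names are placeholders.
kubectl -n scylla exec <scylla-pod-name> -c scylla -- \
  cat /etc/scylla.d/io_properties.yaml > io_properties.yaml
kubectl -n scylla exec <scylla-pod-name> -c scylla -- \
  cat /etc/scylla.d/io.conf > io.conf
```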
Let's reuse them to spawn more nodes. Remember, this trick only works if all of your ScyllaDB nodes have the same hardware and the same disks; if the hardware differs between your ScyllaDB nodes, don't use it, because ScyllaDB needs to know how fast its disks actually are.
Okay, so we saved the benchmark results as files; now we have to import them into Kubernetes so that later we can mount them as volumes. I'm going to create a ConfigMap from the io_properties.yaml file, and the same kind of ConfigMap for the second file.
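For example (the ConfigMap names and the namespace are illustrative; they just have to match what the cluster definition references later):

```bash
# Create one ConfigMap per saved file; the names and namespace are
# placeholders and must match what the cluster definition references.
kubectl -n scylla create configmap io-properties --from-file=io_properties.yaml
kubectl -n scylla create configmap io-conf --from-file=io.conf
```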
Now we can look at how to mount these files into the cluster; I prepared a copy of the previous cluster definition. Please note that this trick only works from ScyllaDB 4.3, because that is the first version where the IO setup can be disabled. To disable it, we need to pass an additional argument to the ScyllaDB image; this is done with the scyllaArgs field, which takes the string of arguments we want to add, and in this case we pass io-setup with a value of zero.
We also need to mount the files we previously created into the ScyllaDB pods. This is done by specifying additional volumes attached to the ScyllaDB pod: we create two volumes, the first one called io-properties, whose source is the io-properties ConfigMap, and a second one for the second file. Then we need to mount these volumes in the proper locations, so we specify volume mounts: the io-properties one is mounted under /etc/scylla.d/io_properties.yaml and the second one under /etc/scylla.d/io.conf. So basically that's how you provide additional files to ScyllaDB.
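Put together, the relevant additions can look roughly like this. The rack-level placement of volumes and volumeMounts and the exact argument string follow my reading of the CRD and of the talk, so treat this as a sketch, not the demo file itself:

```yaml
# A sketch of the additions used to skip the disk benchmark (fragment of the
# ScyllaCluster spec from the earlier sketch); values are approximations.
spec:
  scyllaArgs: "--io-setup 0"            # skip the benchmark (ScyllaDB >= 4.3)
  datacenter:
    racks:
      - name: us-west-1b
        volumes:
          - name: io-properties
            configMap:
              name: io-properties       # ConfigMap created from io_properties.yaml
          - name: io-conf
            configMap:
              name: io-conf             # ConfigMap created from io.conf
        volumeMounts:
          - name: io-properties
            mountPath: /etc/scylla.d/io_properties.yaml
            subPath: io_properties.yaml
          - name: io-conf
            mountPath: /etc/scylla.d/io.conf
            subPath: io.conf
```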
Okay, so let's remove the previous cluster by deleting the ScyllaDB cluster object, and let's delete all of its storage as well: we need to release the persistent drives by removing the PVCs. The ScyllaDB Operator doesn't delete PVCs, because it's basically your data and we don't want to delete it unless you really want to, so you have to do it manually.
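For example (the resource and namespace names are placeholders, and deleting all PVCs in the namespace assumes nothing else lives there):

```bash
# Delete the ScyllaDB cluster object, then release the storage by removing
# the PVCs the Operator intentionally leaves behind; names are placeholders
# and the CRD resource name may differ depending on the Operator version.
kubectl -n scylla delete scyllacluster scylla-cluster
kubectl -n scylla delete pvc --all
```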
Okay, so now we can apply the tuned cluster and check whether it's booting up. It is initializing, which most likely means the image is being downloaded, so I will check the chat to see whether there are any questions.
No questions as of now. One question I have, though: for persistent volumes, your recommendation is generally to use RAID 0 on ScyllaDB instances, right? – Yes. And here, when we are mounting, how do we create these persistent volumes? Do we still take a few disks and run them as just a bunch of disks, or...?
I prepared the disks before the presentation. This storage class name selects which disks we are going to use: persistent volumes have a storage class name, and I prepared the local disks, which are basically in RAID 0, and created the storage class describing them, called local-raid-disks. There is basically a DaemonSet which does everything needed to assemble this RAID, so I just applied that DaemonSet and it created the local-raid-disks storage class.
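The storage class itself is small; the DaemonSet is what assembles the RAID 0 array and exposes it as local PersistentVolumes. A minimal sketch of such a storage class, assuming the usual statically provisioned local-volume approach (the DaemonSet itself is not shown):

```yaml
# A minimal sketch of a storage class for statically provisioned local RAID 0
# volumes; the DaemonSet that builds the array and creates the PVs is separate.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-raid-disks
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```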
– Got it, got it. So we'll share this with... – Yes, later; all of these scripts will be available after the session.