Some of the best practices for using ScyllaDB and Kubernetes are:
Use Namespaces. Namespaces provide a scope for names. Namespaces cannot be nested inside one another and each Kubernetes resource can only be in one namespace.
Define Taints and Node affinity.
Set the CPU Manager policy correctly, typically to “static”.
Use fast Disks.
Because we are trying to bridge Kubernetes concepts and ScyllaDB concepts, I thought it would be very useful to mention the best practices on both sides, the Kubernetes side and the ScyllaDB side. Let’s start with Kubernetes.

The first thing you’ll notice in our documentation and examples is that when you have multiple environments and all these different workloads (applications, databases, monitoring, and so on), it’s very useful to categorize or separate them, and not only workloads but also environments: maybe you have a development environment, a QA environment, and a production environment. Separate all of those using namespaces. They’re only a label separation, but in our examples you’ll see that we have a ScyllaDB namespace for our ScyllaDB nodes, an Operator namespace for the Operator, one for monitoring, and one for ScyllaDB Manager. And of course, if you need a database, you probably have an application that is going to use it; in our examples on GitHub we use a benchmarking tool called cassandra-stress, and you’ll see that we have a cassandra-stress namespace, but you can use whatever names you want.
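As a minimal sketch, that separation could look like the manifests below; the namespace names are just illustrative and don’t have to match the ones in the Operator examples.

```yaml
# Minimal sketch: one namespace per workload, mirroring the separation
# described above. The names are only examples; pick whatever fits your setup.
apiVersion: v1
kind: Namespace
metadata:
  name: scylla            # ScyllaDB nodes
---
apiVersion: v1
kind: Namespace
metadata:
  name: scylla-operator   # the ScyllaDB Operator
---
apiVersion: v1
kind: Namespace
metadata:
  name: scylla-manager    # ScyllaDB Manager
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring        # monitoring stack
```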
Also in Kubernetes we have the concepts of node affinity and taints. A node affinity is something you put on your pods so that they are attracted to Kubernetes nodes carrying a matching label, and a taint is the opposite: by tainting a node you’re basically saying, “don’t schedule any pods on this Kubernetes node unless they have a matching toleration.” If a pod has a toleration that matches the taint, it can be scheduled there. If you look at our examples, you’ll see that the ScyllaDB pods have tolerations, and the Kubernetes nodes in that example have the corresponding taints, so we ensure that the only thing running on those Kubernetes nodes is the ScyllaDB pods. With something like ScyllaDB you can otherwise run into noisy-neighbor problems, so we try to keep it as isolated as possible, even in an environment like Kubernetes.
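As a sketch of the pod side of that pairing, assuming the dedicated nodes have been labeled and tainted with a made-up role=scylla-clusters key/value (for example with kubectl label nodes and kubectl taint nodes), the placement could look like this; in practice the Operator’s placement settings carry these fields rather than a hand-written Pod:

```yaml
# Illustrative Pod showing a node affinity plus a matching toleration.
# The role=scylla-clusters key/value is made up for this sketch.
apiVersion: v1
kind: Pod
metadata:
  name: scylla-placement-example
  namespace: scylla
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: role
            operator: In
            values:
            - scylla-clusters
  tolerations:
  - key: role
    operator: Equal
    value: scylla-clusters
    effect: NoSchedule
  containers:
  - name: scylla
    image: scylladb/scylla
```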
Another concept from Kubernetes is the CPU Manager. Kubernetes has a CPU Manager with two possible policies. The first one is “none”, which is the default; it basically slices the CPUs into cycles and distributes those cycles according to the pods’ needs. We prefer that you use “static”, which ensures that your pods can be granted entire CPUs dedicated only to them. Once the static policy is set in your kubelet settings, the way to get those dedicated CPUs is through the resources you specify for a pod: if you’re familiar with Kubernetes, you know there are two sections, limits and requests, and if they are equal (with whole-number CPUs), the pod gets the Guaranteed QoS class and the CPU Manager gives it CPUs that are completely dedicated and isolated from all other workloads. This is something you should do on the Kubernetes side to make sure you get the best performance when you deploy a ScyllaDB cluster.
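A minimal sketch of both pieces, assuming you can deliver a KubeletConfiguration fragment to your nodes; the image and the resource numbers are only placeholders:

```yaml
# Kubelet side: turn on the static CPU Manager policy (a KubeletConfiguration
# fragment; how it reaches your nodes depends on your distribution/tooling).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
---
# Pod side: equal, whole-number CPU requests and limits (and equal memory)
# put the container in the Guaranteed QoS class, so the static policy pins
# dedicated cores to it. The numbers below are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: scylla-qos-example
  namespace: scylla
spec:
  containers:
  - name: scylla
    image: scylladb/scylla
    resources:
      requests:
        cpu: "4"
        memory: 16Gi
      limits:
        cpu: "4"
        memory: 16Gi
```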
Also, disks: by all means, don’t use network-attached storage. If you’re doing that because you’re worried about persistence, about not losing data, remember that ScyllaDB is already a highly available, distributed database that takes care of things like replication for you. Because the total speed of a system is that of its slowest part, and that’s usually the disks, we recommend that you use the best disks you have available, either SSDs or NVMe drives, and if you have multiple of them, combine them with RAID 0.

And lastly, because it’s Kubernetes and you want everything to happen automatically, use a dynamic local provisioner: with a dynamic provisioner on Kubernetes, when you deploy your ScyllaDB pods you just specify the storage class name for that particular provisioner and how much disk you want, and the provisioner assigns those disks, through persistent volume claims, to your ScyllaDB pods.
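As an illustration of the consumer side, a claim like the sketch below names the storage class of your local provisioner and the capacity you want; the class name and size are placeholders, and in practice the Operator generates claims like this for each ScyllaDB pod from the storage settings in the cluster definition.

```yaml
# Sketch of a claim against a local-disk storage class. Both the class name
# ("local-xfs") and the requested size are placeholders for this example.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-scylla-0
  namespace: scylla
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-xfs
  resources:
    requests:
      storage: 500Gi
```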