Overview of Admin Procedures and Monitoring

My name is Moreno, and I have been a Solutions Architect for two years. A little heads-up: this is going to be intense. I have about 90 slides and I'll be covering them in about 30 minutes, so I'll probably sound like a maniac on steroids. Here's the thing: when we go to a customer to deliver this training, this session usually takes four hours. So I'm counting on you to go through the exercises and do your homework. I'll just give you a flyover so you can see the land, and after we land you can go and explore for yourself.

I'll be talking about admin tools. The goal is to show you the most common tools that Scylla administrators should use, and to point you to the resources: the documentation, the manual pages, the help, you name it. Give a man a fish and you feed him for a day; you know the rest, right?

Let's start with nodetool and logs. nodetool is the command-line interface for managing a node. There are two types of nodetool commands: the ones that give you information about that node (or sometimes the cluster), and the ones that actually do something. I'll show you some examples. nodetool status is an informative command: it gives the status of all the nodes, whether each one is up or down, and whether it is normal, leaving, joining, or moving. I'm not going to go into the specifics of nodetool commands, first because there's a gazillion of them, and second because each one of them has sub-options.
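To make the up/down and normal/leaving/joining/moving states concrete, here is a minimal Python sketch that reads the two-letter code at the start of each node row in nodetool status output. The sample text is made up for illustration and the helper names (parse_status, unhealthy) are hypothetical; real output has more columns and may differ between versions.

```python
# Two-letter codes from `nodetool status`: first letter is Up/Down,
# second is Normal/Leaving/Joining/Moving.
STATUS = {"U": "up", "D": "down"}
STATE = {"N": "normal", "L": "leaving", "J": "joining", "M": "moving"}

# Made-up sample output for illustration only (not from a real cluster).
SAMPLE = """\
Datacenter: dc1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load    Tokens  Owns  Host ID    Rack
UN  10.0.0.1  1.2 GB  256     ?     host-id-1  rack1
DN  10.0.0.2  1.1 GB  256     ?     host-id-2  rack1
"""


def parse_status(text):
    """Map each node address to a readable 'up/normal'-style state."""
    nodes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        code = fields[0]
        # Header and legend lines do not start with a valid two-letter code.
        if len(code) == 2 and code[0] in STATUS and code[1] in STATE:
            nodes[fields[1]] = f"{STATUS[code[0]]}/{STATE[code[1]]}"
    return nodes


def unhealthy(nodes):
    """Return the addresses of nodes that are not up/normal."""
    return sorted(addr for addr, state in nodes.items() if state != "up/normal")


print(parse_status(SAMPLE))
print(unhealthy(parse_status(SAMPLE)))
```

A check like this fits the advice to look at overall cluster health before doing anything else.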

Sometimes you look at the node level, but you can drill down to a keyspace and sometimes to a table. There's nodetool info and nodetool cfhistograms. The latter is actually interesting, even though the screenshot is not: on a normal table these histograms show the distribution of writes, reads, and partition sizes, which is very nice when you want to find a big partition, that kind of stuff. Then there's table stats: as you can see, the output of just this one command for getting the statistics of a table can be that big, so it depends on what you're looking for.

I always tell people that nodetool is okay, and sometimes we need to go there, but your monitoring should have told you before you needed nodetool. Again, there are several nodetool options. This one might be familiar if you want to run a major compaction on your cluster, or if you're cleaning up after a new node joins, and so on and so forth.

On to logs. Scylla runs on Linux and, after a long battle that some of you may be familiar with, systemd won the war, so now we use journalctl, simply because it's what comes with systemd. The nice thing about journalctl is that you can filter the log messages.

For example, in the first one we are filtering by scylla-server, which is the name of the service that Scylla runs as. But you can drill down: you can filter by priority, by date; there are a million tricks you can do with journalctl. By all means go to the journalctl man page; there's a gazillion options. And whenever we ask you for logs because you're working with the Scylla team (customer or user, it doesn't matter), keep them focused. Let's say your cluster has been running for one year and you had a problem yesterday. Please don't send me one year's worth of logs. Just send me the last two days, something like that, so it's easier to pinpoint where those problems or errors are.

These are the takeaways for nodetool and journalctl. Be familiar with the documentation and the man pages; the links are in this presentation, so it's easy to follow. Always check your overall cluster health before you do anything. Experiment with nodetool in a controlled environment before running things in production. My main advice, and you're going to see this in the hands-on exercises, is: please get a Docker container, have one little cluster in development, and try things there before you start typing commands in production. I've seen some bad stuff happen just because people were not careful. Limit the log output to meaningful periods and use all the filters. Also look at the monitoring, and use cqlsh to test your hypothesis.

Let's keep going: CQL and monitoring. The CQL shell is the easiest and most common tool; it's the first door that you open when you start using Scylla. You get your packages, you install, you run Scylla setup, Scylla is working, and then what's next? You want to create a keyspace, you want to create a table, you want to insert some data. Usually people do that with cqlsh, or they use whatever programming language they love; we have drivers and connectors for pretty much everything. But cqlsh is a good way to prototype and to get a feel for things if you're new to Scylla, NoSQL, or Cassandra, so you can explore a little bit.
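The advice about limiting logs to a meaningful period can be sketched as a small Python helper that builds a journalctl command line. It only assembles the arguments (it does not run anything); the helper name is hypothetical, the unit name scylla-server follows the talk, and -u, --since, -p, and --no-pager are standard journalctl options.

```python
import shlex


def journalctl_command(unit="scylla-server", since="2 days ago", priority=None):
    """Build a journalctl command limited to one unit and a recent time window."""
    cmd = ["journalctl", "--no-pager", "-u", unit, "--since", since]
    if priority is not None:
        # Keep only messages at this priority or more severe, e.g. "err".
        cmd += ["-p", priority]
    return cmd


# Print a copy-pasteable command; redirect its output to a file to share it.
print(shlex.join(journalctl_command(priority="err")))
```

Building the argument list first and quoting it with shlex.join keeps a value such as "2 days ago" intact when the command is pasted into a shell.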

These are some of the options. cqlsh is usually interactive: you run cqlsh and then you start typing commands. You can describe keyspaces and tables, and you can check the consistency level that cqlsh is using. By default I think it's a consistency level of ONE, but you can change it to QUORUM, LOCAL_QUORUM, ALL, whatever. And use the help page, of course.

These are more options, and I want you to pay attention to this one, because “-e” avoids interactive mode. Say I just want to describe my schema: I run cqlsh -e with DESC SCHEMA, it prints the output, and I can redirect that into a file and send it to the Scylla team. Then there's “-f”. Let's say you have 200 inserts you want to do: put them in a file, pass -f with the file name, and cqlsh runs all the CQL statements inside the file. So -f is for a file and -e is to avoid interactive mode. And again: controlled environment, Docker, a dev environment, whatever; don't run things in production first.

Just going back here, you can also set the timeout. Let's say you are doing a SELECT * from a table with 2 million records. It's not going to finish in five seconds running on your Docker setup, so increase the timeout if you're doing something like that.
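As a rough sketch of the two non-interactive modes and the timeout, the Python helper below assembles a cqlsh command line without running it. The function name, host, and file name are hypothetical; -e and -f come from the talk, and --request-timeout is assumed to be available in the cqlsh version in use, so check cqlsh --help first.

```python
import shlex


def cqlsh_command(host, statement=None, cql_file=None, request_timeout=None):
    """Build a non-interactive cqlsh command: one statement (-e) or one file (-f)."""
    if (statement is None) == (cql_file is None):
        raise ValueError("pass exactly one of statement or cql_file")
    cmd = ["cqlsh"]
    if request_timeout is not None:
        # Assumption: this cqlsh build supports --request-timeout (in seconds).
        cmd.append(f"--request-timeout={request_timeout}")
    if statement is not None:
        cmd += ["-e", statement]
    else:
        cmd += ["-f", cql_file]
    cmd.append(host)
    return cmd


# Dump the schema, and run a file of statements with a longer timeout.
print(shlex.join(cqlsh_command("10.0.0.1", statement="DESC SCHEMA")))
print(shlex.join(cqlsh_command("10.0.0.1", cql_file="inserts.cql", request_timeout=60)))
```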

But by all means, don't plan your queries around long run times. One of the reasons you're using Scylla is that you want it to be fast, so design your queries accordingly. So: -e, -f, and by all means have fun. I had a lot of fun when I joined Scylla. I didn't know anything about CQL; it took me about two days of just playing around to become familiar with it.

For more complex environments and simulations, don't use cqlsh. If you want to experiment with replication factor and consistency level, see latencies, or ingest a lot of data, cqlsh is not for that. cqlsh is for very specific stuff and for playing around. If you want to run a simulation, use something like cassandra-stress or write some code in your favorite language.

Monitoring. All I want to explain is how the monitoring works, so please pay attention to this. The monitoring is easy to use. Following our instructions, deploying a monitoring stack like this takes ten minutes tops if you're doing it slowly; if you're just copying and pasting, it can be two minutes.

You have Scylla running on the nodes. By default Scylla exposes its metrics on port 9180, and port 9100 is for node_exporter. So Scylla exports the Scylla metrics and node_exporter exports the OS metrics: two sets of metrics, two ports, every time. Our monitoring solution is made up of Prometheus, Alertmanager, and Grafana, and they're all accessible through your browser. Usually Prometheus runs on port 9090 and Grafana on port 3000. Prometheus constantly queries both of those ports on each of the nodes; every time it reaches out, it gets the metrics and saves them. Prometheus is the time series database.
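The "two sets of metrics, two ports" layout can be illustrated with a short Python sketch that lists the URLs Prometheus would scrape for a set of nodes. The node addresses and the function name are hypothetical; the ports are the defaults mentioned in the talk.

```python
SCYLLA_METRICS_PORT = 9180   # Scylla's own metrics
NODE_EXPORTER_PORT = 9100    # OS-level metrics from node_exporter


def scrape_urls(nodes):
    """Return the two metrics endpoints Prometheus polls on each node."""
    urls = []
    for node in nodes:
        for port in (SCYLLA_METRICS_PORT, NODE_EXPORTER_PORT):
            urls.append(f"http://{node}:{port}/metrics")
    return urls


for url in scrape_urls(["10.0.0.1", "10.0.0.2"]):
    print(url)
```

Opening one of these URLs in a browser, or fetching it with any HTTP client, is a quick way to confirm that a node is actually exporting metrics.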

After that, the metrics are in Prometheus. We have a gazillion metrics, I kid you not. I'll show you a little on the next slide, but we have more metrics than you're ever going to use. It's better to have the metrics and not need them than to need them and not have them, right? So Prometheus stores them, and then in Grafana we created dashboards. They're beautiful and they're useful; they're the best thing.

When you go to Grafana, we have about six different dashboards. I think I have a note here with all their names. Yes: overview, detailed, CPU metrics, OS metrics, I/O, and Scylla CQL. At the top of our Grafana dashboards you see all those selectors. The usual way to look at the monitoring is to start at a very high level, looking at the entire cluster. Then you choose one cluster, then one DC out of that cluster, then one node out of that DC, and you can even drill down to the shard level.

That's usually what we do when we're troubleshooting problems, and I recommend you do the same. Even if you're not troubleshooting anything, look at the monitoring while you're doing things. When you're ingesting data, take a look: it's going to tell you a lot about Scylla's inner workings. You'll see how CPU is being used, how memory is being used, how I/O is happening, how often compactions are kicking in. So by all means use that.

This is what the monitoring configuration file looks like. It's basically a list of nodes for one cluster in a particular DC; that's it. We even have a script that generates this file for you: you just list the cluster name, the DC, and the nodes, and it generates the YAML, so you don't get into trouble. Because, come on, it can get messy, right? If you've dealt with YAML, you know what I'm talking about.

Our monitoring can run on Docker, and that's what I recommend because it's just so easy. But if you already have your own Prometheus and your own Grafana, and you're already using them to monitor other stuff in your company, it's pretty simple: you add the nodes to Prometheus and you run the load-grafana script that comes with our monitoring, which loads all the dashboards into your Grafana.
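To show what "a list of nodes for one cluster in a particular DC" might look like, here is a minimal Python sketch that writes such a list in the Prometheus file-based target format (targets plus labels). This is not the script that ships with the monitoring stack; the function name, cluster name, and addresses are hypothetical, and the exact file layout the stack expects should be taken from its documentation.

```python
def servers_yaml(cluster, dc, nodes):
    """Render one target group: the nodes of one cluster in one DC."""
    lines = ["- targets:"]
    lines += [f"    - {node}" for node in nodes]
    lines += [
        "  labels:",
        f"    cluster: {cluster}",
        f"    dc: {dc}",
    ]
    return "\n".join(lines) + "\n"


print(servers_yaml("cluster1", "dc1", ["10.0.0.1", "10.0.0.2"]), end="")
```

Generating the file from a plain list avoids the indentation mistakes that make hand-edited YAML messy.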

Scylla metrics. Remember I mentioned we have a gazillion metrics? This is page one out of, I don't know, a billion. If you go to port 9180 on any Scylla node, at /metrics, this is what you're going to see. The metrics themselves are here, but each one also has a help text and a metric type. In this case, this one is a counter, and it's the total number of sent messages. So this is how you find out what a particular metric is if you need one that isn't on a dashboard for some reason. Or maybe you're doing some reverse engineering: you went into our Grafana, you saw the name of a metric, and you want to know what it is. You just look there.

This is an example of Alertmanager; remember I mentioned it earlier. Alertmanager works alongside Prometheus, and this is basically how you set up an alert. There's a file for the rules (what is its name, the rule config, yep), and you can use any of the Prometheus metrics to generate an alert. The threshold is up to you: some people want to alert on one value, some people on a different value. So here you set the metric you're using, the threshold you want, the severity of the alert, and how frequently it should look into Prometheus for that; the description is what shows up in the integration. So you have Alertmanager integrated with Prometheus, looking at the metrics, and you can configure Alertmanager to send alerts to pretty much everything: email, PagerDuty, Slack, Telegram, you name it.
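Looking up what a metric means can be sketched in Python by reading the HELP and TYPE comment lines of the Prometheus text format. The sample text and the metric name example_messages_sent_total are made up for illustration; on a real node the text would come from the /metrics page on port 9180.

```python
# Made-up sample in the Prometheus text exposition format.
SAMPLE = """\
# HELP example_messages_sent_total Total number of sent messages
# TYPE example_messages_sent_total counter
example_messages_sent_total{shard="0"} 42
"""


def describe_metrics(text):
    """Collect the help text and type of each metric from its comment lines."""
    info = {}
    for line in text.splitlines():
        parts = line.split(None, 3)  # "#", keyword, metric name, rest of line
        if len(parts) == 4 and parts[0] == "#" and parts[1] in ("HELP", "TYPE"):
            info.setdefault(parts[2], {})[parts[1].lower()] = parts[3]
    return info


for name, meta in describe_metrics(SAMPLE).items():
    print(name, meta.get("type"), "-", meta.get("help"))
```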

Those are the integrations that are available for Alertmanager. The reason we ship some alerts with our stack is so that you have examples: you can look at what's there and create your own based on that. The rule_config.yml is the one that I showed before, and the word that I was looking for is the receiver, which is where you integrate things like Slack or PagerDuty. We have an example here for email: this is what it looks like, and that is the mail that we receive.

Common problems. Generally, problems with the monitoring come down to ports or to things not running. If you're running on Docker, the first thing you should do is run docker ps -a, for example, and check that the containers are up and running. If you're running your own stack, check that Prometheus is running, that Grafana is running, and that all the ports are open between the monitoring and the nodes (those two ports I mentioned in the beginning) and among the monitoring components themselves. So Prometheus is up on port 9090, Grafana on port 3000, and all the ports are there. Troubleshooting is pretty easy: usually it's just a matter of looking at netstat and at the firewall rules. Some more examples: I showed you port 9180 for the Scylla metrics, and it's the same thing for node_exporter but on a different port, as I mentioned in the beginning.

Keep your monitoring stack up to date, because we're constantly improving the dashboards. People run into problems, and when one is really hard to troubleshoot and we feel we're missing a panel on a dashboard, we add it.
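Since most monitoring problems come down to ports, a quick reachability check can be sketched in Python with a plain TCP connection attempt. The function names are hypothetical and the port numbers are the defaults from the talk; a successful connection only shows that something is listening, not that it is the right service.

```python
import socket

# Default ports mentioned in the talk.
MONITORING_PORTS = {"prometheus": 9090, "grafana": 3000}
NODE_PORTS = {"scylla metrics": 9180, "node_exporter": 9100}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered by a firewall, timed out, or host unreachable.
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each named port to whether it is reachable on the given host."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Calling check_ports with a node address and NODE_PORTS from the monitoring machine tests the same path that Prometheus uses, which is where firewall rules usually get in the way.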

So by all means keep the monitoring up to date; it's going to benefit you.

Specify a data directory for Prometheus, because Prometheus saves everything as files. Make sure you specify a specific mount point or file system for the Prometheus data, so that if in the future you upgrade Prometheus or move it to a different machine, you know exactly where those files are. That makes it easy to move and easy to migrate.

As I mentioned before, always look at the monitoring. It doesn't matter if you don't have a problem. Excellent; look at your workloads anyway, because then you can predict whether six months from now you're going to run out of disk, or whether three nodes won't be enough. So always keep an eye on the monitoring.

Create alerts that are important for your application. Don't create alerts for things that don't matter to you; create alerts only for the really important things. Let's say you're running tight on storage: put an alert on storage. If you don't care about latency, don't create an alert for latency. Otherwise the alerts are just annoying, and when an important one fires you're going to miss it.

We will ask you for monitoring data every time. When you come to me and say “Moreno, I have a problem”, if it's something basic, as I mentioned, we'll look at the YAML file; maybe there's a misconfiguration. But if it's a serious problem, a performance problem, the first thing I'm going to ask is: give me your monitoring data. And there are two ways you can do that.

You can take screenshots. That's okay and I will look, but then I cannot manipulate the data. The best way is this: remember the data directory that I mentioned for Prometheus? Get those files and send them to me, because then I can replay the data on my own computer, drill down, and do lots of things that I can't do with screenshots. And this is a good experiment: whenever you're using the monitoring, try to get the monitoring data, put it on your laptop, and replay it. It's fun.

cassandra-stress. cassandra-stress is a tool that came from Cassandra to generate load on your cluster so you can ingest data. Let's say you want a dataset with 300 million records and you want to see how the cluster behaves while you're ingesting the data: use cassandra-stress for that. We use cassandra-stress extensively. If you go to the Scylla website, most of our benchmarks use cassandra-stress, and some use YCSB. This is also what you use to test a real use case. You can use basic commands like this one, which writes (I think it's ten billion) records with a consistency level of ONE, using sixteen connections, five columns of 64 bytes, and a given replication factor; you could do the same for reads. Or you can create your own personalized workload and specify your schema: I have these columns with these data types, and the size of this field can be from X to Y. It can get complex very fast, but the good thing is that you can simulate exactly what you want for your application and then experiment with that.
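A basic write workload like the one described can be sketched as a Python helper that assembles a cassandra-stress command line without running it. The helper name and all values are illustrative: the row count and replication factor are parameters because the talk does not pin them down, and mapping "sixteen connections" to connectionsPerHost is an assumption (the slide may have used a different option). Verify every option against the cassandra-stress help output for your version.

```python
import shlex


def stress_write_command(node, rows, replication_factor, consistency="ONE",
                         connections=16, columns=5, column_bytes=64):
    """Build a basic cassandra-stress write command (argument list only)."""
    return [
        "cassandra-stress", "write", f"n={rows}", f"cl={consistency}",
        # Assumption: connection count is set through the -mode sub-options.
        "-mode", "native", "cql3", f"connectionsPerHost={connections}",
        # Fixed number of columns, each with a fixed size in bytes.
        "-col", f"n=FIXED({columns})", f"size=FIXED({column_bytes})",
        "-schema", f"replication(factor={replication_factor})",
        "-node", node,
    ]


# Illustrative values only; run this against a test cluster, never production.
print(shlex.join(stress_write_command("10.0.0.1", rows=1_000_000, replication_factor=3)))
```

Swapping the write keyword for a read variant, or moving to a user-defined profile, follows the same pattern once the basic command works.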

Let's say I'm not sure which compaction strategy to use; I'm on the fence. You know what? Run two workloads with different compaction strategies and everything else the same, and see which one comes out ahead. User-defined mode is great for simulating these real workloads. Sometimes cassandra-stress has limitations that you cannot get past, and in that case you have to write your own code, do your own hack, but usually it solves most of the problems. You can test different types of workloads, as I mentioned: consistency level, replication factor, compaction strategies. Maybe you're trying a new feature from Scylla; maybe you want to test workload prioritization or incremental compaction strategy. Go for it. And again, use it in a controlled environment. There's no good reason I can come up with for running cassandra-stress on a production cluster. This is for simulating stuff; you don't want to use it in production. The documentation is right there, and there's a gazillion options, as usual.

Tracing. Tracing enables analysis of internal data flows in a Scylla cluster. (That's not necessarily true, but okay.) It's useful for observing the behavior of specific queries and for checking network issues, data transfers, and replication factor problems. First, CQL tracing.

There are two types of tracing; that's what I want you to understand. This is client-side tracing: cqlsh is doing the tracing, and at every step of the way, as it communicates with Scylla, it brings you some information about that step. You can do the same from any programming language; all the drivers have tracing and you can enable it. It's never on by default, because that would be bad: it's going to hurt your performance. CQL tracing is stored in the system_traces keyspace. This is an example. Remember that cqlsh is interactive, so I just type TRACING ON, and after that everything I do in cqlsh brings me the result and also the entire tracing session. So if you're having problems and you know exactly which query or insert is causing them, and you have the partition key, you can experiment here.

We also have probabilistic tracing. Probabilistic tracing is something that you enable with nodetool, and a value of 1 means 100 percent of whatever you're doing, so be very careful with this number. Remember that 1 is a hundred percent, so this value here is 0.01 percent. And again: “But Moreno, what number should I set?” You should have your own workload in mind. If you're doing a hundred operations per second, then that's more than enough. If you're doing 1 million operations per second, maybe you want an even smaller value, because you want to sample very gently so it doesn't impact you. And by the way, with all tracing, you should turn it on for a very small window. You don't want to leave it on forever; it's going to hurt you.
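The arithmetic behind choosing a trace probability can be sketched in Python: a probability of 1 traces everything, so the expected number of traced requests is simply the probability times the request rate. The function names and request rates are illustrative; nodetool settraceprobability is the command assumed here for setting the value.

```python
def expected_traces_per_second(probability, requests_per_second):
    """Expected traced requests per second; a probability of 1.0 means 100%."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * requests_per_second


def settraceprobability_command(probability):
    """Build the nodetool command that sets the sampling probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return ["nodetool", "settraceprobability", format(probability, "f")]


# 0.0001 is 0.01 percent: gentle at low rates, still a lot at a million per second.
for rate in (100, 1_000_000):
    print(rate, "req/s ->", round(expected_traces_per_second(0.0001, rate), 4), "traces/s")
print(" ".join(settraceprobability_command(0.0001)))
```

Setting the probability back to 0 after a short window is what turns the sampling off again.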

Then we have slow query tracing, and the way to enable it is through the REST API. We have the documentation here; again, by all means do your hands-on exercises. Basically you go to the API, you enable slow query tracing, and you also specify for how long: usually a few seconds, 10 to 30 seconds, no more than that, because it's going to collect a lot of stuff. It's the type of tracing you want to use when you're clueless about what's going on in your database, maybe because you're the database administrator and you don't know what the developers are doing; they're just complaining about performance. Sometimes you grab a couple of slow queries, show them, and say, “Hey, you're passing 16 million values in an IN clause.” I saw that at a customer, and, well, then you have a problem, right? Don't do that.

This is another example of CQL tracing. We have two DCs, and this one is a query against just the local DC, so we're probably using something like a consistency level of LOCAL_QUORUM. When we use QUORUM, it uses both DCs, so this is what a cross-DC trace looks like.

So, tracing is a costly operation. Don't enable it by default, and use it only for small periods of time. When in doubt, ask the Scylla team. Tracing can be tricky, and we are always there: Slack, email, whatever. If you don't know whether you should use some type of tracing for the particular troubleshooting you're doing, just come to us and we'll advise you. CQL tracing, like any other client-side tracing, is great for specific queries but might not give you the full picture. Slow query tracing is a great resource if you are clueless about which queries are impacting your cluster. We do have other types of tracing, but they're reserved for advanced sessions and are usually requested by a developer: you have a problem, we're working with our engineering team, and sometimes they will ask you for more specific stuff. They're very rarely used, so there's no point in going into them here.

Admin procedures. First, bootstrapping a new node into a cluster. You already had other sessions, so I'm assuming you know the drill: you install Scylla from the repository, you run Scylla setup, you edit your YAML file, and there you go, you have a node running. It's the same steps here. You just have to make sure that your YAML has the same cluster name and that your IPs, ports, and addresses are right; then the node contacts your seed and joins the cluster. It cannot be more straightforward than that. And again, I know we have limited time here; this is the type of thing that you should do in the hands-on session.

Let's say you made a mistake in your YAML file, for example you put in the wrong cluster name, and now the node is not going to join. You just clear the commit log and the data files, fix the YAML, and restart your node, and it should join. People make mistakes, and that's how you get around that problem. That's the process.

So that is the new node: it talks to the seed, the seed tells it the topology, and it joins. It's pretty straightforward. After the new node has joined, you run nodetool cleanup on all the other nodes. The token ranges were redistributed, so you want the old nodes to get rid of the data for the token ranges that were transferred to the new node.

Removing a node. We have three use cases: hardware problems, reducing cost, and lower demand. And there are two situations in which you remove a node. In the first, you're planning ahead: you're removing the node for whatever reason, maybe you have a lower workload right now. If all the nodes are in a healthy state, you can just issue nodetool decommission. It streams all the data back to the other nodes, redistributes everything, and then you're done.

In the other situation you've already lost the node: you had a hardware problem or some kind of failure. Then you can do one of two things. You can remove that node for good: you run nodetool removenode to let the cluster know that the node is not coming back, so it redistributes the token ranges, and after that, of course, you run a repair. Or you can replace the node: you lost a node, so you bring in another box, another VM, whatever, and you replace it.
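The choice between decommission, removenode, and replacement can be summarized as a small decision function. This Python sketch only returns a checklist of steps as text; the function name and the step wording are illustrative, and the host ID placeholder must come from the real cluster.

```python
def removal_steps(node_is_alive, replace_with_new_node=False):
    """Return the ordered steps for taking a node out of the cluster."""
    if node_is_alive:
        # Planned removal: the node streams its own data to the others.
        return ["run 'nodetool decommission' on the node that is leaving"]
    if replace_with_new_node:
        # The node is gone and a new machine takes over its place.
        return [
            "bootstrap a new node with the replace-address option set to the lost node's IP",
            "run 'nodetool repair' once the new node is up and normal",
        ]
    # The node is gone for good and will not be replaced.
    return [
        "run 'nodetool removenode <host-id>' from a live node",
        "run 'nodetool repair' afterwards",
    ]


for step in removal_steps(node_is_alive=False):
    print("-", step)
```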

It's the same procedure that we used to bootstrap a node, but with the added replace-address option. You take the IP of the node that you lost and set it as the address to replace, and the new node joins and streams all the data again. Whenever you've had a problem in your cluster, it's wise to run a repair after that.

Seed nodes are nothing special; a seed is just a point of entry to your cluster. But there is one thing you should keep in mind regarding seeds: when you join a new node, or when you're replacing a node, never join the cluster as a seed, because then the node is not going to stream any data. Join as a regular node, and once it's up and normal, you can promote it to a seed.

The last procedure is adding a data center. Basically, you add the new data center to your keyspaces with a replication factor of 0. Once you have all your nodes ready and you did everything as we mentioned before (installation, Scylla setup, configuring the YAML), you start those nodes. After the DC is there, you change the replication factor for that DC to whatever you want. In this case it's 3, but maybe you want a different replication factor on the new DC. Then you run nodetool rebuild with the new DC name and the existing one, and it streams all the data from the first DC and builds the new DC.
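The two keyspace changes in the add-a-data-center procedure can be sketched as generated CQL. This Python helper only builds the statement text; the keyspace and data center names are hypothetical, and NetworkTopologyStrategy is assumed because per-DC replication factors require it.

```python
def alter_keyspace_cql(keyspace, dc_replication):
    """Build an ALTER KEYSPACE statement with one replication factor per DC."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in dc_replication.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Step 1: introduce the new DC with a replication factor of 0.
print(alter_keyspace_cql("my_keyspace", {"dc1": 3, "dc2": 0}))
# Step 2: once the new nodes are up, raise it to the value you want.
print(alter_keyspace_cql("my_keyspace", {"dc1": 3, "dc2": 3}))
```

The rebuild on the new nodes comes after the second statement, since that is the point where the new DC is expected to hold data.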

I know it's a lot of stuff and we have very little time. Backups are done with nodetool snapshot. It puts the snapshot on the path that you see there: /var/lib/scylla/data, then the keyspace name, the table name, and the snapshots directory; they're all going to be there. You can use nodetool listsnapshots to see all the snapshots that you have, and nodetool clearsnapshot if you want to get rid of the old ones. If you want to restore one of the snapshots, you clear the commit log, you clear the DB files for that particular table, you put the snapshot files there, and you restart the node. Optionally, you can put the files in the upload directory and run nodetool refresh. But again, the documentation is there.

What else? I think that's what I had for today. I told you guys it was going to be intense.
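To see where snapshots end up on disk, here is a small Python sketch that walks a data directory laid out as keyspace, then table directory, then snapshots, then snapshot name. The function name is hypothetical and the layout is an assumption based on the path described above; on a real node, nodetool listsnapshots remains the authoritative source.

```python
from pathlib import Path


def find_snapshots(data_dir):
    """Group (keyspace, table directory) pairs by snapshot name."""
    found = {}
    # Assumed layout: <data_dir>/<keyspace>/<table dir>/snapshots/<snapshot name>
    for snap in sorted(Path(data_dir).glob("*/*/snapshots/*")):
        if snap.is_dir():
            keyspace, table_dir = snap.parts[-4], snap.parts[-3]
            found.setdefault(snap.name, []).append((keyspace, table_dir))
    return found


if __name__ == "__main__":
    # Default data directory mentioned in the talk; prints nothing if it is absent.
    for name, tables in find_snapshots("/var/lib/scylla/data").items():
        print(name, tables)
```

Knowing which table directories a snapshot covers is the starting point for both restore paths: copying the files back before a restart, or placing them in the upload directory and running nodetool refresh.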
