In this lesson, you’ll see a sample application with monitoring and important metrics to track. You’ll learn about using prepared statements, the benefits of using them, and the value of using token aware drivers.
When users approach us on the Slack channel with questions, the number one question we ask back is: did you install scylla-grafana-monitoring? If you haven't, please do. A lot of users don't install it until they have an issue, and my session is about why you should use scylla-grafana-monitoring while you develop your application, and what it allows you to do during development.
ScyllaDB exposes hundreds of metrics. Many of them have to do with how ScyllaDB ticks: its internal parts and how they interact with each other. Others have to do with how ScyllaDB interacts with your hardware: is it working well, are any queues filling up, are we getting delays from the hardware? Another subset has to do with the cluster: how is the cluster doing? Again, we're running a distributed database, so a query will probably end up on more than one node. But a subset of those metrics has to do with how well you've written your application and how well you've set it up to work with ScyllaDB. My session will focus on these metrics, and a bit on data modeling.

For the sake of this session I've written a sample application. It's written in Go, following the Go CQL example, so I didn't create any new issues; I just copied the code and extended it a bit. It's normal, when you start off, to copy code from somewhere and start augmenting it for whatever you need. The question is: did you catch all the items you should have fixed before you went to production? There is a cost to when you catch these things: if you catch them early on, for example using prepared statements instead of building strings, they're much easier to fix. So again, that's my application, but a side note: I'm not a Go developer, or, to quote Henrik, who made it Go-ish, it looks like a C++ developer hacked some Go code just to show something. I apologize if it's not that slick to the Go developers, but it's already up. The sample
application is based on two different tables. There is a tweets table, into which we insert the tweets, and there is a timeline table, into which we insert the information the user is interested in; so if I'm following a user, my timeline will have the tweets coming from that user. We added an additional field called liked, with which I can mark tweets I'm interested in or liked. Again, for each user tweet we insert it into the tweets table, and that is one statement; for each follower of a user, we insert into the timeline of that follower. In the sample application we have a single read query, in which we read from the timeline, and we read only the first 50 lines, the last 50 events.

The setup: the sample app starts with four
nodes, a single client, a single cluster. Let's get it running: I have here a four-node cluster running in Docker, not yet with Docker Compose, but it will be. And let's start the client.
I've created a new dashboard. It's not yet live; it's part of the sample app, and we will probably propagate it later to the default dashboards. This dashboard shows how well your application is running. Starting from the left, let's tackle the items one by one. What we see here is that only 1% of the requests are prepared statements, and then you have the numbers. So let's fix that: to get the gauge into green, 100% of your requests need to be prepared statements. And why is that?
When you send ScyllaDB a string which holds the request, ScyllaDB needs to parse it: it needs to figure out which table you're accessing, find the columns you're inserting into, validate that the query is correct, and extract the values for the columns. All of this repetitive work happens on every request, and it's saved when you use prepared statements: the parsing and building of the template are done only once, and afterwards you can reuse it. In addition, if you're after the best performance you can get, you're actually hurting yourself: this extra processing costs you time, since you do the processing and then the statement itself, and that adds latency. On top of that, it doesn't allow us to do optimal routing. So in step 2 of the application we change the query from a simple built string into a prepared statement, in which we build the template and then pass the arguments; in Go CQL, that's the simplest form I found.
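To make the difference concrete, here is a minimal, stdlib-only sketch of the two patterns. The table and column names are illustrative; in gocql, queries that use `?` placeholders are prepared and cached by the driver, so only the bound values travel with each request:

```go
package main

import "fmt"

// buildQuery shows the anti-pattern: a brand-new CQL string per request,
// which the server must parse, validate, and extract values from every time.
func buildQuery(user, tweet string) string {
	return fmt.Sprintf("INSERT INTO timeline (follower, body) VALUES ('%s', '%s')", user, tweet)
}

// The prepared pattern: one constant template with placeholders. The server
// parses and builds the template once; afterwards only the arguments are sent.
const insertTemplate = "INSERT INTO timeline (follower, body) VALUES (?, ?)"

func main() {
	fmt.Println(buildQuery("alice", "hello scylla")) // a different string for every tweet
	fmt.Println(insertTemplate, "alice", "hello scylla") // the same template, new arguments
}
```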
Instead of switching back and forth, for the sake of this talk only the first step I'll do on my computer; the rest are screenshots. When we fix this, we get the first gauge, prepared statements, to 100%. Then we have token awareness, and that is only at 74%, so out of the requests being sent, 26% are not hitting the correct coordinator. If we look at this: the client is sending a prepared statement, but it's not sending it to the right node; it's sending it to a coordinator that is not a replica. The node that receives the request doesn't have the data, so it needs to relay the request to other nodes in the cluster, and that is of course an additional hop, additional unneeded processing. In step 3 of the sample application we fix this, and the way to do it in Go CQL is to define a host selection policy; we used a token-aware host policy. Once we set this up and run step 3, everything is green. Great, so we're very happy: it was very simple, and we fixed two small issues.
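For reference, the gocql side of step 3 is only a few lines when building the cluster config. This is a sketch: the contact point and keyspace are placeholders, and round-robin is just one common choice of fallback policy:

```go
package main

import "github.com/gocql/gocql"

func newSession() (*gocql.Session, error) {
	cluster := gocql.NewCluster("127.0.0.1") // placeholder contact point
	cluster.Keyspace = "tweets"              // placeholder keyspace

	// Token-aware routing: the driver hashes the partition key and sends
	// each request straight to a replica, avoiding the extra coordinator
	// hop described above. Round-robin picks among the matching replicas.
	cluster.PoolConfig.HostSelectionPolicy =
		gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy())

	return cluster.CreateSession()
}
```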