Secondary Indexes are a table containing a copy of the key of a base table, with a single partition key corresponding to the indexed column.
Indexes are created for one main purpose: to allow querying by a column that is not a key. CQL tables have strict schemas that define which columns form a primary key, and fundamentally we should use these keys to extract data from the database. But, in practice, we may want to occasionally query by a different, regular column, or several of them. How to achieve that? One of the ways is to create an index on that column.
Global indexes used to be the only kind that ScyllaDB supported, so it may be known to you already under the name “secondary indexes”. They’re easy to use and conceptually simple – we just pick a column to be indexed, create an index, and it just works!
This session covers examples, uses cases, what happens under the hood when Indexes are used.
Local secondary indexes are conceptually similar to global indexes, but there’s one important difference – local indexes guarantee that for each base partition, the corresponding rows in the materialized view will end up on the same node. It’s achieved by using the same partition key in the underlying view as we use in the base table.
The tricky part is querying – not all queries are eligible for local secondary indexes. In order to be able to benefit from a local index, we must provide a partition key restriction as well. As a result, only a single view partition will be queried, and what’s more, the base rows we need to fetch are on the exact same node as the materialized view partition. Hence the name, *local* indexes.
Transcript
Global secondary indexes, again
an index is a table containing a copy of the key of a base table with a single
partition key corresponding to the index column again when we say it like that
it’s kind of what but let’s let’s keep going you know it’ll be clear so here
we have a base table and we are creating not a materialized view we’re creating an
index but Moreno every index has an underlying materialized view yes correct
but again you remember that previously when we were looking at materialized
views your application when it was querying the data it needed to be aware
that there was a materialized view and query straight to the materialized view
in this case when you create an index then the application doesn’t need to be
aware because for all purposes it’s querying the in the base table right
and then ScyllaDB just do an implicit drawing in terms of it’s going to look
to the index it’s going to get the data from the base table and we’ll give
you back the information so this is an example for city so in this case we have
just a partition key and then we create an index which the city is the partition
key pretty simple both of those queries should work if you have the index and
notice that you know you’re querying buildings their, not building by
city you could if you want you can query the underlying materialized view
but I don’t see the point because it’s just easier you know to let your
application to think it’s querying the base table, syntax
I already provide the link on the previous one it’s all one document again
as I mentioned when you create an index it’s creating an underlying materialized view
this is the right path so when you insert into buildings it’s going same
thing that we saw for the materialized view right it’s going to go to the base
table ScyllaDB will create the corresponding record on the materialized
view but when we go to the read case it’s going to, when you query
for buildings and you restricting by city ScyllaDB detects okay I have an index
for this column so it’s going first the materialized view to find that particular
record using the city restriction and then it goes to the base table it doesn’t
implicit join and gives you back your information again we were talking about
global indexes and I want you to take a very good look at this right so
because we are indexing a regular column now we don’t have anymore the same
locality right because it’s it’s going to be a different token a different
token range because you’re using city so it will probably reside in a different
node so every time you are using a global index you know you’re going to
have two hops so this is going to have some overhead in performance right so
because of these two hops then how we can make the query a single hop right so
there’s two things to that and here’s now we are talking about
local secondary indexes so local secondary indexes are indexes that
contain the original partition key it’s as simple as that that’s the difference
global one you’re using any column or any regular column and you’re not
including the original partition key for local secondary index you’re
including that it’s that simple if it’s using the same the same partition will
have the same token if it has the same token it’s going to be the same node the
same shard you guys know the drill so in this case by the way so you have a
coordinator right, your query is hitting the coordinator let’s say it’s node
seven and then both of those queries that I mentioned before
the one that goes to the materialized view and the one that goes to the base table
are in the same node in this case node 2 right but there’s a way it can
be even better if yeah if you’re using a token aware driver the Java one or the
Go one because then your coordinator and your replicas are on the
same node so it gets much better than if you’re not using the shard aware driver
let me go back here again you know this is the example we are doing
right now it’s a restaurant you’re creating an index on you still use
location but you’re adding dish_type so that’s the difference keep that in mind
you’re still you are keeping the partition key from the base table that’s it
it’s not, try not to overcomplicate it