How do we manage identities with users? The lesson covers Identity, Authentication, Users and passwords, and Availability.
And the first area I want to focus on is managing users and identity. So what
is identity? Broadly speaking, identity is the property that we want to know
who is interfacing with our system, and in the case of ScyllaDB that means,
very simply, who is interfacing with the database. Another important property
is authentication. Again broadly speaking, in security, authentication is the
concept that we want to verify the truth of something, some information that
is provided by somebody who has been identified. In the case of ScyllaDB, what
we want to know is: can we trust that this user who is presenting a certain
identity to us is in fact who they say they are? How can you verify the truth
of an identity? In ScyllaDB these concepts are exemplified through users and
passwords, as they are in many, many security systems that we use today. And
so in ScyllaDB we can manage users directly through CQL. In this example I’ve
created a Joe Smith user with a certain password, a very secure password, and
I can identify myself to ScyllaDB with this username. We’re all very familiar
with that, if not in ScyllaDB then more broadly from lots of other systems
that we use. In ScyllaDB the
way we store this data is in a table in ScyllaDB itself. We like to be
recursive like that, so we have a nice table for users where we store this
information, and when you log into ScyllaDB we query that table to determine
whether or not you’ve been authenticated in the system. I know this is review
for most of you, but the way we actually manage passwords and store them
securely is through what’s called a one-way cryptographic hash function. These
are mathematical functions that, given a particular input, produce an output,
usually a sequence of bytes, and they have the property that it’s practically
impossible to determine what the input was given a known output. So if I have
a string like deadbeef in hex, it’s practically impossible, within the bounds
of the computing that we have today, to identify what password produced that
sequence of bytes. To implement password
management in ScyllaDB we just used the crypt_r function from the GNU C
library, which is a historically very stable set of functionality. crypt_r
supports lots of different cryptographic hashing functions and schemes for
doing that hashing. One of the most conventionally accepted and recommended
schemes is called bcrypt, but unfortunately bcrypt is often not supported on
Linux distributions like Ubuntu and Fedora and others, so we fall back to
SHA-512 most of the time.
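To make the one-way property concrete, here is a minimal Python sketch. It is not what ScyllaDB does internally (that goes through crypt_r, as described above); it only uses plain SHA-512 from Python's standard hashlib module, and the function names and the sample password are made up for illustration. It also leaves out the salt and everything else a real password scheme adds, so treat it as a demonstration of hashing and comparing, not as a way to store passwords.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-512 digest of the password as a hex string."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_digest: str) -> bool:
    """Check a login attempt by hashing it and comparing the two digests.

    The stored digest is never turned back into a password; the only thing
    we can do with it is hash a guess and see whether the outputs match.
    """
    return hmac.compare_digest(hash_password(candidate), stored_digest)


# Hypothetical stored value: only the digest is kept, not the password.
stored = hash_password("a very secure password")

print(len(stored))                                        # 128 hex characters
print(verify_password("a very secure password", stored))  # True
print(verify_password("a wrong guess", stored))           # False
```

The same input always gives the same digest, which is what makes verification possible, and it is also why an unsalted scheme like this one can be attacked with precomputed tables.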
I think I wanted to talk about one more thing, let me go backwards. Yes,
additionally I want to emphasize that we add “salt” to our passwords. For
those of you who aren’t familiar with that, the reason is that given only a
cryptographic hash function and nothing else, an attacker could actually just
collect a large database of common passwords and the resulting cryptographic
hashes, because we advertise the function that we use, and they could just
compare those directly instead of bothering to guess the actual password. So
what we do is add some extra randomly generated data to the password before
we hash it, which renders these pre-computed outputs totally useless, and we
do all this through crypt_r as well. So I want to discuss also this concept
of availability, which is actually a critically important security property
that a lot of people don’t necessarily consider: our systems exist for a
reason, and if we can’t use them they’re effectively useless. You’re all
probably familiar with denial of service attacks, or DDoS, distributed denial
of service attacks. What these do is overload systems, rendering them unable
to serve regular requests from users. Again, if you have a building, for
example, full of secure doors and nobody can walk into your building, well,
you can’t do business, and you can imagine how this extends to lots of
different things, maybe your own business. So this idea that a system should
be available to do the thing it’s designed to do is a very important one.
We address this in ScyllaDB with users by replicating the user metadata, and
we recommend that you replicate it based on N, the number of nodes you have,
which means that if a particular node is down, users can still log into the
other nodes in the system and access the database; you’re not going to be
locked out of the system. Lastly, I want to talk about identity management in
enterprises.
As I said, we manage all of this information in ScyllaDB directly, but large
companies frequently use external identity management or directory service
tools. This is very important, if on the surface boring, stuff that allows you
to manage your employees’ email addresses and contact information and the
groups they belong to, this kind of stuff. Often they’re very expansive,
important systems, and we want to be able to integrate with them so that you
don’t have to recreate all that information in ScyllaDB. A very common way to
interface with these kinds of systems is LDAP, the Lightweight Directory
Access Protocol. We intend to support this feature in a future version of the
Enterprise Edition of ScyllaDB, and I’m actually actively working on
integrating LDAP with Seastar. This is how we envision it working.
Effectively, a client,
whether it’s a driver in code or somebody typing in cqlsh, will provide an ID
and password to ScyllaDB. ScyllaDB will take that ID and from it, based on
some user-configurable rules, execute a query against your directory service,
your DS. The directory service will execute that query based on what ScyllaDB
passed to it, and hopefully it’ll result in a single unique entry
corresponding to one person. Given that entry, which is identified by a DN in
LDAP, ScyllaDB will then attempt to bind, or authenticate, the DN with the
password that was provided, and if the directory service accepts it then
ScyllaDB will acknowledge that the user is authenticated correctly. One
important thing to note about this scheme is that, like other systems which
provide access via LDAP, ScyllaDB would be dependent on the availability of
your directory service. That could be an interesting constraint depending on
your environment, but likely you’re not employing these kinds of systems
unless you already have expertise in managing them.
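The search-then-bind flow described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: ToyDirectory is a hypothetical in-memory stand-in, not a real LDAP client or server, the entries and the "uid" search attribute are invented, and none of this reflects how ScyllaDB or Seastar actually implement the feature.

```python
from typing import Dict, List


class ToyDirectory:
    """Hypothetical in-memory stand-in for a directory service (not real LDAP)."""

    def __init__(self, entries: List[Dict[str, str]]):
        self._entries = entries
        self.available = True

    def _check_up(self) -> None:
        if not self.available:
            raise ConnectionError("directory service is unreachable")

    def search(self, attribute: str, value: str) -> List[str]:
        """Return the DNs of all entries whose attribute equals value."""
        self._check_up()
        return [e["dn"] for e in self._entries if e.get(attribute) == value]

    def bind(self, dn: str, password: str) -> bool:
        """Accept the bind only if the DN exists and the password matches."""
        self._check_up()
        for e in self._entries:
            if e["dn"] == dn:
                return e["password"] == password
        return False


def authenticate(directory: ToyDirectory, login_id: str, password: str,
                 search_attribute: str = "uid") -> bool:
    # Step 1: turn the login ID into a directory query. The attribute to
    # search on plays the role of the "user-configurable rule" here.
    matches = directory.search(search_attribute, login_id)
    # Step 2: the query has to resolve to exactly one entry.
    if len(matches) != 1:
        return False
    # Step 3: try to bind as that DN with the password the client supplied.
    return directory.bind(matches[0], password)


# Made-up entries for illustration only.
directory = ToyDirectory([
    {"dn": "uid=joe.smith,ou=people,dc=example,dc=com",
     "uid": "joe.smith", "password": "s3cret"},
    {"dn": "uid=ann.lee,ou=people,dc=example,dc=com",
     "uid": "ann.lee", "password": "hunter2"},
])

print(authenticate(directory, "joe.smith", "s3cret"))  # True
print(authenticate(directory, "joe.smith", "wrong"))   # False
print(authenticate(directory, "nobody", "s3cret"))     # False
```

If the toy directory is marked unavailable (directory.available = False), search raises ConnectionError and nobody can log in, which is the availability dependency mentioned above.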