

5 min to complete


A Keyspace is a top-level container that stores tables, with attributes that define how data is replicated across nodes. It defines a number of options that apply to all the tables it contains, the most prominent of which is the replication strategy.

A keyspace is comparable to the concept of a database schema in the relational world. Since the keyspace defines the replication factor of all the tables it contains, tables that require different replication factors must be stored in different keyspaces.

To create a Keyspace, we use the CREATE KEYSPACE command (don’t run this yet; a hands-on example follows later on).

CREATE KEYSPACE Pets_Clinic WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor' : 1};

The Replication Strategy determines on which nodes replicas are placed. More about Replication Strategies here.

Data replication ensures there is no single point of failure. Replication means storing copies of data on multiple nodes. This means that even if one node goes down, the data will still be available. It ensures reliability and fault tolerance. The number of copies of the data is defined by the Replication Factor.

A replication factor of 3 (RF=3) means that three copies of the data are stored at all times. The example above uses a replication factor of 1, so only a single copy is kept. Depending on the RF a user sets for the keyspace, the coordinator node sends the data to other nodes, called replicas, to create copies of the data for fault tolerance.

More information about high availability, replication, and consistency can be found in this lesson.

To summarize, Keyspaces contain Tables that contain data.

This is also covered in depth on the documentation website.


From this point onwards, the lesson is hands-on. The instructions below run ScyllaDB on a machine with Docker. Alternatively, you can find the hands-on part of this lesson in the Killercoda learning environment here. The Killercoda environment provides an interactive virtual machine where you can execute all the commands directly from your browser without the need to configure anything.

Before starting the cluster, make sure the aio-max-nr value is high enough (1048576 or more). 

This parameter determines the maximum number of concurrent asynchronous I/O (AIO) requests allowed by the Linux kernel. A sufficiently high value helps ScyllaDB perform well under heavy I/O workloads.

Check the value: 

cat /proc/sys/fs/aio-max-nr

If it needs to be changed (run as root):

echo "fs.aio-max-nr = 1048576" >> /etc/sysctl.conf
sysctl -p /etc/sysctl.conf
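
The check and the change above can be combined so that the setting is only touched when it is actually too low. The following is a minimal sketch, not part of the original lesson: the helper name `aio_max_nr_ok` and the variable `REQUIRED` are illustrative, and the threshold is the 1048576 value mentioned above.

```shell
#!/bin/sh
# Minimum value recommended in this lesson.
REQUIRED=1048576

# aio_max_nr_ok CURRENT
# Hypothetical helper: succeeds (exit 0) when CURRENT is a plain
# non-negative integer that is at least REQUIRED.
aio_max_nr_ok() {
    case "$1" in
        ''|*[!0-9]*) return 2 ;;   # empty or not a number
    esac
    [ "$1" -ge "$REQUIRED" ]
}

# Example use (as root, since it may edit /etc/sysctl.conf):
#   if ! aio_max_nr_ok "$(cat /proc/sys/fs/aio-max-nr)"; then
#       echo "fs.aio-max-nr = $REQUIRED" >> /etc/sysctl.conf
#       sysctl -p /etc/sysctl.conf
#   fi
```

Keeping the comparison in a small function makes it possible to check the logic with sample values before running anything that modifies the system.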

Start a single instance and call it scyllaU:

docker run --name scyllaU -d scylladb/scylla:5.2.0

Note that some files might be downloaded in this step. After waiting a few seconds, we’ll verify that the node is up and running with the nodetool status command:

docker exec -it scyllaU nodetool status

The node scyllaU has a UN status. U means up, and N means normal. Read more about Nodetool Status here.
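
Instead of waiting a fixed number of seconds, the status check can be repeated until the node reports UN. The sketch below is one way to do this, under stated assumptions: the helper names `has_un_node` and `wait_for_un`, the default of 24 attempts, and the 5-second delay are illustrative choices, not part of the lesson. It relies on each node line in the nodetool status output starting with the two-letter status code. The `-it` flags are left out because a script has no interactive terminal attached.

```shell
#!/bin/sh
# has_un_node: reads `nodetool status` output on stdin and succeeds
# when at least one line starts with "UN" (Up / Normal).
has_un_node() {
    grep -q '^UN'
}

# wait_for_un CONTAINER [ATTEMPTS] [DELAY_SECONDS]
# Polls nodetool status inside the container until a UN node appears
# or the attempts run out.
wait_for_un() {
    container=$1
    attempts=${2:-24}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if docker exec "$container" nodetool status 2>/dev/null | has_un_node; then
            echo "$container is up (UN)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$container did not reach UN after $attempts attempts" >&2
    return 1
}

# Example use:
#   wait_for_un scyllaU
```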

Finally, we use the CQL Shell to interact with ScyllaDB:

docker exec -it scyllaU cqlsh

The CQL Shell allows us to run Cassandra Query Language commands on ScyllaDB. Now create the Keyspace:

CREATE KEYSPACE Pets_Clinic WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor' : 1};

Now set the newly created keyspace as the context for the following operations:

use Pets_Clinic;
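
To confirm from the host shell (outside cqlsh) that the keyspace was created, the list of keyspaces can be checked for its name. This is a minimal sketch with a hypothetical helper, `keyspace_listed`, which is not part of the lesson. Because CQL folds unquoted identifiers to lowercase, the keyspace created above is stored as pets_clinic, so the helper lowercases the name before searching.

```shell
#!/bin/sh
# keyspace_listed NAME
# Reads `DESCRIBE KEYSPACES` output on stdin and succeeds when NAME
# appears as a whole word. NAME is lowercased first, matching how CQL
# stores unquoted identifiers.
keyspace_listed() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    grep -qwF -- "$name"
}

# Example use, from the host shell (not inside cqlsh):
#   docker exec scyllaU cqlsh -e "DESCRIBE KEYSPACES" | keyspace_listed Pets_Clinic \
#       && echo "keyspace pets_clinic exists"
```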