Datacenter

A datacenter is a physical facility that houses servers, storage, and networking equipment. Cloud providers operate many datacenters in countries around the world.

ScyllaDB is designed not just to be fault tolerant in software, but also to understand your datacenter topology. For example, you may wish to place replica nodes on different racks in a datacenter to minimize the risk of a rack-specific network or power outage. ScyllaDB does this using snitches, which tell it which datacenter and rack each node belongs to.

A rack is a metal frame used to hold hardware devices such as servers, hard disks, networking equipment, and other electronic equipment. ScyllaDB is aware of logical racks so that it can provide an additional layer of fault tolerance in case one rack (or datacenter) malfunctions.
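As a rough illustration of how a node can be told its place in the topology: with the GossipingPropertyFileSnitch, each node reads its datacenter and rack names from a cassandra-rackdc.properties file. The sketch below writes such a file to a temporary path. The helper name write_rackdc and the rack name RACK1 are assumptions made for illustration; on a real node the file normally lives under /etc/scylla/.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal cassandra-rackdc.properties file.
# Usage: write_rackdc <datacenter> <rack> <output-file>
write_rackdc() {
    printf 'dc=%s\nrack=%s\n' "$1" "$2" > "$3"
}

# Example: describe a node in datacenter DC1, rack RACK1.
# The rack name and the temporary output path are assumed for this sketch.
example_file=$(mktemp)
write_rackdc DC1 RACK1 "$example_file"
cat "$example_file"
rm -f "$example_file"
```

Nodes that share a dc value are treated as one datacenter, and nodes that share both dc and rack values are treated as one rack.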

ScyllaDB also offers multi-datacenter replication. This means that two (or more) datacenters can replicate data between one another. Your replication strategy lets you define how many datacenters to replicate across and which replication factor to use in each datacenter.

Datacenters are generally configured as peer-to-peer, meaning there is no central authority, nor are there “primary/replica” hierarchical relationships between them.

Once configured, data can be replicated between your datacenters to serve local traffic with the lowest latency and highest throughput, or to provide resiliency if an entire datacenter goes down.

Hands-on: Set up a multi-DC cluster

Now that we understand the basics of multi-DC, we will set up a multi-datacenter cluster. There will be two datacenters, DC1 and DC2, each with three nodes. We’ll have one keyspace with a different replication factor defined for each datacenter: with “DC1:3, DC2:2”, ScyllaDB will store three copies of the data in DC1 and two copies in DC2.

Before starting the cluster, make sure the aio-max-nr value is high enough (1048576 or more). 

This parameter sets the maximum number of concurrent asynchronous I/O (AIO) requests that the Linux kernel allows system-wide. A high value helps ScyllaDB perform well under heavy I/O workloads.

Check the value: 

cat /proc/sys/fs/aio-max-nr

If it is lower than 1048576, raise it (these commands require root privileges):

echo "fs.aio-max-nr = 1048576" >> /etc/sysctl.conf
sysctl -p /etc/sysctl.conf
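The check above can also be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper name aio_value_ok is hypothetical, and the threshold is the 1048576 value mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first argument is at least the second.
aio_value_ok() {
    [ "$1" -ge "$2" ]
}

required=1048576
# Fall back to 0 if the file cannot be read (for example, on a non-Linux host).
current=$(cat /proc/sys/fs/aio-max-nr 2>/dev/null || echo 0)

if aio_value_ok "$current" "$required"; then
    echo "fs.aio-max-nr is $current: OK"
else
    echo "fs.aio-max-nr is $current: raise it to at least $required"
fi
```

The script only reports the current state; it changes nothing, so it is safe to run without root privileges.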

Let’s spin up our ScyllaDB cluster and get started. Start by creating the first datacenter. The following Git repository contains everything needed to set it up automatically:

git clone https://github.com/scylladb/scylla-code-samples.git
cd scylla-code-samples/mms

Now the containers can be built and started:

docker-compose up -d

After roughly 60 seconds, the first datacenter, DC1, will be created.

The second datacenter will be referred to as DC2. To bring up the second datacenter, simply run the docker-compose utility and reference the docker-compose-dc2.yml file:

docker-compose -f docker-compose-dc2.yml up -d

After about 60 seconds, you should be able to see DC1 and DC2 when running the “nodetool status” command:

docker exec -it scylla-node1 nodetool status
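If you would rather check the result from a script than read the table by eye, the status output can be summarized per datacenter. This is a sketch that assumes the usual nodetool status layout, where each datacenter section begins with a "Datacenter:" line and each node line begins with a two-letter state such as UN (Up/Normal). The helper name count_up_nodes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `nodetool status` output on stdin and print
# "<datacenter> <count>" for the nodes whose state is UN (Up/Normal).
# A datacenter with no UN nodes does not appear in the output.
count_up_nodes() {
    awk '
        /^Datacenter:/ { dc = $2 }
        /^UN/          { up[dc]++ }
        END            { for (d in up) print d, up[d] }
    ' | sort
}

# Usage (requires the running cluster from the steps above; -it is omitted
# because the output is piped rather than shown on a terminal):
#   docker exec scylla-node1 nodetool status | count_up_nodes
```

For the cluster described here, you would expect three nodes to be reported for each of DC1 and DC2 once all of them have finished joining.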

The multi-DC cluster is up and running, and we can now read data from it and write data to it.
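A keyspace with the “DC1:3, DC2:2” replication described earlier could be created along the following lines. This is a sketch rather than part of the lesson's repository: the keyspace name demo_ks and the helper keyspace_cql are assumptions, and it relies on cqlsh being available inside the scylla-node1 container.

```shell
#!/bin/sh
# Hypothetical helper: print a CREATE KEYSPACE statement for the given
# keyspace name, with three replicas in DC1 and two replicas in DC2.
keyspace_cql() {
    printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 2};\n" "$1"
}

# Print the statement for an assumed keyspace name, demo_ks.
keyspace_cql demo_ks

# To run it against the cluster (requires the containers started above):
#   docker exec scylla-node1 cqlsh -e "$(keyspace_cql demo_ks)"
```

NetworkTopologyStrategy is the replication strategy that accepts a separate replication factor per datacenter, which is why the datacenter names appear as keys in the replication map.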
