Setup details
The setup we used on AWS Cloud:
- 3 x Scylla Nodes – i3.4xlarge
- 1 x Monitor Node – t3.large
- 8 x Loader Nodes – c4.2xlarge
Prerequisites
- The monitor node should run the ScyllaDB Monitoring Stack.
- Loaders should have 'cassandra-stress' installed, preferably a build that uses the ScyllaDB drivers.
- Nodes should have authentication and authorization enabled (don’t forget to change the system_auth replication factor to 3).
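The replication factor change mentioned in the last prerequisite could look roughly like the sketch below. It only builds the CQL statement; the datacenter name `us-east` is a placeholder (take the real one from `nodetool status`), and the credentials and node address in the comments are assumptions, not values from this lab.

```shell
# Build the CQL statement that sets the system_auth replication factor.
# $1 = datacenter name (placeholder here), $2 = replication factor
system_auth_rf_cql() {
  printf "ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', '%s': %s};\n" "$1" "$2"
}

# Print the statement for review before applying it.
system_auth_rf_cql "us-east" 3

# To apply it (assumed default superuser credentials, hypothetical node address):
#   cqlsh -u cassandra -p cassandra -e "$(system_auth_rf_cql us-east 3)" <node-ip>
# After changing the replication factor, repair the keyspace on each node:
#   nodetool repair system_auth
```

Printing the statement first makes it easy to check the datacenter name before running anything against the cluster.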
Configure the ScyllaDB node
- Create the roles and service levels and attach them via cqlsh:
- CREATE ROLE OLTP1 WITH password='OLTP1' and superuser = true;
- CREATE ROLE OLTP2 WITH password='OLTP2' and superuser = true;
- CREATE ROLE OLAP WITH password='OLAP' and superuser = true;
- CREATE SERVICE LEVEL OLTP1 WITH workload_type='interactive' AND timeout = 20ms AND shares = 600;
- CREATE SERVICE LEVEL OLTP2 WITH workload_type='interactive' AND timeout = 2s AND shares = 300;
- CREATE SERVICE LEVEL OLAP WITH workload_type='batch' AND timeout = 20s AND shares = 100;
- Wait for 10 seconds
- ATTACH SERVICE LEVEL OLTP1 TO OLTP1;
- ATTACH SERVICE LEVEL OLTP2 TO OLTP2;
- ATTACH SERVICE LEVEL OLAP TO OLAP;
- The roles are created as superusers only to avoid having to grant each of them access to the table.
Preloading The Cluster With 1TB
First, define a variable:
- export NODES='<comma-delimited list of node IPs>'
Run the following cassandra-stress commands (each on a different loader):
- cassandra-stress write no-warmup cl=ALL n=250000000 -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native -rate threads=200 -col 'size=FIXED(1024) n=FIXED(1)' -pop seq=1..250000000 -node $NODES
- cassandra-stress write no-warmup cl=ALL n=250000000 -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native -rate threads=200 -col 'size=FIXED(1024) n=FIXED(1)' -pop seq=250000001..500000000 -node $NODES
- cassandra-stress write no-warmup cl=ALL n=250000000 -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native -rate threads=200 -col 'size=FIXED(1024) n=FIXED(1)' -pop seq=500000001..750000000 -node $NODES
- cassandra-stress write no-warmup cl=ALL n=250000000 -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native -rate threads=200 -col 'size=FIXED(1024) n=FIXED(1)' -pop seq=750000001..1000000000 -node $NODES
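The four `-pop seq=` ranges above are equal, contiguous slices of the key space. As a minimal sketch (the helper name `seq_ranges` is hypothetical, and it assumes the key count divides evenly by the number of preload commands), they can be derived like this, along with the raw payload size behind the "1TB" figure:

```shell
# Split keys 1..TOTAL into equal, contiguous, non-overlapping slices.
# $1 = total number of keys, $2 = number of preload commands
seq_ranges() {
  total=$1
  slices=$2
  per=$(( $total / $slices ))
  i=0
  while [ "$i" -lt "$slices" ]; do
    start=$(( $i * $per + 1 ))
    end=$(( ($i + 1) * $per ))
    echo "-pop seq=${start}..${end}"
    i=$(( $i + 1 ))
  done
}

# One line per preload command.
seq_ranges 1000000000 4

# Raw payload: 1e9 rows x 1024 bytes each, before replication and storage overhead.
echo "$(( 1000000000 * 1024 )) bytes"
```

Each slice is written by exactly one loader, so no key is written twice and none is skipped.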
Running the test itself
Run each of the following commands on every loader:
- OLTP1: cassandra-stress mixed no-warmup cl=QUORUM duration=180m -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native user='OLTP1' password='OLTP1' -rate 'threads=200 throttle=11000/s' -col 'size=FIXED(1024) n=FIXED(1)' -pop 'dist=gauss(1..1000000000,500000,50000)' -errors ignore -node $NODES
- OLTP2: cassandra-stress mixed no-warmup cl=QUORUM duration=180m -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native user='OLTP2' password='OLTP2' -rate 'threads=200 throttle=11000/s' -col 'size=FIXED(1024) n=FIXED(1)' -pop 'dist=gauss(1..1000000000,500000,50000)' -errors ignore -node $NODES
- OLAP: cassandra-stress mixed no-warmup cl=QUORUM duration=180m -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native user='OLAP' password='OLAP' -rate 'threads=40' -col 'size=FIXED(1024) n=FIXED(1)' -pop 'dist=gauss(1..1000000000,500000000,500000000)' -errors ignore -node $NODES
Explanation for the stress command parameters:
- The two OLTP commands use a high thread count, meaning high concurrency. They also draw keys from a fairly narrow distribution because interactive loads typically follow a trend: many users request roughly the same data. The request rate is throttled to a constant value so that it is independent of the concurrency. With the commands running on all 8 loaders, each OLTP load has a concurrency of 1600 (8 x 200 threads) and a rate of 88K ops/s (8 x 11,000/s).
- The OLAP command uses a low thread count, meaning bounded concurrency. It draws keys from a much wider distribution because analytical computations typically consume a large portion of the data, and it is not rate-limited. With 40 threads on each of the 8 loaders, the OLAP workload has a concurrency of 320.
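The aggregate figures follow from multiplying the per-loader settings by the number of loaders. The sketch below works through that arithmetic, assuming every command runs on all 8 loaders; it also computes the three-standard-deviation key window for each `gauss(...)` spec to show how narrow the OLTP distribution is compared with the OLAP one. The helper names are hypothetical.

```shell
# Number of loader nodes, each running all three commands.
LOADERS=8

# $1 = per-loader value (threads or ops/s); prints the cluster-wide total.
aggregate() {
  echo $(( $1 * LOADERS ))
}

# $1 = mean, $2 = stdev; prints the mean +/- 3 stdev window of a gauss() spec.
three_sigma_range() {
  echo "$(( $1 - 3 * $2 ))..$(( $1 + 3 * $2 ))"
}

echo "OLTP concurrency per workload: $(aggregate 200)"     # 8 x 200 threads
echo "OLTP rate per workload (ops/s): $(aggregate 11000)"  # 8 x 11000/s throttle
echo "OLAP concurrency: $(aggregate 40)"                   # 8 x 40 threads

# OLTP: roughly 300K hot keys out of 1e9.
echo "OLTP key window: $(three_sigma_range 500000 50000)"
# OLAP: the window is wider than the whole 1..1000000000 key space.
echo "OLAP key window: $(three_sigma_range 500000000 500000000)"
```

Under these assumptions the OLTP loads concentrate on about 0.03% of the keys, while the OLAP load spreads across the whole data set.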
Monitoring
To restore the monitoring shown in the demo, start a ScyllaDB Monitoring Stack from the master branch, then open the monitoring UI and go to the “Overview” dashboard. Click the gear icon (settings) and select JSON in the left pane. Replace its content with the content of the file below.
Then click save and refresh the page to get the demo’s dashboard instead of the Overview dashboard.