
Lab – Migrating into ScyllaDB Cloud

13 min to complete

Using SSTableLoader to migrate from Cassandra into ScyllaDB

Note: throughout this lesson, we use the variable scylla_ip, which holds the IP address of your ScyllaDB Cloud instance. For more information on getting started with ScyllaDB Cloud, see the Getting Started with ScyllaDB Cloud lab.
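
Since every later command expands ${scylla_ip}, it can help to set and sanity-check the variable once before starting. The following is a minimal sketch: the address 203.0.113.10 is a placeholder from the documentation address range, and the is_ipv4 helper is a hypothetical name introduced here, not part of any ScyllaDB tool.

```shell
# Placeholder address (TEST-NET-3 documentation range); replace it with the
# IP of your own ScyllaDB Cloud instance.
scylla_ip="${scylla_ip:-203.0.113.10}"
export scylla_ip

# Hypothetical helper: succeeds only if the value looks like a dotted IPv4
# address. It checks the shape only, not whether the host is reachable.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

if is_ipv4 "$scylla_ip"; then
  echo "scylla_ip is set to $scylla_ip"
else
  echo "scylla_ip does not look like an IPv4 address: $scylla_ip" >&2
fi
```

The variable has to be set in the same shell session that runs the later cqlsh and sstableloader commands.
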
Create a docker-compose.yaml file with the following contents:

Access the Ubuntu container shell to install the required tools:

docker exec -it ubuntu-loader bash

Prepare the Ubuntu loader instance with the following commands:

apt update; apt install -y gnupg curl
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 5e08fbd8b5d6ec9c
curl -L --output /etc/apt/sources.list.d/scylla.list http://downloads.scylladb.com/deb/ubuntu/scylla-2021.1.list
apt update; apt install -y scylla-enterprise-tools

Then exit the container with:

exit

Create a file named ‘data/shared/schema.cql’ with the following contents:

Create a file named ‘data/shared/cass/write.cql’ with the following contents:

Create the schema on Cassandra:

docker-compose exec cassandra-migration-node1 cqlsh -f /shared/schema.cql

Write data onto Cassandra:

docker-compose exec cassandra-migration-node1 cqlsh -f /shared/cass/write.cql

Snapshot the “data” keyspace on Cassandra:

docker-compose exec cassandra-migration-node1 nodetool snapshot -t ScyllaDBMigration data;

Copy the snapshot files from the Cassandra data directory into the shared directory:

docker-compose exec cassandra-migration-node1 find /var/lib/cassandra/data/data/ -name snapshots -exec cp -rp '{}' /shared/cass/ \;
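
Before moving on, it may be worth confirming that the copy actually produced SSTable files. The sketch below counts the files ending in -Data.db under the copied snapshot directory. It assumes the host directory data/shared is what the containers see as /shared, which this lab implies but does not state, and count_sstables is a hypothetical helper name.

```shell
# Hypothetical helper: count SSTable data files (names ending in -Data.db)
# anywhere under the directory given as $1.
count_sstables() {
  find "$1" -type f -name '*-Data.db' | wc -l | tr -d ' '
}

# Assumed host-side location of the copied snapshots.
snapshot_dir="data/shared/cass/snapshots"

if [ -d "$snapshot_dir" ]; then
  echo "Found $(count_sstables "$snapshot_dir") SSTable data file(s) in $snapshot_dir"
else
  echo "No snapshot directory at $snapshot_dir yet" >&2
fi
```

A count of zero would suggest that the snapshot or the copy step did not run as expected.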

Log into ScyllaDB and inspect the keyspace and table. They should not exist yet:

cqlsh ${scylla_ip}
DESC data;
SELECT COUNT(*) FROM data.tbl;

Create the schema in ScyllaDB:

cqlsh -f data/shared/schema.cql ${scylla_ip}

Link the data directory following the keyspace/table format:

docker-compose exec ubuntu-migration-processor bash -c 'ln -s /shared/cass/snapshots /tmp/data/tbl'
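
sstableloader takes the keyspace and table names from the last two components of the directory path, which is why the link is created at /tmp/data/tbl. As a rough illustration, the hypothetical helper below (not part of sstableloader) prints the keyspace.table pair that a given path would map to:

```shell
# Hypothetical helper: derive "keyspace.table" from the last two components
# of a directory path, mirroring how the path is interpreted.
keyspace_table_from_path() {
  path=${1%/}                                  # drop one trailing slash, if any
  table=$(basename "$path")                    # last component is the table
  keyspace=$(basename "$(dirname "$path")")    # its parent is the keyspace
  printf '%s.%s\n' "$keyspace" "$table"
}

keyspace_table_from_path /tmp/data/tbl   # prints data.tbl
```

If the printed name does not match the table created from schema.cql, the directory layout needs adjusting before running the loader.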

Finally, run the sstableloader command, passing the ScyllaDB IP and the directory that contains the data:

docker-compose exec ubuntu-migration-processor sstableloader --nodes ${scylla_ip} /tmp/data/tbl

After the load completes, run a count query against ScyllaDB:

cqlsh ${scylla_ip} -e 'SELECT COUNT(*) FROM data.tbl;'
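
To check that the migration was complete, the count on ScyllaDB can be compared with the count on Cassandra. The sketch below is one possible way to do that: extract_count is a hypothetical helper that pulls the first all-digit line out of cqlsh's tabular output, and the sample output is illustrative rather than taken from a real run.

```shell
# Hypothetical helper: read cqlsh output on stdin and print the first line
# that consists only of digits (the COUNT(*) value), with whitespace removed.
extract_count() {
  awk '/^[ \t\r]*[0-9]+[ \t\r]*$/ { gsub(/[ \t\r]/, ""); print; exit }'
}

# Illustrative cqlsh output, not captured from a real cluster.
sample_output='
 count
-------
    10

(1 rows)'

printf '%s\n' "$sample_output" | extract_count   # prints 10

# Usage with the lab containers (requires a reachable ${scylla_ip}):
#   source_count=$(docker-compose exec -T cassandra-migration-node1 \
#     cqlsh -e 'SELECT COUNT(*) FROM data.tbl;' | extract_count)
#   target_count=$(cqlsh "${scylla_ip}" -e 'SELECT COUNT(*) FROM data.tbl;' | extract_count)
#   [ "$source_count" = "$target_count" ] && echo "row counts match: $source_count"
```

The -T flag disables pseudo-TTY allocation so that the output can be piped. Matching counts are a quick sanity check, not proof that every row was copied correctly.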

Using ScyllaDB Migrator to migrate from Amazon DynamoDB into ScyllaDB Alternator

Create the JSON file ‘batch-write-items.json’ for later writing into DynamoDB:

Create the table on the DynamoDB instance:

aws dynamodb create-table --table-name migtest --attribute-definitions AttributeName=City,AttributeType=S AttributeName=Date,AttributeType=S --key-schema AttributeName=City,KeyType=HASH AttributeName=Date,KeyType=RANGE --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 --endpoint-url http://172.98.0.5:8000

Load items from the ‘batch-write-items.json’ file into DynamoDB:

aws dynamodb batch-write-item --request-items file://batch-write-items.json --endpoint-url http://172.98.0.5:8000

You can query DynamoDB for a single item to confirm it was stored:

aws dynamodb scan --table-name migtest --filter-expression "City = :name" --expression-attribute-values '{":name":{"S":"New York"}}' --endpoint-url http://172.98.0.5:8000
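
Besides looking up a single item, it can be useful to record how many items the source table holds, so that the number can be compared after the migration. The sketch below wraps a Scan with --select COUNT in a hypothetical helper named dynamodb_item_count. It assumes the table is small enough for the scan to return in a single page, as in this lab; a larger table may print one count per page.

```shell
# Hypothetical helper: print the item count that a Scan reports for a table.
#   $1 = table name
#   $2 = endpoint URL of the DynamoDB-compatible service
dynamodb_item_count() {
  aws dynamodb scan \
    --table-name "$1" \
    --select COUNT \
    --query 'Count' \
    --output text \
    --endpoint-url "$2"
}

# Usage against the DynamoDB endpoint used in this lab:
#   dynamodb_item_count migtest http://172.98.0.5:8000
```

Whether the same call works against an Alternator endpoint depends on the ScyllaDB version and is not verified here.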

Now query ScyllaDB to confirm the data is not there yet:

docker exec -it container_scylla-migration-node1_1 cqlsh -e 'SELECT * FROM alternator_migtest.migtest;'

Get the latest version of ScyllaDB Migrator from the repo.

Edit the config.yaml configuration file to match your DynamoDB source and Alternator target.

Build and configure ScyllaDB Migrator following the steps on the project’s page.

Then start the migrator job pointing to the configuration file:

docker-compose exec spark-master /spark/bin/spark-submit --class com.scylladb.migrator.Migrator --master spark://spark-master:7077 --conf spark.driver.host=spark-master --conf spark.scylla.config=/app/config.yaml /jars/scylla-migrator-assembly-0.0.1.jar

After the migration job is finished, query the data in ScyllaDB:

docker exec -it container_scylla-migration-node1_1 cqlsh -e 'SELECT * FROM alternator_migtest.migtest;'

 
