
Kafka and Scylla

This lesson introduces Apache Kafka and covers some basic concepts. Kafka is an open-source distributed event streaming platform. It allows you to:

  • Ingest data from many different systems, such as databases, your services, or other software applications
  • Store it for future reads
  • Process and transform the incoming streams in real time
  • Consume the stored data stream
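Under the hood, Kafka stores each record as an opaque key/value pair of bytes; it is the producer and consumer (via converters such as the JSON or Avro converter) that serialize and deserialize the data. A minimal sketch of that round trip, using only the Python standard library (no broker or Kafka client involved; the event fields are made up for illustration):

```python
import json

# An event your service might emit (illustrative fields).
event = {"user_id": 42, "action": "login"}

# What a producer would send: Kafka sees only bytes.
value_bytes = json.dumps(event).encode("utf-8")
key_bytes = str(event["user_id"]).encode("utf-8")  # the key drives partitioning

# What a consumer (or a sink connector's converter) would do on read.
decoded = json.loads(value_bytes.decode("utf-8"))
assert decoded == event
```

The same converter must be configured on both ends, which is why the connectors below advertise which formats (Avro, JSON) they understand.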

Some common use cases for Kafka are:

  • Message broker (similar to RabbitMQ and others)
  • “Glue” between different services in your system
  • Replication of data between databases/services
  • Real-time analysis of data (e.g., for fraud detection)

The Scylla Sink Connector is a Kafka Connect connector that reads messages from a Kafka topic and inserts them into Scylla. It supports multiple data formats (Avro and JSON), scales across many Kafka Connect nodes, provides at-least-once semantics, and periodically saves its current offset in Kafka.
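Like any Kafka Connect connector, the sink is configured with a JSON payload submitted to the Connect REST API. A hypothetical sketch of such a payload follows; the class name and property keys (e.g., `scylladb.contact.points`) are illustrative assumptions and should be checked against the connector documentation for the version you deploy:

```python
import json

# Hypothetical Scylla Sink Connector configuration; the connector class
# and "scylladb.*" property names are assumptions for illustration.
sink_config = {
    "name": "scylla-sink",
    "config": {
        "connector.class": "io.connect.scylladb.ScyllaDbSinkConnector",
        "tasks.max": "2",                        # parallelism across Connect workers
        "topics": "user-events",                 # Kafka topic(s) to drain into Scylla
        "scylladb.contact.points": "scylla-node1",
        "scylladb.keyspace": "demo",
    },
}

# This is the payload you would POST to the Kafka Connect REST API,
# e.g. POST http://localhost:8083/connectors
payload = json.dumps(sink_config, indent=2)
```

Raising `tasks.max` is how the connector scales: Connect distributes the tasks (and hence topic partitions) across the available worker nodes.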

The lesson also provides a brief overview of CDC. To learn more about CDC, check out this lesson.

The Scylla CDC Source Connector is a Kafka Connect connector that reads messages from a Scylla table (with Scylla CDC enabled) and writes them to a Kafka topic. It works seamlessly with standard Kafka converters (JSON, Avro). The connector can scale horizontally across many Kafka Connect nodes. Scylla CDC Source Connector has at-least-once semantics.
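The source connector is configured the same way, pointing at the Scylla cluster instead of a topic. Again a hypothetical sketch; the class name and `scylla.*` property keys are assumptions to verify against the connector's documentation:

```python
import json

# Hypothetical Scylla CDC Source Connector configuration; property
# names and the connector class are illustrative assumptions.
source_config = {
    "name": "scylla-cdc-source",
    "config": {
        "connector.class": "com.scylladb.cdc.debezium.connector.ScyllaConnector",
        "scylla.cluster.ip.addresses": "scylla-node1:9042",
        "scylla.name": "demo-cluster",       # logical name used to prefix topic names
        "scylla.table.names": "demo.users",  # CDC-enabled table(s) to stream
    },
}

# Submitted to the Kafka Connect REST API like any other connector.
payload = json.dumps(source_config, indent=2)
```

With this in place, every change captured in the table's CDC log is published as a message on a Kafka topic, ready for standard JSON or Avro converters downstream.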

The lesson includes demos for quickly starting Kafka, using the Scylla Sink Connector, viewing changes on a table with CDC enabled, and downloading, installing, configuring, and using the Scylla CDC Source Connector.

Also, check out the documentation, the blog post Introducing the Kafka Scylla Connector, the Scylla Sink Connector GitHub project, and the Scylla CDC Source Connector GitHub project.