This lesson introduces Apache Kafka and covers some basic concepts. Apache Kafka is an open-source distributed event streaming platform. It allows you to:
- Ingest data from a multitude of different systems, such as databases, your services, or other software applications
- Store that data for future reads
- Process and transform the incoming streams in real-time
- Consume the stored data stream
Some common use cases for Kafka are:
- Message broker (similar to RabbitMQ and others)
- “Glue” between different services in your system
- Replication of data between databases/services
- Real-time analysis of data (e.g., for fraud detection)
The Scylla Sink Connector is a Kafka Connect connector that reads messages from a Kafka topic and inserts them into Scylla. It supports different data formats (Avro, JSON).
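The connector can scale across many Kafka Connect nodes. It has at-least-once semantics: it periodically saves its current offset in Kafka, so after a restart it resumes from the last saved offset and may insert some messages again.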
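To make the at-least-once behavior concrete, here is a minimal sketch in plain Python. It is a toy model, not the connector's actual code: the `run_sink` function, the in-memory `log` list, and the `table` dictionary are all hypothetical stand-ins for a Kafka topic partition and a Scylla table. It shows that when a task crashes between two offset saves, the records written after the last saved offset are written again on restart, and that this is harmless when writes are upserts by key.

```python
def run_sink(log, table, committed_offset, commit_every, crash_after=None):
    """Toy sink task (hypothetical): upsert records from `log` into `table`.

    The offset is saved every `commit_every` records. If `crash_after` is set,
    the task stops after that many writes without saving its latest position.
    Returns (saved_offset, writes_performed).
    """
    position = committed_offset
    writes = 0
    while position < len(log):
        key, value = log[position]
        table[key] = value  # upsert: repeating the same write changes nothing
        position += 1
        writes += 1
        if crash_after is not None and writes == crash_after:
            # Simulated crash: progress since the last saved offset is lost.
            return committed_offset, writes
        if position % commit_every == 0:
            committed_offset = position  # periodic offset save
    return position, writes  # clean stop saves the final position


if __name__ == "__main__":
    log = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]
    table = {}

    # First run crashes after 3 writes; only offset 2 had been saved.
    saved, writes_1 = run_sink(log, table, 0, commit_every=2, crash_after=3)
    # Second run resumes from the saved offset and re-writes record "c".
    saved, writes_2 = run_sink(log, table, saved, commit_every=2)

    print("total writes:", writes_1 + writes_2, "for", len(log), "records")
    print("table:", table)
```

In this sketch the record at offset 2 is delivered twice, yet the final table holds each key exactly once because the repeated write overwrites the same row.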
The lesson also provides a brief overview of Change Data Capture (CDC). To learn more about CDC, check out this lesson.
The Scylla CDC Source Connector is a Kafka Connect connector that reads changes from a Scylla table (with Scylla CDC enabled) and writes them as messages to a Kafka topic. It works seamlessly with standard Kafka converters (JSON, Avro). The connector can scale horizontally across many Kafka Connect nodes. Like the Sink Connector, it has at-least-once semantics.
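As a rough illustration of what configuring the connector involves, the sketch below builds the JSON body that Kafka Connect's REST API accepts when a connector is created (a `POST` to `/connectors` with a `name` and a `config` map). The helper `cdc_source_connector_payload` is hypothetical, and the Scylla-specific property names and the connector class are assumptions that should be checked against the connector's own documentation before use. `tasks.max` and the converter settings are standard Kafka Connect options.

```python
import json


def cdc_source_connector_payload(name, logical_name, scylla_hosts, tables, tasks_max=1):
    """Build a Kafka Connect 'create connector' request body (hypothetical helper).

    The Scylla-specific keys below are assumptions; verify them against the
    Scylla CDC Source Connector documentation.
    """
    config = {
        # Assumed connector class name.
        "connector.class": "com.scylladb.cdc.debezium.connector.ScyllaConnector",
        # Assumed Scylla-specific properties.
        "scylla.name": logical_name,
        "scylla.cluster.ip.addresses": ",".join(scylla_hosts),
        "scylla.table.names": ",".join(tables),
        # Standard Kafka Connect settings; values are strings in the REST API.
        "tasks.max": str(tasks_max),
        "key.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    }
    return {"name": name, "config": config}


if __name__ == "__main__":
    payload = cdc_source_connector_payload(
        name="scylla-cdc-demo",            # hypothetical connector name
        logical_name="demo_cluster",       # hypothetical logical name
        scylla_hosts=["127.0.0.1:9042"],   # hypothetical Scylla node address
        tables=["ks.orders"],              # hypothetical keyspace.table with CDC enabled
    )
    # This JSON would be sent to a Kafka Connect worker's /connectors endpoint.
    print(json.dumps(payload, indent=2))
```

The sketch only prints the payload, so it can be inspected and adjusted before it is submitted to a running Kafka Connect worker.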
The lesson includes demos for quickly starting Kafka, using the Scylla Sink Connector, viewing changes on a table with CDC enabled, and downloading, installing, configuring, and using the Scylla CDC Source Connector.