Change Data Capture (CDC)

S201: Data Modeling and Application Development

Change Data Capture, or CDC, is a feature that lets you query not only the current state of a database table but also the history of recent changes made to it. With CDC, users can build streaming data pipelines that process and analyze data in real time and react immediately to modifications in the database.

Some of the topics covered in this lesson are:

An overview of Change Data Capture: what it is, what it does, some common use cases, and how it works at a high level
How can the data be consumed? The options for consuming data changes include plain CQL, a layered approach, and integrators
How does CDC work under the hood? An example of what happens in the database on different operations to make CDC possible
A summary of CDC: it is easy to integrate and consume, it uses plain CQL tables, it is robust, it is replicated in the same way as normal data, it has a reasonable overhead, and its log does not overflow if the consumer fails to act, because the data is TTL’ed. The summary also includes a comparison with Cassandra, DynamoDB, and MongoDB.
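
To make the idea in the topics above more concrete, here is a minimal in-memory sketch in Python of a base table paired with a change log whose entries expire after a TTL. It is a toy model for illustration only, not how ScyllaDB implements CDC: the names ToyCdcTable, LogEntry, upsert, and changes_since, as well as the default TTL value, are all hypothetical and chosen for this example.

```python
import time
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class LogEntry:
    written_at: float                  # when the change was recorded (seconds)
    operation: str                     # "upsert" or "delete"
    key: str                           # key of the changed row
    values: Optional[Dict[str, Any]]   # changed columns; None for a delete


class ToyCdcTable:
    """Hypothetical model: current rows plus a log of recent changes."""

    def __init__(self, log_ttl_seconds: float = 86400.0) -> None:
        # 86400 s (one day) is an illustrative default for this sketch.
        self.rows: Dict[str, Dict[str, Any]] = {}
        self.log: List[LogEntry] = []
        self.log_ttl_seconds = log_ttl_seconds

    def upsert(self, key: str, values: Dict[str, Any],
               now: Optional[float] = None) -> None:
        """Apply a write to the base table and record it in the log."""
        now = time.time() if now is None else now
        self.rows.setdefault(key, {}).update(values)
        self.log.append(LogEntry(now, "upsert", key, dict(values)))

    def delete(self, key: str, now: Optional[float] = None) -> None:
        """Remove a row from the base table and record the deletion."""
        now = time.time() if now is None else now
        self.rows.pop(key, None)
        self.log.append(LogEntry(now, "delete", key, None))

    def changes_since(self, since: float,
                      now: Optional[float] = None) -> List[LogEntry]:
        """Return unexpired log entries written strictly after `since`."""
        now = time.time() if now is None else now
        self._expire(now)
        return [e for e in self.log if e.written_at > since]

    def _expire(self, now: float) -> None:
        # Drop entries older than the TTL, so the log stays bounded
        # even if no consumer ever reads it.
        cutoff = now - self.log_ttl_seconds
        self.log = [e for e in self.log if e.written_at > cutoff]


# Usage with explicit timestamps so the behavior is deterministic.
table = ToyCdcTable(log_ttl_seconds=60)
table.upsert("sensor-1", {"temp": 20}, now=0)
table.upsert("sensor-1", {"temp": 21}, now=10)
table.delete("sensor-1", now=20)

print(table.rows)  # current state: the row is gone
print([e.operation for e in table.changes_since(since=5, now=30)])
print([e.operation for e in table.changes_since(since=5, now=75)])
```

The sketch separates the two views the lesson describes: `rows` holds only the current state, while `changes_since` exposes the recent history. The last call shows the TTL at work: by time 75 the entry written at time 10 has expired, so a consumer that fell behind sees only the changes still inside the retention window.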

You can read more about CDC in the documentation and this blog post.

It’s also recommended to run the Apache Kafka and ScyllaDB CDC lab after this lesson.