Change Data Capture (CDC) is a feature that lets you query not only the current state of a database table but also the history of recent changes made to it. With CDC, users can build streaming data pipelines that process and analyze data in real time and react immediately to modifications in the database.
Some of the topics covered in this lesson are:
- An overview of Change Data Capture: what it is, what it does, some common use cases, and how it works
- How the data can be consumed: the options for consuming data changes, including normal CQL, a layered approach, and integrators
- How CDC works under the hood: an example of what happens in the database on different operations to make CDC possible
- A summary of CDC: it is easy to integrate and consume, it uses plain CQL tables, it is robust, it is replicated in the same way as normal data, it has a reasonable overhead, and it does not overflow if the consumer fails to act, because the data is TTL’ed. The summary also includes a comparison with Cassandra, DynamoDB, and MongoDB.
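To make the ideas in the list above more concrete, here is a minimal, self-contained Python sketch of a change log that sits next to a table. It is only a toy model of the concept and not how Scylla implements CDC: the names `ToyCdcTable`, `ChangeRecord`, `read_changes`, and `ttl_seconds` are hypothetical and chosen for illustration. It shows three of the points from the summary: every write also appends a row to a log, a consumer reads that log with an ordinary query, and old log entries expire so the log does not grow without bound when nobody consumes it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ChangeRecord:
    """One entry in the toy change log (field names are illustrative)."""
    timestamp: float        # time the change was written
    operation: str          # "update" or "delete"
    key: str                # primary key of the affected row
    value: Optional[Any]    # new value, or None for a delete


@dataclass
class ToyCdcTable:
    """A base table plus a change log whose entries expire after ttl_seconds."""
    ttl_seconds: float = 60.0  # arbitrary value for the example
    rows: Dict[str, Any] = field(default_factory=dict)
    log: List[ChangeRecord] = field(default_factory=list)

    def write(self, key: str, value: Any, now: float) -> None:
        # Apply the change to the base table and record it in the log.
        self.rows[key] = value
        self.log.append(ChangeRecord(now, "update", key, value))

    def delete(self, key: str, now: float) -> None:
        # Deletes are logged too, so a consumer can react to them.
        self.rows.pop(key, None)
        self.log.append(ChangeRecord(now, "delete", key, None))

    def read_changes(self, since: float, now: float) -> List[ChangeRecord]:
        """Return unexpired changes written after `since`, oldest first."""
        self._expire(now)
        return [c for c in self.log if c.timestamp > since]

    def _expire(self, now: float) -> None:
        # Drop entries older than the TTL, consumed or not.
        cutoff = now - self.ttl_seconds
        self.log = [c for c in self.log if c.timestamp > cutoff]


if __name__ == "__main__":
    table = ToyCdcTable(ttl_seconds=60.0)
    table.write("sensor-1", 20.5, now=0.0)
    table.write("sensor-1", 21.0, now=10.0)
    table.delete("sensor-1", now=20.0)
    for change in table.read_changes(since=0.0, now=30.0):
        print(change.timestamp, change.operation, change.key, change.value)
```

A consumer in this sketch remembers the timestamp of the last change it processed and passes it as `since` on the next call. If the consumer stops polling, entries simply age out after `ttl_seconds`, which mirrors the "does not overflow" point in the summary.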
You can read more about CDC in the documentation and this blog post.
Important Note: as of Scylla Open Source release 4.1, CDC is experimental. Although it is functionally complete, we are still testing CDC to validate that it is production-ready ahead of GA in a later Scylla 4.x release. No API changes are expected.