Change Data Capture for a Brand New World - Hannu Valtonen

Discover the latest advancements in Change Data Capture (CDC) for real-time data replication, high scalability, and flexible use cases, including data warehousing, reporting, and data science.

Key takeaways
  • Change Data Capture (CDC) is a way to track data changes in a database and replicate those changes to another location.
  • CDC has been around for a while, with the first solutions appearing in the early 2000s.
  • Historically, changes were propagated by taking a full database dump every night, an approach with several limitations.
  • Modern CDC solutions allow for real-time data replication and can handle high volumes of data.
  • One of the most popular CDC sources is PostgreSQL, where CDC builds on logical decoding of the write-ahead log, with tooling such as the pglogical extension.
  • Postgres CDC can be used to replicate data to a variety of destinations, including Kafka, Kinesis, and Pub/Sub.
  • CDC can be used in a variety of use cases, including data warehousing, real-time reporting, and data science.
  • CDC can also be used to migrate data from one database to another.
  • One of the key benefits of CDC is that changes are replicated as they happen, so downstream systems do not have to wait for the next nightly dump.
  • CDC is also highly scalable and can handle high volumes of data.
  • Limitations of CDC include the need for careful planning and implementation and the potential for high system overhead.
  • CDC is a powerful tool that can be used in a variety of different ways, depending on the specific needs of the user.
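
The core idea in the takeaways above, tracking row-level changes and replaying them in another location, can be illustrated with a minimal Python sketch. It is not taken from the talk and does not reflect the wire format of PostgreSQL, Kafka, or any other tool: the `ChangeEvent` shape and the `apply_change` and `replicate` helpers are hypothetical names chosen for illustration, and the "replica" is just an in-memory dictionary keyed by primary key.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change. Hypothetical shape, not any tool's real format."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row contents; None for deletes


def apply_change(replica: Dict[int, Dict[str, Any]], event: ChangeEvent) -> None:
    """Apply a single change event to an in-memory replica."""
    if event.op in ("insert", "update"):
        if event.row is None:
            raise ValueError(f"{event.op} event for key {event.key} has no row")
        # Treat inserts and updates as upserts so replaying an event is harmless.
        replica[event.key] = dict(event.row)
    elif event.op == "delete":
        # Ignore deletes for rows that are already gone (for example on replay).
        replica.pop(event.key, None)
    else:
        raise ValueError(f"unknown operation: {event.op!r}")


def replicate(events: Iterable[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Build a replica by applying change events in the order they occurred."""
    replica: Dict[int, Dict[str, Any]] = {}
    for event in events:
        apply_change(replica, event)
    return replica


if __name__ == "__main__":
    stream = [
        ChangeEvent("insert", 1, {"name": "alice"}),
        ChangeEvent("insert", 2, {"name": "bob"}),
        ChangeEvent("update", 1, {"name": "alice smith"}),
        ChangeEvent("delete", 2),
    ]
    print(replicate(stream))
```

Unlike a nightly dump, the replica here only ever processes the rows that changed. Treating inserts and updates as upserts and tolerating repeated deletes is one possible design choice; it keeps replay safe if the same event happens to be delivered more than once.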