Change data capture (CDC) via Debezium is liberation for your data: by capturing changes from the database's log files, it enables a wide range of use cases, such as reliable data exchange between microservices, the creation of audit logs, cache invalidation, and much more.
In this talk we're taking CDC to the next level by exploring the benefits of integrating Debezium with streaming queries via Kafka Streams. Come and join us to learn:
* How to run low-latency, time-windowed queries on your operational data
* How to enrich audit logs with application-provided metadata
* How to materialize aggregate views based on multiple change data streams, ensuring transactional boundaries of the source database
We'll also show how to leverage the Quarkus stack for running your Kafka Streams applications on the JVM as well as natively via GraalVM. Many goodies are included, such as live coding for instant feedback during development, health checks, metrics, and more.
Change Data Capture Pipelines with Debezium and Kafka Streams (Gunnar Morling, Red Hat) Kafka Summit 2020
1. Change Data Capture Pipelines With Debezium and Kafka Streams
Gunnar Morling
Software Engineer
@gunnarmorling
2. Agenda
1 Recap: Debezium
  What's Change Data Capture?
  Use Cases
2 Kafka Streams with Quarkus
  Supersonic Subatomic Java
  The Kafka Streams Extension
3 Debezium + Kafka Streams =
  Data Enrichment
  Auditing
  Aggregate View Materialisation
3. Gunnar Morling
Open source software engineer at Red Hat
Debezium
Quarkus
Hibernate
Spec Lead for Bean Validation 2.0
Other projects: Deptective, MapStruct
Java Champion
#Debezium @gunnarmorling
4. Debezium
Enabling Zero-Code Data Streaming Pipelines
[Diagram: Postgres and MySQL are captured by Debezium connectors (DBZ PG, DBZ MySQL) running in Kafka Connect, which write to Apache Kafka; sink connectors in Kafka Connect (ES, JDBC, ISPN) feed a search index, a data warehouse, and a cache.]
20. Auditing
[Diagram: CRM Service, Source DB, Kafka Connect with Debezium (DBZ), Apache Kafka topic "Customer Events"]
"Transactions" table:
Id    User     Use Case
tx-1  Bob      Create Customer
tx-2  Sarah    Delete Customer
tx-3  Rebecca  Update Customer
21. Auditing
[Diagram: CRM Service, Source DB, Kafka Connect with Debezium (DBZ), Apache Kafka topics "Customer Events" and "Transactions"]
"Transactions" table as on slide 20.
22. Auditing
[Diagram: CRM Service, Source DB, Kafka Connect with Debezium (DBZ), Apache Kafka topics "Customer Events" and "Transactions", Kafka Streams]
"Transactions" table as on slide 20.
23. Auditing
[Diagram: CRM Service, Source DB, Kafka Connect with Debezium (DBZ), Apache Kafka topics "Customer Events" and "Transactions", Kafka Streams, output topic "Enriched Customer Events"]
"Transactions" table as on slide 20.
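The enrichment step in the diagram can be sketched in plain Java, deliberately without the Kafka Streams API: each customer change event is matched by transaction id against the metadata captured from the "Transactions" table. The names `AuditEnricher`, `TxMetadata`, `onTransactionMetadata` and `enrich` are hypothetical and not taken from the talk. In an actual Kafka Streams topology, the lookup map would roughly correspond to a state store or table, and events that arrive before their metadata would have to be buffered.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Plain-Java sketch of enriching change events with transaction metadata (hypothetical names). */
class AuditEnricher {

    /** One row of the "Transactions" table: who did what in a given transaction. */
    static final class TxMetadata {
        final String user;
        final String useCase;

        TxMetadata(String user, String useCase) {
            this.user = user;
            this.useCase = useCase;
        }
    }

    // Stands in for a state store keyed by transaction id.
    private final Map<String, TxMetadata> metadataByTxId = new HashMap<>();

    /** Called for each change event captured from the "Transactions" table. */
    void onTransactionMetadata(String txId, String user, String useCase) {
        metadataByTxId.put(txId, new TxMetadata(user, useCase));
    }

    /**
     * Called for each customer change event. Returns the enriched event, or
     * empty if the metadata for this transaction has not been seen yet.
     */
    Optional<String> enrich(String txId, String customerEvent) {
        TxMetadata meta = metadataByTxId.get(txId);
        if (meta == null) {
            return Optional.empty(); // a real pipeline would buffer the event and retry
        }
        return Optional.of(customerEvent
                + " [user=" + meta.user + ", useCase=" + meta.useCase + "]");
    }

    public static void main(String[] args) {
        AuditEnricher enricher = new AuditEnricher();
        enricher.onTransactionMetadata("tx-1", "Bob", "Create Customer");
        // Metadata for tx-1 is known, so the event is enriched.
        System.out.println(enricher.enrich("tx-1", "customer created").orElse("pending"));
        // Metadata for tx-2 has not arrived yet, so the event stays pending.
        System.out.println(enricher.enrich("tx-2", "customer deleted").orElse("pending"));
    }
}
```

Returning an empty `Optional` for unknown transactions keeps the ordering problem visible: the customer event and its metadata come from different topics, so either may arrive first.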
37. Aggregate View Materialization
Awareness of Transaction Boundaries
Topic with BEGIN/END markers
Enable consumers to buffer all events of one transaction

BEGIN marker:
{
  "transactionId" : "571",
  "eventType" : "begin transaction",
  "ts_ms": 1486500577125
}

END marker:
{
  "transactionId" : "571",
  "ts_ms": 1486500577691,
  "eventType" : "end transaction",
  "eventCount" : [
    {
      "name" : "dbserver1.inventory.order",
      "count" : 1
    },
    {
      "name" : "dbserver1.inventory.orderLine",
      "count" : 5
    }
  ]
}
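A consumer could use these markers to hold back events until a transaction is complete. The following plain-Java sketch (no Kafka Streams API; `TransactionBuffer` and its method names are hypothetical) buffers events per transaction id and releases them once the END marker has been seen and the number of buffered events reaches the sum of the counts in `eventCount`. It assumes that markers and data events can arrive in any order relative to each other, since they are read from different topics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of buffering change events per transaction (hypothetical names). */
class TransactionBuffer {

    // Events seen so far, grouped by transaction id.
    private final Map<String, List<String>> eventsByTx = new HashMap<>();
    // Total number of events announced by the END marker, per transaction id.
    private final Map<String, Integer> expectedByTx = new HashMap<>();

    /** Buffers a data event; returns all events of the transaction if it is now complete. */
    List<String> onDataEvent(String txId, String event) {
        eventsByTx.computeIfAbsent(txId, id -> new ArrayList<>()).add(event);
        return releaseIfComplete(txId);
    }

    /** Records the END marker; totalEventCount is the sum of all "count" values. */
    List<String> onEndMarker(String txId, int totalEventCount) {
        expectedByTx.put(txId, totalEventCount);
        return releaseIfComplete(txId);
    }

    // Returns the buffered events and clears the state once the END marker has
    // been seen and all announced events have arrived; otherwise an empty list.
    private List<String> releaseIfComplete(String txId) {
        Integer expected = expectedByTx.get(txId);
        List<String> buffered = eventsByTx.getOrDefault(txId, Collections.emptyList());
        if (expected == null || buffered.size() < expected) {
            return Collections.emptyList();
        }
        expectedByTx.remove(txId);
        eventsByTx.remove(txId);
        return buffered;
    }

    public static void main(String[] args) {
        TransactionBuffer buffer = new TransactionBuffer();
        buffer.onDataEvent("571", "order");
        for (int i = 1; i <= 4; i++) {
            buffer.onDataEvent("571", "orderLine-" + i);
        }
        // END marker announces 1 order + 5 order lines; only 5 events seen so far.
        System.out.println(buffer.onEndMarker("571", 6).size());
        // The last order line completes the transaction and releases all 6 events.
        System.out.println(buffer.onDataEvent("571", "orderLine-5").size());
    }
}
```

Checking completeness in both handlers is what makes the sketch tolerant of the END marker overtaking the last data events; a production consumer would additionally need to bound the buffer and handle transactions that never complete.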
38. Takeaways
Many Use Cases for Debezium and Kafka Streams
Data enrichment
Creating aggregated events
Interactive query services for legacy databases
39. Takeaways
Debezium + Kafka Streams =