Change Data Capture Pipelines With
Debezium and Kafka Streams
Gunnar Morling
Software Engineer
@gunnarmorling
1 Recap: Debezium
  What's Change Data Capture?
  Use Cases
2 Kafka Streams with Quarkus
  Supersonic Subatomic Java
  The Kafka Streams Extension
3 Debezium + Kafka Streams =
  Data Enrichment
  Auditing
  Aggregate View Materialization
Gunnar Morling
Open source software engineer at Red Hat
Debezium
Quarkus
Hibernate
Spec Lead for Bean Validation 2.0
Other projects: Deptective, MapStruct
Java Champion
#Debezium @gunnarmorling
Debezium
Enabling Zero-Code Data Streaming Pipelines
[Architecture diagram: Postgres and MySQL are captured by Debezium connectors (DBZ PG, DBZ MySQL) running in Kafka Connect; change events flow through Apache Kafka to sink connectors (JDBC, ES, ISPN) in a second Kafka Connect cluster, feeding a data warehouse, a search index, and a cache.]
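Such a source pipeline is set up declaratively, with no application code. A minimal sketch of registering a Debezium Postgres connector via Kafka Connect's REST API (hostname, credentials, and table list are illustrative; the logical server name matches the "dbserver1" seen in the change events below):

```json
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "inventory",
    "database.server.name": "dbserver1",
    "table.include.list": "inventory.customers"
  }
}
```

POSTed to Kafka Connect (`POST /connectors`), this is the whole "zero-code" setup for the capture side.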
Debezium
Low-Latency Change Data Streaming
https://medium.com/convoy-tech/
CDC – "Liberation for Your Data"
{
"before": null,
"after": {
"id": 1004,
"first_name": "Anne",
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"server_id": 0,
"ts_sec": 0,
"file": "mysql-bin.000003",
"pos": 154,
"row": 0,
"snapshot": true,
"db": "inventory",
"table": "customers"
},
"op": "c",
"ts_ms": 1486500577691
}
Change Event Structure
Key: Primary key of table
Value: Describing the change event
Old row state
New row state
Metadata
Quarkus
Supersonic Subatomic Java
“A Kubernetes Native Java stack tailored for OpenJDK HotSpot and GraalVM, crafted from the best of breed Java libraries and standards.”
Quarkus
Supersonic Subatomic Java
Developer joy
Imperative and Reactive
Best-of-breed libraries
Quarkus
The Kafka Streams Extension
Management of topology
Health checks
Dev Mode
Support for native binaries via GraalVM
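As a sketch of what the extension's configuration can look like in application.properties (topic and server names are illustrative): `quarkus.kafka-streams.topics` lists the topics the health check waits for, and keys prefixed with `kafka-streams.` are passed through to the Kafka Streams engine.

```properties
quarkus.kafka-streams.application-id=auditlog-enricher
quarkus.kafka-streams.bootstrap-servers=localhost:9092
quarkus.kafka-streams.topics=dbserver1.inventory.customers,dbserver1.inventory.transactions

# Passed through to Kafka Streams
kafka-streams.commit.interval.ms=1000
```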
Data Enrichment: Demo
Auditing
[Architecture diagram: a CRM service writes customer data and a "Transactions" table to the source DB; Debezium in Kafka Connect streams both into Apache Kafka as "Customer Events" and "Transactions" topics; a Kafka Streams application joins them and emits "Enriched Customer Events".]

"Transactions" table:
Id    User     Use Case
tx-1  Bob      Create Customer
tx-2  Sarah    Delete Customer
tx-3  Rebecca  Update Customer
Auditing
{
"before": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@example.com"
},
"after": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"table": "customers",
"txId": "tx-3"
},
"op": "u",
"ts_ms": 1486500577691
}
Customers
The Transactions and Customers events share the transaction id, which serves as the join key:

{
  "id": "tx-3"
}

Transactions:
{
  "before": null,
  "after": {
    "id": "tx-3",
    "user": "Rebecca",
    "use_case": "Update customer"
  },
  "source": {
    "name": "dbserver1",
    "table": "transactions",
    "txId": "tx-3"
  },
  "op": "c",
  "ts_ms": 1486500577691
}

Customers: the change event shown above, carrying the same "txId": "tx-3" in its source block.
{
"before": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@example.com"
},
"after": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"table": "customers",
"txId": "tx-3",
"user": "Rebecca",
"use_case": "Update customer"
},
"op": "u",
"ts_ms": 1486500577691
}
Enriched Customers
Auditing
Auditing
Non-trivial join implementation:
  No ordering across topics
  Need to buffer change events until TX data available
bit.ly/debezium-auditlogs

@Override
public KeyValue<JsonObject, JsonObject> transform(JsonObject key, JsonObject value) {
    boolean enrichedAllBufferedEvents = enrichAndEmitBufferedEvents();

    if (!enrichedAllBufferedEvents) {
        bufferChangeEvent(key, value);
        return null;
    }

    KeyValue<JsonObject, JsonObject> enriched = enrichWithTxMetaData(key, value);

    if (enriched == null) {
        bufferChangeEvent(key, value);
    }

    return enriched;
}
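The buffering idea can be sketched without any Kafka machinery as plain Java (names and map-based payloads are illustrative, not the talk's actual implementation): change events whose transaction metadata has not arrived yet are parked per transaction id and flushed, enriched, once it does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the "buffer until TX metadata is available" join.
class TxMetadataJoin {
    private final Map<String, Map<String, String>> txMetadata = new HashMap<>();
    private final Map<String, List<Map<String, String>>> buffered = new HashMap<>();

    // A transaction record (user, use case) arrived from the transactions topic.
    // Returns all change events that were waiting for it, now enriched.
    List<Map<String, String>> onTransaction(String txId, Map<String, String> metadata) {
        txMetadata.put(txId, metadata);
        List<Map<String, String>> result = new ArrayList<>();
        List<Map<String, String>> waiting = buffered.remove(txId);
        if (waiting != null) {
            for (Map<String, String> event : waiting) {
                result.add(enrich(event, metadata));
            }
        }
        return result;
    }

    // A change event arrived: emit it enriched if the TX metadata is already
    // known, otherwise buffer it and emit nothing (null).
    Map<String, String> onChangeEvent(String txId, Map<String, String> event) {
        Map<String, String> metadata = txMetadata.get(txId);
        if (metadata == null) {
            buffered.computeIfAbsent(txId, k -> new ArrayList<>()).add(event);
            return null;
        }
        return enrich(event, metadata);
    }

    private Map<String, String> enrich(Map<String, String> event, Map<String, String> metadata) {
        Map<String, String> enriched = new HashMap<>(event);
        enriched.putAll(metadata);
        return enriched;
    }
}
```

In the actual Kafka Streams transformer this state lives in a state store, so it survives restarts and rebalances.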
Aggregate View Materialization
From Multiple Topics to One View
PurchaseOrder (1) to OrderLine (n):

{
  "purchaseOrderId" : "order-123",
  "orderDate" : "2020-08-24",
  "customerId": "customer-123",
  "orderLines" : [
    {
      "orderLineId" : "orderLine-456",
      "productId" : "product-789",
      "quantity" : 2,
      "price" : 59.99
    },
    {
      "orderLineId" : "orderLine-234",
      "productId" : "product-567",
      "quantity" : 1,
      "price" : 14.99
    }
  ]
}
Aggregate View Materialization
Non-Key Joins (KIP-213)

KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;

KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
    .join(
        purchaseOrders,
        orderLine -> orderLine.purchaseOrderId,
        (orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
    .groupBy(
        (orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
    .aggregate(
        PurchaseOrderWithLines::new,
        (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
        (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
    );
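The snippet assumes value types with adder/subtractor semantics; since the aggregator lambdas return whatever `addLine`/`removeLine` return, those methods must hand back the updated aggregate. A minimal sketch of those types (class names from the slide, bodies assumed):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal, assumed value types for the KIP-213 example above.
class OrderLine {
    final long id;
    final int purchaseOrderId;
    OrderLine(long id, int purchaseOrderId) { this.id = id; this.purchaseOrderId = purchaseOrderId; }
}

class PurchaseOrder {
    final int id;
    PurchaseOrder(int id) { this.id = id; }
}

class OrderLineAndPurchaseOrder {
    final OrderLine orderLine;
    final PurchaseOrder purchaseOrder;
    OrderLineAndPurchaseOrder(OrderLine line, PurchaseOrder order) {
        this.orderLine = line;
        this.purchaseOrder = order;
    }
}

class PurchaseOrderWithLines {
    PurchaseOrder purchaseOrder;
    final List<OrderLine> orderLines = new ArrayList<>();

    // Adder: called when a joined record is added to the group; returns
    // the aggregate so it can be used directly in the lambda.
    PurchaseOrderWithLines addLine(OrderLineAndPurchaseOrder value) {
        purchaseOrder = value.purchaseOrder;
        orderLines.add(value.orderLine);
        return this;
    }

    // Subtractor: called when a joined record is removed or re-keyed.
    PurchaseOrderWithLines removeLine(OrderLineAndPurchaseOrder value) {
        orderLines.removeIf(line -> line.id == value.orderLine.id);
        return this;
    }
}
```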
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Awareness of Transaction Boundaries
TX metadata in change events (e.g. dbserver1.inventory.orderline)
{
"before": null,
"after": { ... },
"source": { ... },
"op": "c",
"ts_ms": "1580390884335",
"transaction": {
"id": "571",
"total_order": "1",
"data_collection_order": "1"
}
}
Aggregate View Materialization
Awareness of Transaction Boundaries
Topic with BEGIN/END markers
Enable consumers to buffer all events of one transaction
BEGIN:
{
  "transactionId" : "571",
  "eventType" : "begin transaction",
  "ts_ms": 1486500577125
}

END:
{
  "transactionId" : "571",
  "ts_ms": 1486500577691,
  "eventType" : "end transaction",
  "eventCount" : [
    {
      "name" : "dbserver1.inventory.order",
      "count" : 1
    },
    {
      "name" : "dbserver1.inventory.orderLine",
      "count" : 5
    }
  ]
}
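A consumer can use the END marker's event counts to decide when it has seen a complete transaction, regardless of the order in which markers and data events arrive across topics. An illustrative sketch (plain Java, names assumed, events reduced to strings):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative consumer-side buffer: data events are held per transaction id
// and released as one batch once the END marker's expected count is reached.
class TransactionBuffer {
    private final Map<String, List<String>> eventsByTx = new HashMap<>();
    private final Map<String, Integer> expectedByTx = new HashMap<>();

    // Data event from a change topic; returns the full batch once complete.
    List<String> onDataEvent(String txId, String event) {
        eventsByTx.computeIfAbsent(txId, k -> new ArrayList<>()).add(event);
        return maybeComplete(txId);
    }

    // END marker from the transaction topic, carrying the total event count.
    List<String> onEndMarker(String txId, int eventCount) {
        expectedByTx.put(txId, eventCount);
        return maybeComplete(txId);
    }

    private List<String> maybeComplete(String txId) {
        Integer expected = expectedByTx.get(txId);
        List<String> events = eventsByTx.get(txId);
        if (expected == null || events == null || events.size() < expected) {
            return null;  // still waiting for the marker or for more events
        }
        expectedByTx.remove(txId);
        return eventsByTx.remove(txId);
    }
}
```

Checking completion on both paths matters because the END marker may arrive before the last data events of the transaction.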
Takeaways
Many Use Cases for Debezium and Kafka Streams
Data enrichment
Creating aggregated events
Interactive query services for legacy databases
Debezium + Kafka Streams
Resources
Website: debezium.io
Examples: github.com/debezium/debezium-examples/
Latest news: @debezium
gunnar@hibernate.org
@gunnarmorling
Q&A
Change Data Capture Pipelines with Debezium and Kafka Streams (Gunnar Morling, Red Hat) Kafka Summit 2020

2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Change Data Capture Pipelines with Debezium and Kafka Streams (Gunnar Morling, Red Hat) Kafka Summit 2020

  • 1. Change Data Capture Pipelines With Debezium and Kafka Streams Gunnar Morling Software Engineer @gunnarmorling
  • 2. Recap: Debezium What's Change Data Capture? Use Cases Kafka Streams with Quarkus Supersonic Subatomic Java The Kafka Streams Extension Debezium + Kafka Streams = Data Enrichment Auditing Aggregate View Materialisation 1 2 3
  • 3. Gunnar Morling Open source software engineer at Red Hat Debezium Quarkus Hibernate Spec Lead for Bean Validation 2.0 Other projects: Deptective, MapStruct Java Champion #Debezium @gunnarmorling
  • 4. @gunnarmorling Postgres MySQL Kafka Connect Kafka ConnectApache Kafka DBZ PG DBZ MySQL Search Index ES Connector JDBC Connector ES Connector ISPN Connector Cache Debezium Enabling Zero-Code Data Streaming Pipelines Data Warehouse #Debezium
  • 5. @gunnarmorling Debezium Low-Latency Change Data Streaming https://medium.com/convoy-tech/ #Debezium
  • 7. @gunnarmorling CDC – "Liberation for Your Data" #Debezium
  • 9. Change Event Structure
Key: primary key of table
Value: describing the change event (old row state, new row state, metadata)

{
  "before": null,
  "after": {
    "id": 1004,
    "first_name": "Anne",
    "last_name": "Kretchmar",
    "email": "annek@noanswer.org"
  },
  "source": {
    "name": "dbserver1",
    "server_id": 0,
    "ts_sec": 0,
    "file": "mysql-bin.000003",
    "pos": 154,
    "row": 0,
    "snapshot": true,
    "db": "inventory",
    "table": "customers"
  },
  "op": "c",
  "ts_ms": 1486500577691
}
@gunnarmorling #Debezium
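The single-letter "op" codes in the envelope are easy to misread. As a quick illustration (this helper is made up for this writeup, it is not part of Debezium's API), they can be translated to readable names:

```java
// Hypothetical helper for interpreting the "op" field of a Debezium
// change event envelope; class and method names are illustrative.
public class ChangeEventOps {

    // Debezium uses single-letter codes for the operation type.
    public static String describeOp(String op) {
        switch (op) {
            case "c": return "create";  // row inserted
            case "u": return "update";  // row updated
            case "d": return "delete";  // row deleted
            case "r": return "read";    // row read during initial snapshot
            default:  return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeOp("c")); // prints "create"
    }
}
```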
  • 10. Recap: Debezium What's Change Data Capture? Use Cases 1 2 3 Debezium + Kafka Streams = Data Enrichment Auditing Aggregate View Materialisation Kafka Streams with Quarkus Supersonic Subatomic Java The Kafka Streams Extension
  • 11. @gunnarmorling Quarkus Supersonic Subatomic Java “ A Kubernetes Native Java stack tailored for OpenJDK HotSpot and GraalVM, crafted from the best of breed Java libraries and standards. #Debezium
  • 13. @gunnarmorling#Debezium Quarkus Supersonic Subatomic Java Developer joy Imperative and Reactive Best-of-breed libraries
  • 15. Quarkus The Kafka Streams Extension Management of topology Health checks Dev Mode Support for native binaries via GraalVM @gunnarmorling#Debezium
  • 16. Recap: Debezium What's Change Data Capture? Use Cases 1 3 2 Debezium + Kafka Streams = Data Enrichment Auditing Aggregate View Materialisation Kafka Streams with Quarkus Supersonic Subatomic Java The Kafka Streams Extension
  • 19. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events CRM Service #Debezium
  • 20. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events CRM Service Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table #Debezium
  • 21. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events Transactions CRM Service Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table #Debezium
  • 22. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events Transactions CRM Service Kafka Streams Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table #Debezium
  • 23. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events Transactions CRM Service Kafka Streams Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table Enriched Customer Events #Debezium
  • 24. @gunnarmorling Auditing
Customers:
{
  "before": { "id": 1004, "last_name": "Kretchmar", "email": "annek@example.com" },
  "after": { "id": 1004, "last_name": "Kretchmar", "email": "annek@noanswer.org" },
  "source": { "name": "dbserver1", "table": "customers", "txId": "tx-3" },
  "op": "u",
  "ts_ms": 1486500577691
}
#Debezium
  • 25. @gunnarmorling
Customers (key: { "id": "tx-3" }):
{
  "before": { "id": 1004, "last_name": "Kretchmar", "email": "annek@example.com" },
  "after": { "id": 1004, "last_name": "Kretchmar", "email": "annek@noanswer.org" },
  "source": { "name": "dbserver1", "table": "customers", "txId": "tx-3" },
  "op": "u",
  "ts_ms": 1486500577691
}

Transactions:
{
  "before": null,
  "after": { "id": "tx-3", "user": "Rebecca", "use_case": "Update customer" },
  "source": { "name": "dbserver1", "table": "transactions", "txId": "tx-3" },
  "op": "c",
  "ts_ms": 1486500577691
}
#Debezium
  • 27. @gunnarmorling Auditing
Enriched Customers:
{
  "before": { "id": 1004, "last_name": "Kretchmar", "email": "annek@example.com" },
  "after": { "id": 1004, "last_name": "Kretchmar", "email": "annek@noanswer.org" },
  "source": {
    "name": "dbserver1",
    "table": "customers",
    "txId": "tx-3",
    "user": "Rebecca",
    "use_case": "Update customer"
  },
  "op": "u",
  "ts_ms": 1486500577691
}
#Debezium
  • 28. @gunnarmorling Auditing
Non-trivial join implementation: no ordering across topics, so change events must be buffered until the TX metadata is available
bit.ly/debezium-auditlogs

@Override
public KeyValue<JsonObject, JsonObject> transform(JsonObject key, JsonObject value) {
    boolean enrichedAllBufferedEvents = enrichAndEmitBufferedEvents();

    if (!enrichedAllBufferedEvents) {
        bufferChangeEvent(key, value);
        return null;
    }

    KeyValue<JsonObject, JsonObject> enriched = enrichWithTxMetaData(key, value);

    if (enriched == null) {
        bufferChangeEvent(key, value);
    }

    return enriched;
}
#Debezium
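The buffering idea behind that transformer can be sketched with plain Java collections (class and method names here are illustrative, not the talk's actual code): customer events arriving before their transaction metadata are parked per transaction id, and flushed once the matching transaction record shows up on the other topic:

```java
import java.util.*;

// Illustrative sketch of the buffering approach: customer change events
// referencing a transaction id are held back until the corresponding
// transaction metadata record has arrived from the "transactions" topic.
public class TxEnrichmentBuffer {

    private final Map<String, String> txMetadata = new HashMap<>();     // txId -> user
    private final Map<String, List<String>> buffered = new HashMap<>(); // txId -> pending events

    // Called for records from the "transactions" topic; returns buffered
    // events that can now be enriched and emitted downstream.
    public List<String> onTransaction(String txId, String user) {
        txMetadata.put(txId, user);
        List<String> pending = buffered.remove(txId);
        List<String> enriched = new ArrayList<>();
        if (pending != null) {
            for (String event : pending) {
                enriched.add(event + " [by " + user + "]");
            }
        }
        return enriched;
    }

    // Called for records from the "customers" topic; returns the enriched
    // event, or null if it must be buffered until the TX metadata arrives.
    public String onChangeEvent(String txId, String event) {
        String user = txMetadata.get(txId);
        if (user == null) {
            buffered.computeIfAbsent(txId, k -> new ArrayList<>()).add(event);
            return null;
        }
        return event + " [by " + user + "]";
    }
}
```

In the real topology this state would live in a Kafka Streams state store rather than in-memory maps, so it survives rebalances and restarts.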
  • 30. Aggregate View Materialization
From Multiple Topics to One View
PurchaseOrder 1..n OrderLine

{
  "purchaseOrderId": "order-123",
  "orderDate": "2020-08-24",
  "customerId": "customer-123",
  "orderLines": [
    {
      "orderLineId": "orderLine-456",
      "productId": "product-789",
      "quantity": 2,
      "price": 59.99
    },
    {
      "orderLineId": "orderLine-234",
      "productId": "product-567",
      "quantity": 1,
      "price": 14.99
    }
  ]
}
@gunnarmorling #Debezium
  • 31. @gunnarmorling #Debezium Aggregate View Materialization
Non-Key Joins (KIP-213)

KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;

KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines =
    orderLines
        .join(
            purchaseOrders,
            orderLine -> orderLine.purchaseOrderId,
            (orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
        .groupBy(
            (orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
        .aggregate(
            PurchaseOrderWithLines::new,
            (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
            (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value));
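The adder and subtractor in that aggregation can be sketched in plain Java (the slide only names addLine/removeLine, so the internals here are assumptions): groupBy re-keys each order line by its purchase order id, and aggregate applies addLine under the new grouping and removeLine under the old one when a line changes:

```java
import java.util.*;

// Illustrative sketch of the aggregate type from the KIP-213 example;
// field choices and internals are assumed, only the method names come
// from the slide.
public class PurchaseOrderWithLines {

    private final Map<Long, Double> lines = new HashMap<>(); // orderLineId -> price

    public PurchaseOrderWithLines addLine(long orderLineId, double price) {
        lines.put(orderLineId, price);
        return this; // Kafka Streams aggregators return the updated aggregate
    }

    public PurchaseOrderWithLines removeLine(long orderLineId) {
        lines.remove(orderLineId);
        return this;
    }

    public int lineCount() {
        return lines.size();
    }
}
```

Keying the lines by id makes the adder and subtractor idempotent per line, which matters because Kafka Streams may re-apply them during reprocessing.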
  • 36. @gunnarmorling #Debezium Aggregate View Materialization
Awareness of Transaction Boundaries
TX metadata in change events (e.g. dbserver1.inventory.orderline)

{
  "before": null,
  "after": { ... },
  "source": { ... },
  "op": "c",
  "ts_ms": "1580390884335",
  "transaction": {
    "id": "571",
    "total_order": "1",
    "data_collection_order": "1"
  }
}
  • 37. Aggregate View Materialization
Awareness of Transaction Boundaries
Topic with BEGIN/END markers, enabling consumers to buffer all events of one transaction

BEGIN:
{
  "transactionId": "571",
  "eventType": "begin transaction",
  "ts_ms": 1486500577125
}

END:
{
  "transactionId": "571",
  "ts_ms": 1486500577691,
  "eventType": "end transaction",
  "eventCount": [
    { "name": "dbserver1.inventory.order", "count": 1 },
    { "name": "dbserver1.inventory.orderLine", "count": 5 }
  ]
}
@gunnarmorling #Debezium
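A consumer can use the END marker's event count to know when a transaction is complete. A minimal sketch in plain Java (class name and representation are assumptions for this writeup): data events and markers are buffered per transaction id, and the whole transaction is emitted as one unit once all expected events have been seen:

```java
import java.util.*;

// Illustrative sketch: buffer data events per transaction id and emit
// them as one unit once the END marker's expected event count is reached.
public class TxBoundaryBuffer {

    private final Map<String, Integer> expected = new HashMap<>();    // txId -> total events
    private final Map<String, List<String>> events = new HashMap<>(); // txId -> buffered events

    // Called when the END marker for a transaction arrives; the total is
    // the sum of the per-collection counts in the marker.
    public List<String> onEndMarker(String txId, int totalEventCount) {
        expected.put(txId, totalEventCount);
        return maybeComplete(txId);
    }

    // Called for each data change event carrying transaction metadata.
    public List<String> onEvent(String txId, String event) {
        events.computeIfAbsent(txId, k -> new ArrayList<>()).add(event);
        return maybeComplete(txId);
    }

    // Emits the full transaction once all expected events have been seen;
    // returns null while the transaction is still incomplete.
    private List<String> maybeComplete(String txId) {
        Integer total = expected.get(txId);
        List<String> buffered = events.getOrDefault(txId, Collections.emptyList());
        if (total != null && buffered.size() >= total) {
            expected.remove(txId);
            return events.remove(txId);
        }
        return null;
    }
}
```

Handling both orders of arrival (events before or after the END marker) is the point: the markers topic and the data topics give no cross-topic ordering guarantee.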
  • 38. Takeaways Many Use Cases for Debezium and Kafka Streams Data enrichment Creating aggregated events Interactive query services for legacy databases @gunnarmorling#Debezium