From Zero to Hero with Kafka Connect
a.k.a. A practical guide to becoming l33t with Kafka Connect
@rmoff
What is Kafka Connect?
Streaming Integration with Kafka Connect
[Diagram: Sources (syslog) -> Kafka Connect -> Kafka Brokers -> Kafka Connect -> Sinks (Amazon S3, Google BigQuery)]
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "connection.url": "jdbc:mysql://asgard:3306/demo",
  "table.whitelist": "sales,orders,customers"
}
https://docs.confluent.io/current/connect/
Look Ma, No Code!
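A minimal sketch of how a config like the one above might be wrapped for submission to the Kafka Connect REST API, which accepts a JSON body with `name` and `config` on `POST /connectors`. The helper `build_create_request` and the connector name `jdbc_source_demo` are hypothetical, chosen only for illustration.

```python
import json


def build_create_request(name, config):
    """Wrap a connector config in the JSON body used by POST /connectors."""
    if "connector.class" not in config:
        raise ValueError("connector.class is required")
    return json.dumps({"name": name, "config": config}, indent=2)


# Config values copied from the slide; the connector name is an assumption.
config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://asgard:3306/demo",
    "table.whitelist": "sales,orders,customers",
}
body = build_create_request("jdbc_source_demo", config)
print(body)
```

The point of the slide still holds: the only "code" involved is this declarative configuration.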
Streaming Pipelines
[Diagram: RDBMS -> Kafka Connect -> Kafka -> Kafka Connect -> Amazon S3, HDFS]
Writing to data stores from Kafka
[Diagram: App -> Kafka -> Kafka Connect -> Data Store]
Evolve processing from old systems to new
[Diagram: Existing App -> RDBMS -> Kafka Connect -> Kafka -> New App]
Demo
http://rmoff.dev/kafka-connect-code
Configuring Kafka Connect
Inside the API - connectors, transforms, converters
Kafka Connect basics
[Diagram: Source -> Kafka Connect -> Kafka]
Connectors
[Diagram: Source -> Kafka Connect (Connector) -> Kafka]
Connectors
"config": {
  [...]
  "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
  "connection.url": "jdbc:postgresql://postgres:5432/",
  "topics": "asgard.demo.orders"
}
Connectors
[Diagram: Source -> (native data) -> Connector -> (Connect Record) -> Kafka]
Converters
[Diagram: Source -> (native data) -> Connector -> (Connect Record) -> Converter -> (bytes[]) -> Kafka]
Serialisation & Schemas
-> Confluent Schema Registry
Avro, Protobuf, JSON, CSV
https://qconnewyork.com/system/files/presentation-slides/qcon_17_-_schemas_and_apis.pdf
The Confluent Schema Registry
[Diagram: Source -> Kafka Connect -> (Avro message) -> Kafka -> (Avro message) -> Kafka Connect -> Target; both Kafka Connect instances exchange the Avro schema with the Schema Registry]
Converters
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
Set as a global default per worker; can optionally be overridden per connector
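A rough sketch of the "worker default, connector override" rule. It is a simplified model, not Kafka Connect's actual code: it assumes that when a connector sets `key.converter` or `value.converter`, that converter's settings come from the connector config and not from the worker. The function name `effective_converters` and the JSON override below are illustrative assumptions.

```python
# Worker-level defaults, as on the slide.
WORKER_PROPS = {
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
}


def effective_converters(worker_props, connector_config):
    """Pick converter settings: connector override wins, else worker default."""
    result = {}
    for prefix in ("key.converter", "value.converter"):
        # Assumption: an override replaces the whole group of prefixed keys.
        source = connector_config if prefix in connector_config else worker_props
        for key, value in source.items():
            if key == prefix or key.startswith(prefix + "."):
                result[key] = value
    return result


# Hypothetical connector that reads plain JSON values instead of Avro.
connector_config = {
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
}
effective = effective_converters(WORKER_PROPS, connector_config)
for key in sorted(effective):
    print(key, "=", effective[key])
```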
What about internal converters?
value.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
key.internal.value.converter=org.apache.kafka.connect.json.JsonConverter
value.internal.value.converter=org.apache.kafka.connect.json.JsonConverter
key.internal.value.converter.bork.bork.bork=org.apache.kafka.connect.json.JsonConverter
key.internal.value.please.just.work.converter=org.apache.kafka.connect.json.JsonConverter
Single Message Transforms
[Diagram: Source -> Connector -> Transform(s) -> Converter -> Kafka]
Single Message Transforms
"config": {
  [...]
  "transforms": "addDateToTopic,labelFooBar",
  "transforms.addDateToTopic.type": "org.apache.kafka.connect.transforms.TimestampRouter",
  "transforms.addDateToTopic.topic.format": "${topic}-${timestamp}",
  "transforms.addDateToTopic.timestamp.format": "yyyyMM",
  "transforms.labelFooBar.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
  "transforms.labelFooBar.renames": "delivery_address:shipping_address"
}
"transforms": do these transforms
"transforms.<name>.*": config per transform
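To make the effect of the two transforms concrete, here is a small Python emulation of what they do to a single record. It is only an illustration of the behaviour, not the Java implementation: the record layout, the function names, and the sample timestamp are assumptions, and `strftime`'s `%Y%m` stands in for a Java `SimpleDateFormat` year-month pattern.

```python
from datetime import datetime, timezone


def timestamp_router(record, topic_format="${topic}-${timestamp}", strftime_format="%Y%m"):
    """Emulate TimestampRouter: derive the target topic from the record timestamp."""
    stamp = datetime.fromtimestamp(record["timestamp_ms"] / 1000, tz=timezone.utc)
    topic = (topic_format
             .replace("${topic}", record["topic"])
             .replace("${timestamp}", stamp.strftime(strftime_format)))
    return {**record, "topic": topic}


def replace_field_renames(record, renames):
    """Emulate ReplaceField$Value renames: 'old:new' pairs applied to the value."""
    mapping = dict(pair.split(":") for pair in renames.split(","))
    value = {mapping.get(name, name): v for name, v in record["value"].items()}
    return {**record, "value": value}


# Hypothetical record; 1557878400000 ms is 2019-05-15 00:00:00 UTC.
record = {
    "topic": "orders",
    "timestamp_ms": 1557878400000,
    "value": {"id": 1, "delivery_address": "1 Main St"},
}

# Transforms run in the order listed in "transforms".
routed = timestamp_router(record)
renamed = replace_field_renames(routed, "delivery_address:shipping_address")
print(renamed["topic"], renamed["value"])
```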
Extensible
Connector
Transform(s)
Converter
Confluent Hub
hub.confluent.io
Deploying Kafka Connect
Connectors, Tasks, and Workers
Connectors and Tasks
[Diagram: JDBC Source connector -> JDBC Task #1, JDBC Task #2; S3 Sink connector -> S3 Task #1]
Tasks and Workers
[Diagram: a Worker runs JDBC Task #1, JDBC Task #2 and S3 Task #1]
Kafka Connect Standalone Worker
[Diagram: a single Worker runs JDBC Task #1, JDBC Task #2 and S3 Task #1, with its own Offsets]
"Scaling" the Standalone Worker
[Diagram: two separate Workers, each with its own Offsets; JDBC Task #1, JDBC Task #2 and S3 Task #1 are split between them]
Fault-tolerant? Nope.
Kafka Connect Distributed Worker
[Diagram: Kafka Connect cluster with one Worker running JDBC Task #1, JDBC Task #2 and S3 Task #1; Offsets, Config and Status are stored in Kafka topics]
Fault-tolerant? Yeah!
Scaling the Distributed Worker
[Diagram: Kafka Connect cluster with two Workers sharing JDBC Task #1, JDBC Task #2 and S3 Task #1; Offsets, Config, Status]
Fault-tolerant? Yeah!
Distributed Worker - fault tolerance
[Diagram: Kafka Connect cluster with two Workers; when one Worker is lost, JDBC Task #2 is restarted on the remaining Worker alongside JDBC Task #1 and S3 Task #1; Offsets, Config, Status]
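The fault-tolerance behaviour can be pictured with a toy model. This is only a sketch: Kafka Connect's real rebalance protocol is more involved, and the round-robin policy, the function name `assign_tasks`, and the worker names are assumptions made for illustration.

```python
def assign_tasks(tasks, workers):
    """Spread tasks over the live workers round-robin (toy model only)."""
    if not workers:
        raise ValueError("no live workers in the cluster")
    assignment = {worker: [] for worker in workers}
    for index, task in enumerate(tasks):
        assignment[workers[index % len(workers)]].append(task)
    return assignment


tasks = ["JDBC Task #1", "JDBC Task #2", "S3 Task #1"]

# Two workers: the tasks are shared between them.
before = assign_tasks(tasks, ["worker-1", "worker-2"])
# worker-2 is lost: every task ends up on the surviving worker.
after = assign_tasks(tasks, ["worker-1"])

print(before)
print(after)
```

Because offsets, config and status live in Kafka and not on a worker, the surviving worker can pick up the reassigned task where it left off.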
Multiple Distributed Clusters
[Diagram: Kafka Connect cluster #1 runs JDBC Task #1 and S3 Task #1; Kafka Connect cluster #2 runs JDBC Task #2; each cluster has its own Offsets, Config and Status]
Containers
Kafka Connect images on Docker Hub
confluentinc/cp-kafka-connect-base
kafka-connect-elasticsearch
kafka-connect-jdbc
kafka-connect-hdfs
[…]
confluentinc/cp-kafka-connect
Adding connectors to a container
confluentinc/cp-kafka-connect-base
JAR
Confluent Hub
At runtime
[Diagram: JAR + confluentinc/cp-kafka-connect-base]
kafka-connect:
  image: confluentinc/cp-kafka-connect:5.2.1
  environment:
    CONNECT_PLUGIN_PATH: '/usr/share/java,/usr/share/confluent-hub-components'
  command:
    - bash
    - -c
    - |
      confluent-hub install --no-prompt neo4j/kafka-connect-neo4j:1.0.0
      /etc/confluent/docker/run
http://rmoff.dev/ksln19-connect-docker
Build a new image
[Diagram: JAR + confluentinc/cp-kafka-connect-base]
FROM confluentinc/cp-kafka-connect:5.2.1
ENV CONNECT_PLUGIN_PATH="/usr/share/java,/usr/share/confluent-hub-components"
RUN confluent-hub install --no-prompt neo4j/kafka-connect-neo4j:1.0.0
Automating connector creation
#
# Download JDBC drivers
cd /usr/share/java/kafka-connect-jdbc/
curl https://cdn.mysql.com/Downloads/Connector-J/mysql-connector-java-8.0.13.tar.gz | tar xz
#
# Now launch Kafka Connect
/etc/confluent/docker/run &
#
# Wait for Kafka Connect listener
while [ $$(curl -s -o /dev/null -w %{http_code} http://$$CONNECT_REST_ADVERTISED_HOST_NAME:$…
  echo -e $$(date) " Kafka Connect listener HTTP state: " $$(curl -s -o /dev/null -w %{http_…
  sleep 5
done
#
# Create JDBC Source connector
curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d '{
  "name": "jdbc_source_mysql_00",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://mysql:3306/demo",
    "connection.user": "connect_user",
    "connection.password": "asgard",
    "topic.prefix": "mysql-00-",
    "table.whitelist": "demo.customers"
  }
}'
# Don't let the container die
sleep infinity
http://rmoff.dev/ksln19-connect-docker
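The wait-then-create pattern in the script can be sketched in Python as follows. The loop condition on the slide is truncated, so the readiness rule here (treat HTTP 200 as "listener is up") is an assumption, as are the function name `wait_until_ready` and its parameters. The probe is passed in as a function so the sketch runs without a Kafka Connect instance.

```python
import time


def wait_until_ready(probe, attempts=10, delay_s=5.0, sleep=time.sleep):
    """Poll probe() until it returns HTTP 200; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        if probe() == 200:  # assumed readiness condition
            return attempt
        sleep(delay_s)
    raise TimeoutError(f"listener not ready after {attempts} attempts")


# Simulated status codes: two failed connections, then the listener answers.
codes = iter([0, 0, 200])
slept = []
attempt = wait_until_ready(lambda: next(codes), attempts=5, sleep=slept.append)
print("ready on attempt", attempt, "after sleeping", sum(slept), "seconds")
```

Only once this returns would the connector-creation request be sent, which mirrors the ordering in the shell script.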
Troubleshooting Kafka Connect
$ curl -s "http://localhost:8083/connectors/source-debezium-orders/status" | 
jq '.connector.state'
"RUNNING"
$ curl -s "http://localhost:8083/connectors/source-debezium-orders/status" | 
jq '.tasks[0].state'
"FAILED"
http://go.rmoff.net/connector-status
[Diagram: Connector RUNNING, Task FAILED]
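Since a connector can report RUNNING while one of its tasks has FAILED, it helps to check both. The sketch below does so on a status document shaped like the response of `GET /connectors/<name>/status`; the sample values, the worker address, and the helper name `failed_tasks` are made up for illustration.

```python
def failed_tasks(status):
    """Return id, worker and first trace line for every FAILED task."""
    failures = []
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            trace = task.get("trace", "")
            first_line = trace.splitlines()[0] if trace else ""
            failures.append({"id": task["id"],
                             "worker_id": task.get("worker_id"),
                             "error": first_line})
    return failures


# Hypothetical status document; field names follow the Connect REST API.
status = {
    "name": "source-debezium-orders",
    "connector": {"state": "RUNNING", "worker_id": "kafka-connect:8083"},
    "tasks": [
        {"id": 0, "state": "FAILED", "worker_id": "kafka-connect:8083",
         "trace": "org.apache.kafka.connect.errors.ConnectException\n\tat ..."},
    ],
}

print("connector:", status["connector"]["state"])
for failure in failed_tasks(status):
    print("task", failure["id"], "FAILED:", failure["error"])
```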
Troubleshooting Kafka Connect
$ curl -s "http://localhost:8083/connectors/source-debezium-orders-00/status" | jq '.tasks[0].trace'
"org.apache.kafka.connect.errors.ConnectException
\n\tat io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)
\n\tat io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:197)
\n\tat io.debezium.connector.mysql.BinlogReader$ReaderThreadLifecycleListener.onCommunicationFailure(BinlogReader.java:1018)
\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:950)
\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)
\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)
\n\tat java.lang.Thread.run(Thread.java:748)
\nCaused by: java.io.EOFException
\n\tat com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:190)
\n\tat com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readInteger(ByteArrayInputStream.java:46)
\n\tat com.github.shyiko.mysql.binlog.event.deserialization.EventHeaderV4Deserializer.deserialize(EventHeaderV4Deserializer.java:35)
\n\tat com.github.shyiko.mysql.binlog.event.deserialization.EventHeaderV4Deserializer.deserialize(EventHeaderV4Deserializer.java:27)
\n\tat com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:212)
\n\tat io.debezium.connector.mysql.BinlogReader$1.nextEvent(BinlogReader.java:224)
\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:922)
\n\t... 3 more\n"
The log is the source of truth
$ confluent log connect
$ docker-compose logs kafka-connect
$ cat /var/log/kafka/connect.log
Kafka Connect
"Task is being killed and will not recover until manually restarted"
Symptom not Cause
Error Handling and Dead Letter Queues
org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
Mismatched converters
"value.converter": "AvroConverter"
Messages are not Avro
org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
ⓘ Use the correct Converter for the source data
Mixed serialisation methods
"value.converter": "AvroConverter"
Some messages are not Avro
org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
ⓘ Use error handling to deal with bad messages
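The "magic byte" in this error refers to the Confluent Schema Registry wire format: a serialised message starts with a zero byte followed by a 4-byte big-endian schema ID. A quick sketch for spotting messages that do not follow that framing; the function name and the sample payloads are illustrative assumptions.

```python
import struct

MAGIC_BYTE = 0  # first byte of a Schema Registry framed message


def schema_id_or_none(payload: bytes):
    """Return the schema ID if the payload has the expected framing, else None."""
    if len(payload) < 5 or payload[0] != MAGIC_BYTE:
        return None
    return struct.unpack(">I", payload[1:5])[0]


# Hypothetical payloads: one framed message (schema ID 42) and one plain JSON.
framed = b"\x00" + struct.pack(">I", 42) + b"avro-encoded-body"
plain_json = b'{"id": 1}'

print(schema_id_or_none(framed))
print(schema_id_or_none(plain_json))
```

A plain JSON message begins with `{` and not the zero byte, which is why the AvroConverter rejects it.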
Error Handling and DLQ
Handled
  Convert
    -> read/write from Kafka
    -> [de]-serialisation
  Transform
Not Handled
  Start
    -> Connections to a data store
  Poll / Put
    -> Read/Write from/to data store*
* can be retried by Connect
https://cnfl.io/connect-dlq
Fail Fast
[Diagram: Source topic messages -> Kafka Connect -> Sink messages; a bad message stops the task]
YOLO ¯\_(ツ)_/¯
errors.tolerance=all
[Diagram: Source topic messages -> Kafka Connect -> Sink messages; bad messages are silently dropped]
Dead Letter Queue
errors.tolerance=all
errors.deadletterqueue.topic.name=my_dlq
[Diagram: Source topic messages -> Kafka Connect -> Sink messages; bad messages are routed to the dead letter queue]
https://cnfl.io/connect-dlq
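The three behaviours (fail fast, tolerate everything, dead letter queue) can be mimicked in a few lines of Python. This is a toy model of the decision logic only, not Connect's implementation; `run_sink`, its parameters, and the sample messages are assumptions, and `json.loads` stands in for the converter.

```python
import json


def run_sink(messages, deserialize, tolerance="none", dlq=None):
    """Deserialise messages; on failure either stop, skip, or route to the DLQ."""
    delivered = []
    for raw in messages:
        try:
            delivered.append(deserialize(raw))
        except ValueError:
            if tolerance != "all":
                raise  # fail fast: the task stops at the first bad message
            if dlq is not None:
                dlq.append(raw)  # keep the bad message for later re-processing
            # with tolerance "all" and no DLQ the message is silently dropped
    return delivered


messages = ['{"id": 1}', "not json", '{"id": 2}']

dead_letters = []
delivered = run_sink(messages, json.loads, tolerance="all", dlq=dead_letters)
print(delivered)
print(dead_letters)
```

Messages collected in the dead letter queue can then be read by a second sink that uses a converter matching their actual format.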
Re-processing the Dead Letter Queue
[Diagram: Source topic messages -> Kafka Connect (Avro sink) -> Sink messages; failed messages go to the dead letter queue, which Kafka Connect (JSON sink) reads and writes to the sink]
Metrics and Monitoring
REST API
http://go.rmoff.net/connector-status
JMX
http://cnfl.io/book-bundle
KS19Meetup.
CONFLUENT COMMUNITY DISCOUNT CODE
25% OFF*
*Standard Priced Conference pass
💬 Confluent Community Slack group
http://cnfl.io/slack
#EOF
http://talks.rmoff.net/
@rmoff


From Zero to Hero with Kafka Connect

  • 1. From Zero to Hero with Kafka Connect @rmoff A practical guide to becoming l33t with Kafka Connect a.k.a.
  • 2. @rmoff From Zero to Hero with Kafka Connect What is Kafka Connect?
  • 3. From Zero to Hero with Kafka Connect @rmoff Sources Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect syslog
  • 4. From Zero to Hero with Kafka Connect @rmoff Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect Amazon S3 Google BigQuery Sinks
  • 5. From Zero to Hero with Kafka Connect @rmoff Streaming Integration with Kafka Connect Kafka Brokers Kafka Connect syslog Amazon S3 Google BigQuery
  • 6. From Zero to Hero with Kafka Connect @rmoff { "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector", "connection.url": "jdbc:mysql://asgard:3306/demo", "table.whitelist": "sales,orders,customers" } https://docs.confluent.io/current/connect/ Look Ma, No Code!
  • 7. From Zero to Hero with Kafka Connect @rmoff Streaming Pipelines RDBMS Kafka Connect Kafka Connect Amazon S3 HDFS
  • 8. From Zero to Hero with Kafka Connect @rmoff KafkaConnect Writing to data stores from Kafka App Data Store
  • 9. From Zero to Hero with Kafka Connect @rmoff Evolve processing from old systems to new RDBMS Existing App New App <x> Kafka Connect
  • 10. @rmoff From Zero to Hero with Kafka Connect Demo http:!//rmoff.dev/kafka-connect-code
  • 11. @rmoff From Zero to Hero with Kafka Connect Configuring Kafka Connect Inside the API - connectors, transforms, converters
  • 12. From Zero to Hero with Kafka Connect @rmoff Kafka Connect basics KafkaKafka ConnectSource
  • 13. From Zero to Hero with Kafka Connect @rmoff Connectors KafkaKafka ConnectSource Connector
  • 14. From Zero to Hero with Kafka Connect @rmoff Connectors "config": { [...] "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector", "connection.url": "jdbc:postgresql://postgres:5432/", "topics": "asgard.demo.orders", }
  • 15. Connectors Connect Record Native data Connector Kafka Kafka Connect Source
  • 16. Converters Connect Record Native data bytes[] Kafka Kafka Connect Source Connector Converter
  • 17. Serialisation & Schemas -> Confluent Schema Registry Avro Protobuf JSON CSV https://qconnewyork.com/system/files/presentation-slides/qcon_17_-_schemas_and_apis.pdf
  • 18. The Confluent Schema Registry Source Avro Message Target Schema Registry Avro Schema Kafka Connect Kafka Connect Avro Message
  • 19. Converters
      key.converter=io.confluent.connect.avro.AvroConverter
      key.converter.schema.registry.url=http://localhost:8081
      value.converter=io.confluent.connect.avro.AvroConverter
      value.converter.schema.registry.url=http://localhost:8081
      Set as a global default per worker; can optionally be overridden per connector.
  • 20. What about internal converters?
      value.converter=org.apache.kafka.connect.json.JsonConverter
      internal.value.converter=org.apache.kafka.connect.json.JsonConverter
      key.internal.value.converter=org.apache.kafka.connect.json.JsonConverter
      value.internal.value.converter=org.apache.kafka.connect.json.JsonConverter
      key.internal.value.converter.bork.bork.bork=org.apache.kafka.connect.json.JsonConverter
      key.internal.value.please.just.work.converter=org.apache.kafka.connect.json.JsonConverter
  • 21. Single Message Transforms Kafka Kafka Connect Source Connector Transform(s) Converter
  • 22. Single Message Transforms
      "config": {
        [...]
        "transforms": "addDateToTopic,labelFooBar",
        "transforms.addDateToTopic.type": "org.apache.kafka.connect.transforms.TimestampRouter",
        "transforms.addDateToTopic.topic.format": "${topic}-${timestamp}",
        "transforms.addDateToTopic.timestamp.format": "YYYYMM",
        "transforms.labelFooBar.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
        "transforms.labelFooBar.renames": "delivery_address:shipping_address"
      }
      Slide annotations: "Do these transforms" (the "transforms" list), "Transforms config", "Config per transform" (the "transforms.<name>.*" keys).
  • 23. Extensible Connector Transform(s) Converter
  • 24. Confluent Hub hub.confluent.io
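The connector, converter, and transform settings from slides 14, 19, and 22 all end up in one flat connector configuration. The following is a minimal Python sketch of that idea, not something taken from the talk: the helper names `build_sink_config` and `missing_transform_types` are hypothetical, and it assumes the configuration is assembled as a dict before being sent to Kafka Connect. The property keys and class names are the ones shown on the slides.

```python
import json


def build_sink_config(topics, transforms):
    """Assemble a flat connector config with converters and an SMT chain.

    `transforms` is a list of (alias, settings) pairs; the order of the
    list becomes the order of the aliases in the "transforms" property.
    """
    config = {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "connection.url": "jdbc:postgresql://postgres:5432/",
        "topics": topics,
        # Per-connector override of the worker-level converter default.
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://localhost:8081",
        "transforms": ",".join(alias for alias, _ in transforms),
    }
    for alias, settings in transforms:
        for key, value in settings.items():
            config[f"transforms.{alias}.{key}"] = value
    return config


def missing_transform_types(config):
    """Return aliases listed in "transforms" that have no ".type" entry."""
    aliases = [a.strip() for a in config.get("transforms", "").split(",") if a.strip()]
    return [a for a in aliases if f"transforms.{a}.type" not in config]


if __name__ == "__main__":
    smts = [
        ("addDateToTopic", {
            "type": "org.apache.kafka.connect.transforms.TimestampRouter",
            "topic.format": "${topic}-${timestamp}",
            "timestamp.format": "YYYYMM",
        }),
        ("labelFooBar", {
            "type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
            "renames": "delivery_address:shipping_address",
        }),
    ]
    config = build_sink_config("asgard.demo.orders", smts)
    print(json.dumps(config, indent=2))
    print("aliases without a type:", missing_transform_types(config))
```

Keeping each transform's settings under its own alias mirrors the slide's "Config per transform" annotation, and the small check catches an alias that is listed but never configured.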
  • 25. Deploying Kafka Connect: Connectors, Tasks, and Workers
  • 26. Connectors and Tasks JDBC Source S3 Sink JDBC Task #2 JDBC Task #1 S3 Task #1
  • 27. Connectors and Tasks JDBC Source S3 Sink JDBC Task #2 JDBC Task #1 S3 Task #1
  • 28. Connectors and Tasks JDBC Source S3 Sink JDBC Task #2 JDBC Task #1 S3 Task #1
  • 29. Tasks and Workers JDBC Source S3 Sink JDBC Task #2 JDBC Task #1 S3 Task #1 Worker
  • 30. Kafka Connect Standalone Worker JDBC Task #2 JDBC Task #1 S3 Task #1 Worker Offsets
  • 31. "Scaling" the Standalone Worker JDBC Task #2 JDBC Task #1 S3 Task #1 Worker Offsets Offsets Worker Fault-tolerant? Nope.
  • 32. JDBC Task #2 Kafka Connect Distributed Worker JDBC Task #1 JDBC Task #2 S3 Task #1 Offsets Config Status Fault-tolerant? Yeah! Worker Kafka Connect cluster
  • 33. Scaling the Distributed Worker JDBC Task #1 JDBC Task #2 S3 Task #1 Offsets Config Status Fault-tolerant? Yeah! Worker Worker Kafka Connect cluster
  • 34. Distributed Worker - fault tolerance JDBC Task #1 S3 Task #1 Offsets Config Status Worker Worker Kafka Connect cluster
  • 35. Distributed Worker - fault tolerance JDBC Task #1 S3 Task #1 Offsets Config Status Worker Kafka Connect cluster JDBC Task #2
  • 36. Multiple Distributed Clusters JDBC Task #1 S3 Task #1 Offsets Config Status Kafka Connect cluster #1 JDBC Task #2 Kafka Connect cluster #2 Offsets Config Status
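The "Offsets / Config / Status" boxes on the distributed-worker slides correspond to Kafka topics that each Connect cluster keeps for itself. A minimal Python sketch of what "multiple distributed clusters" implies for worker settings follows. The property names (`group.id`, `offset.storage.topic`, `config.storage.topic`, `status.storage.topic`) are standard Kafka Connect worker properties rather than anything shown on the slides, and the topic naming scheme and helper names are assumptions made for illustration.

```python
INTERNAL_KEYS = (
    "group.id",
    "offset.storage.topic",
    "config.storage.topic",
    "status.storage.topic",
)


def worker_properties(cluster_name, bootstrap_servers):
    """Build distributed-worker properties for one Connect cluster.

    Workers that share a group.id join the same Connect cluster; the topic
    names derived from the cluster name are a hypothetical convention.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": cluster_name,
        "offset.storage.topic": f"_{cluster_name}-offsets",
        "config.storage.topic": f"_{cluster_name}-configs",
        "status.storage.topic": f"_{cluster_name}-status",
    }


def shared_settings(first, second):
    """Return the internal settings two worker configs have in common."""
    return [key for key in INTERNAL_KEYS if first[key] == second[key]]


def render(props):
    """Render a dict as lines of a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    cluster_1 = worker_properties("connect-cluster-1", "kafka:9092")
    cluster_2 = worker_properties("connect-cluster-2", "kafka:9092")
    print(render(cluster_1))
    # An empty list means the two clusters are isolated from each other.
    print("shared:", shared_settings(cluster_1, cluster_2))
```

Two workers started with identical output from `worker_properties` would scale one cluster; two workers with different cluster names would form separate clusters on the same Kafka brokers.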
  • 37. Containers
  • 38. Kafka Connect images on Docker Hub confluentinc/cp-kafka-connect-base kafka-connect-elasticsearch kafka-connect-jdbc kafka-connect-hdfs […] confluentinc/cp-kafka-connect
  • 39. Adding connectors to a container confluentinc/cp-kafka-connect-base JAR Confluent Hub
  • 40. At runtime JAR confluentinc/cp-kafka-connect-base
      kafka-connect:
        image: confluentinc/cp-kafka-connect:5.2.1
        environment:
          CONNECT_PLUGIN_PATH: '/usr/share/java,/usr/share/confluent-hub-components'
        command:
          - bash
          - -c
          - |
            confluent-hub install --no-prompt neo4j/kafka-connect-neo4j:1.0.0
            /etc/confluent/docker/run
      http://rmoff.dev/ksln19-connect-docker
  • 41. Build a new image JAR confluentinc/cp-kafka-connect-base
      FROM confluentinc/cp-kafka-connect:5.2.1
      ENV CONNECT_PLUGIN_PATH="/usr/share/java,/usr/share/confluent-hub-components"
      RUN confluent-hub install --no-prompt neo4j/kafka-connect-neo4j:1.0.0
  • 42. Automating connector creation
      #
      # Download JDBC drivers
      cd /usr/share/java/kafka-connect-jdbc/
      curl https://cdn.mysql.com/Downloads/Connector-J/mysql-connector-java-8.0.13.tar.gz | tar xz
      #
      # Now launch Kafka Connect
      /etc/confluent/docker/run &
      #
      # Wait for Kafka Connect listener
      while [ $$(curl -s -o /dev/null -w %{http_code} http://$$CONNECT_REST_ADVERTISED_HOST_NAME:$…
        echo -e $$(date) " Kafka Connect listener HTTP state: " $$(curl -s -o /dev/null -w %{http_…
        sleep 5
      done
      #
      # Create JDBC Source connector
      curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d '{
        "name": "jdbc_source_mysql_00",
        "config": {
          "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
          "connection.url": "jdbc:mysql://mysql:3306/demo",
          "connection.user": "connect_user",
          "connection.password": "asgard",
          "topic.prefix": "mysql-00-",
          "table.whitelist" : "demo.customers"
        }
      }'
      # Don't let the container die
      sleep infinity
      http://rmoff.dev/ksln19-connect-docker
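The wait-then-create pattern from slide 42 can also be sketched in Python using only the standard library. This is an illustrative sketch, not the talk's script: the slide's wait loop is truncated, so probing `GET /connectors` for HTTP 200 is an assumption, and the helper names (`listener_status`, `wait_until_ready`, `create_connector_request`) are hypothetical. The connector name and config values are the ones shown on the slide.

```python
import json
import time
import urllib.error
import urllib.request

CONNECT_URL = "http://localhost:8083"


def listener_status(base_url=CONNECT_URL):
    """Return the HTTP status of GET /connectors, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/connectors", timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_ready(probe=listener_status, attempts=60, delay=5, sleep=time.sleep):
    """Poll until the probe reports HTTP 200; give up after `attempts` tries."""
    for _ in range(attempts):
        if probe() == 200:
            return True
        sleep(delay)
    return False


def create_connector_request(name, config, base_url=CONNECT_URL):
    """Build the POST /connectors request that creates a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    config = {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:mysql://mysql:3306/demo",
        "connection.user": "connect_user",
        "connection.password": "asgard",
        "topic.prefix": "mysql-00-",
        "table.whitelist": "demo.customers",
    }
    if wait_until_ready():
        request = create_connector_request("jdbc_source_mysql_00", config)
        with urllib.request.urlopen(request) as response:
            print("create connector:", response.status)
    else:
        print("Kafka Connect listener did not become ready")
```

Serialising the config with `json.dumps` avoids hand-written JSON mistakes such as a trailing comma, and the injectable `probe` and `sleep` arguments let the wait loop be exercised without a running worker.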
  • 43. Troubleshooting Kafka Connect
  • 44. Troubleshooting Kafka Connect
      $ curl -s "http://localhost:8083/connectors/source-debezium-orders/status" | jq '.connector.state'
      "RUNNING"
      $ curl -s "http://localhost:8083/connectors/source-debezium-orders/status" | jq '.tasks[0].state'
      "FAILED"
      http://go.rmoff.net/connector-status
      Connector: RUNNING. Task: FAILED.
  • 45. Troubleshooting Kafka Connect
      curl -s "http://localhost:8083/connectors/source-debezium-orders-00/status" | jq '.tasks[0].trace'
      "org.apache.kafka.connect.errors.ConnectException\n\tat io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)\n\tat io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:197)\n\tat io.debezium.connector.mysql.BinlogReader$ReaderThreadLifecycleListener.onCommunicationFailure(BinlogReader.java:1018)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:950)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: java.io.EOFException\n\tat com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:190)\n\tat com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readInteger(ByteArrayInputStream.java:46)\n\tat com.github.shyiko.mysql.binlog.event.deserialization.EventHeaderV4Deserializer.deserialize(EventHeaderV4Deserializer.java:35)\n\tat com.github.shyiko.mysql.binlog.event.deserialization.EventHeaderV4Deserializer.deserialize(EventHeaderV4Deserializer.java:27)\n\tat com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:212)\n\tat io.debezium.connector.mysql.BinlogReader$1.nextEvent(BinlogReader.java:224)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:922)\n\t... 3 more\n"
  • 46. The log is the source of truth
      $ confluent log connect
      $ docker-compose logs kafka-connect
      $ cat /var/log/kafka/connect.log
  • 47. Kafka Connect: Symptom not Cause. "Task is being killed and will not recover until manually restarted"
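The status checks on slides 44 and 45 pick out `.connector.state`, `.tasks[0].state`, and `.tasks[0].trace` with jq. The same inspection can be sketched in Python over an already-fetched status document. This is a hedged illustration: the helper names are hypothetical, the sample payload is hand-written in the shape the status endpoint returns (with a shortened trace), and `POST /connectors/{name}/tasks/{id}/restart` is the Kafka Connect REST path for restarting a single task.

```python
def failed_tasks(status):
    """Return (task id, first line of trace) for every FAILED task."""
    failures = []
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            trace = task.get("trace", "")
            first_line = trace.splitlines()[0] if trace else ""
            failures.append((task["id"], first_line))
    return failures


def restart_path(connector, task_id):
    """REST path to POST to in order to restart one task."""
    return f"/connectors/{connector}/tasks/{task_id}/restart"


if __name__ == "__main__":
    # Hand-written sample: the connector reports RUNNING while its task has failed.
    sample = {
        "name": "source-debezium-orders",
        "connector": {"state": "RUNNING"},
        "tasks": [
            {
                "id": 0,
                "state": "FAILED",
                "trace": "org.apache.kafka.connect.errors.ConnectException\n"
                         "\tat io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)\n",
            }
        ],
    }
    print("connector:", sample["connector"]["state"])
    for task_id, reason in failed_tasks(sample):
        print(f"task {task_id} FAILED: {reason}")
        print("restart with POST", restart_path(sample["name"], task_id))
```

The first line of the trace is only the symptom; as slide 46 says, the worker log is where the underlying cause is usually found.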
  • 48. Error Handling and Dead Letter Queues
  • 49. org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
  • 50. Mismatched converters "value.converter": "AvroConverter" org.apache.kafka.common.errors.SerializationException: Unknown magic byte! Use the correct Converter for the source data ⓘ Messages are not Avro
  • 51. Mixed serialisation methods "value.converter": "AvroConverter" org.apache.kafka.common.errors.SerializationException: Unknown magic byte! Some messages are not Avro Use error handling to deal with bad messages ⓘ
  • 52. Error Handling and DLQ
      Handled: Convert (read/write from Kafka, [de]-serialisation); Transform
      Not handled: Start (connections to a data store); Poll / Put (read/write from/to data store)*
      * can be retried by Connect
      https://cnfl.io/connect-dlq
  • 53. Fail Fast Kafka Connect Source topic messages Sink messages https://cnfl.io/connect-dlq
  • 54. YOLO ¯\_(ツ)_/¯ Kafka Connect Source topic messages Sink messages errors.tolerance=all https://cnfl.io/connect-dlq
  • 55. Dead Letter Queue Kafka Connect Source topic messages Sink messages Dead letter queue errors.tolerance=all errors.deadletterqueue.topic.name=my_dlq https://cnfl.io/connect-dlq
  • 56. Re-processing the Dead Letter Queue Source topic messages Sink messages Dead letter queue Kafka Connect (Avro sink) Kafka Connect (JSON sink) https://cnfl.io/connect-dlq
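The three behaviours on slides 53 to 55 (fail fast, tolerate everything, dead letter queue) can be illustrated with a small Python simulation. This is a toy model and not how Kafka Connect is implemented: it only checks the Confluent wire-format framing (a zero "magic byte" followed by a 4-byte big-endian schema ID) that the Avro converter expects, and the function names `schema_id` and `route` are hypothetical.

```python
import struct

MAGIC_BYTE = 0


def schema_id(message):
    """Return the schema ID if the bytes carry wire-format framing, else None."""
    if len(message) < 5 or message[0] != MAGIC_BYTE:
        return None
    return struct.unpack(">I", message[1:5])[0]


def route(messages, tolerance="none", dlq=None):
    """Simulate a sink reading a topic with mixed serialisation.

    tolerance="none"            -> fail fast on the first bad message
    tolerance="all"             -> silently skip bad messages
    tolerance="all" with a dlq  -> append bad messages to the dlq list
    """
    delivered = []
    for message in messages:
        if schema_id(message) is not None:
            delivered.append(message)
        elif tolerance == "all":
            if dlq is not None:
                dlq.append(message)
        else:
            raise ValueError("Unknown magic byte!")
    return delivered


if __name__ == "__main__":
    avro_like = b"\x00" + struct.pack(">I", 42) + b"payload"
    json_text = b'{"id": 1}'
    dead_letters = []
    sent = route([avro_like, json_text], tolerance="all", dlq=dead_letters)
    print("delivered:", len(sent), "dead-lettered:", len(dead_letters))
```

The messages collected in `dead_letters` correspond to what slide 56 re-processes with a second connector that uses a converter matching their actual format.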
  • 57. Metrics and Monitoring
  • 58. REST API http://go.rmoff.net/connector-status
  • 59. JMX
  • 60. http://cnfl.io/book-bundle
  • 61. KS19Meetup. CONFLUENT COMMUNITY DISCOUNT CODE 25% OFF* *Standard Priced Conference pass
  • 62. 💬 Confluent Community Slack group http://cnfl.io/slack