Kafka 101 & Developer Best Practices
Agenda
● Kafka Overview
● Kafka 101
● Best Practices for Writing to Kafka: A Tour of the Producer
● Best Practices for Reading from Kafka: The Consumer
● General Considerations
ETL/Data Integration: Batch, Expensive, Time Consuming (stored records)
Messaging: Difficult to Scale, No Persistence, Data Loss, No Replay (transient messages)

Both of these are a complete mismatch to how your business works.

Event Streaming Paradigm
High Throughput, Durable, Persistent, Maintains Order, Fast (Low Latency)

To rethink data as not stored records or transient messages, but instead as
a continually updating stream of events
Diagram: Confluent as the Central Nervous System For Enterprise. A Universal Event
Pipeline connects data stores, logs, device logs, 3rd party apps, SaaS apps, Amazon S3,
and custom apps/microservices to event-streaming applications: real-time Customer 360,
financial fraud detection, real-time risk analytics, real-time payments, and machine
learning models.
Confluent uniquely enables Event Streaming success

● Confluent founders are the original creators of Kafka
● The Confluent team wrote 80% of Kafka commits and has over 1M hours of technical
experience with Kafka
● Confluent helps enterprises successfully deploy event streaming at scale and
accelerate time to market
● Confluent Platform extends Apache Kafka to be a secure, enterprise-ready platform

Hall of Innovation CTO Innovation Award Winner 2019, Enterprise Technology Innovation
Awards
Kafka 101
Scalability of a Filesystem
Guarantees of a Database
Distributed by Design
Rewind and Replay
KAFKA: A MODERN, DISTRIBUTED PLATFORM FOR DATA STREAMS

Diagram: a Kafka cluster of brokers sits between writers and readers. Kafka producers
write to the cluster; Kafka consumers (standalone, grouped, or acting as both consumer
and producer) read from it.
Kafka Topics

Diagram: the topic my-topic is split into my-topic-partition-0, my-topic-partition-1,
and my-topic-partition-2, spread across broker-1, broker-2, and broker-3.
Creating a topic

$ kafka-topics --zookeeper zk:2181 \
    --create \
    --topic my-topic \
    --replication-factor 3 \
    --partitions 3

(On newer Kafka versions, --bootstrap-server broker:9092 replaces --zookeeper.)
Or use the AdminClient API!
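For example, a minimal AdminClient sketch that creates the same topic (the bootstrap
address is illustrative):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");

try (AdminClient admin = AdminClient.create(props)) {
    // 3 partitions, replication factor 3: mirrors the CLI example above
    NewTopic topic = new NewTopic("my-topic", 3, (short) 3);
    admin.createTopics(Collections.singletonList(topic)).all().get();
}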
Producing to Kafka

Diagram: records are appended to the end of the partition as time advances, and
consumers (C) each read from their own position in the log.
Kafka’s distributed nature

Diagram: Topic-1 has four partitions spread across Brokers 1-4. Each partition is
replicated three times, with one replica acting as leader and the others as followers,
and leadership is distributed across the brokers.
Producing to Kafka
Clients - Producer Design

Diagram: a Producer Record (Topic, [Partition], [Key], Value) passes through the
Serializer and the Partitioner, is collected into per-partition batches (e.g. Topic A
Partition 0, Topic B Partition 1), and is sent to the Kafka broker. When a send fails,
the producer retries if it can; once it can’t retry, it throws an exception. On
success, send() returns the record metadata.
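That flow in code, as a minimal sketch (topic, key, and broker names are illustrative);
send() is asynchronous, and the callback receives either the record metadata on success
or the exception once retries are exhausted:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
ProducerRecord<String, String> record = new ProducerRecord<>("my-topic", "my-key", "my-value");

producer.send(record, (metadata, exception) -> {
    if (exception != null) {
        // retries exhausted, or the error is not retriable
        exception.printStackTrace();
    } else {
        System.out.printf("Written to %s-%d @ offset %d%n",
                metadata.topic(), metadata.partition(), metadata.offset());
    }
});
producer.flush();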
The Serializer

Kafka doesn’t care what you send it, as long as it has been converted to a byte stream
beforehand. Serializers turn JSON, CSV, Avro, Protobuf, or XML (if you must) into
bytes.

Reference
https://kafka.apache.org/10/documentation/streams/developer-guide/datatypes.html
The Serializer

private Properties kafkaProps = new Properties();
kafkaProps.put("bootstrap.servers", "broker1:9092,broker2:9092");
kafkaProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
kafkaProps.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
kafkaProps.put("schema.registry.url", "https://schema-registry:8083");
producer = new KafkaProducer<String, SpecificRecord>(kafkaProps);

Reference
https://kafka.apache.org/10/documentation/streams/developer-guide/datatypes.html
Record Keys and why they’re important - Ordering

The record key determines the partition with the default Kafka partitioner, so all
records with the same key go to the same partition. If a key isn’t provided, messages
are distributed across partitions in a round-robin fashion.

Keys are used in the default partitioning algorithm:
partition = hash(key) % numPartitions
As producer records with keys AAA, BBB, CCC, DDD, and so on arrive, the partitioner
hashes each key to pick the partition, as sketched below.
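A sketch of what the default partitioner does for keyed records; this mirrors the Java
client’s DefaultPartitioner, which uses a murmur2 hash rather than a plain hashCode:

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

// Roughly what the default partitioner does when a record has a key:
static int partitionFor(String key, int numPartitions) {
    byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
    // murmur2 hash, masked to a non-negative value, modulo the partition count
    return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}

// partitionFor("AAA", 3), partitionFor("BBB", 3), ... : the same key always
// lands on the same partition, which is what preserves per-key ordering.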
Record Keys and why they’re important - Key Cardinality

Car·di·nal·i·ty /ˌkĂ€rdəˈnalədē/
Noun: the number of elements in a set or other grouping, as a property of that
grouping.

Key cardinality affects the amount of work done by the individual consumers in a
group; poor key choice can lead to uneven workloads. Keys in Kafka don’t have to be
primitives, like strings or ints. Like values, they can be anything: JSON, Avro, etc.
So create a key that will evenly distribute groups of records around the partitions,
for example as sketched below.
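For instance, if a single hot key overloads one partition, a compound key raises
cardinality and spreads the load, at the cost of ordering across the combined records
(the field accessors here are hypothetical):

// Illustrative only: ordering is now guaranteed per (customerId, deviceId)
// pair rather than per customer.
String key = event.customerId() + "-" + event.deviceId();
producer.send(new ProducerRecord<>("clicks", key, payload));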
You don’t have to, but... use a Schema!

A Data Producer Service sends JSON like:

{
  "Name": "John Smith",
  "Address": "123 Apple St.",
  "Zip": "19101"
}

A Data Consumer Service that expects the fuller record

{
  "Name": "John Smith",
  "Address": "123 Apple St.",
  "City": "Philadelphia",
  "State": "PA",
  "Zip": "19101"
}

is left asking: "Where’s record.City?"

Reference
https://www.confluent.io/blog/schema-registry-kafka-stream-processing-yes-virginia-you-really-need-one/
Schema Registry: Make Data Backwards Compatible and Future-Proof

● Define the expected fields for each Kafka topic
● Automatically handle schema changes (e.g. new fields)
● Prevent backwards-incompatible changes
● Support multi-data-center environments

Diagram: App 1 and App 2 each run their serializer against the Schema Registry before
writing to the Kafka topic; example consumers downstream include Elastic, Cassandra,
and HDFS. Schema Registry is an open source feature.
Avro allows for evolution of schemas

The Schema Registry holds Version 1 of the schema (Name, Address, Zip) and Version 2
(which adds City and State). The Data Producer Service can send an AvroRecord such as:

{
  "Name": "John Smith",
  "Address": "123 Apple St.",
  "City": "Philadelphia",
  "State": "PA",
  "Zip": "19101"
}

and a Data Consumer Service on the other version still reads it; fields missing from
the writer’s version resolve to their declared defaults:

{
  "Name": "John Smith",
  "Address": "123 Apple St.",
  "Zip": "19101",
  "City": "NA",
  "State": "NA"
}

Reference
https://www.confluent.io/blog/schema-registry-kafka-stream-processing-yes-virginia-you-really-need-one/
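The evolution works because the added fields declare defaults. A sketch of what the
Version 2 Avro schema might look like (the record name and field types are
assumptions):

{
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "Name",    "type": "string"},
    {"name": "Address", "type": "string"},
    {"name": "Zip",     "type": "string"},
    {"name": "City",    "type": "string", "default": "NA"},
    {"name": "State",   "type": "string", "default": "NA"}
  ]
}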
Developing with Confluent Schema Registry

Confluent provides a Maven plugin for developing with the Confluent Schema Registry,
with several goals:

● download - download a subject’s schema to your project
● register - register a new schema to the schema registry from your development env
● test-compatibility - test changes made to a schema against compatibility rules set
by the schema registry

<plugin>
    <groupId>io.confluent</groupId>
    <artifactId>kafka-schema-registry-maven-plugin</artifactId>
    <version>5.0.0</version>
    <configuration>
        <schemaRegistryUrls>
            <param>http://192.168.99.100:8081</param>
        </schemaRegistryUrls>
        <outputDirectory>src/main/avro</outputDirectory>
        <subjectPatterns>
            <param>^TestSubject000-(key|value)$</param>
        </subjectPatterns>
    </configuration>
</plugin>

Reference
https://docs.confluent.io/current/schema-registry/docs/maven-plugin.html
Use Kafka’s Headers

A Producer Record carries: Topic, [Partition], [Timestamp], [Key], Value, and [Headers].

Kafka headers are a simple interface: each header has a key of type String and a value
of type byte[], and the headers are stored as an iterable collection on the
ProducerRecord.

Example Use Cases
● Data lineage: reference previous topic partition/offsets
● Producing host/application/owner
● Message routing
● Encryption metadata (which key pair was this message payload encrypted with?)

Reference
https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers
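A minimal sketch of writing and reading headers (the header names and values are
illustrative):

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.header.Header;

// Producer side: attach headers to a record before sending
ProducerRecord<String, String> record = new ProducerRecord<>("my-topic", "key", "value");
record.headers().add("source-host", "app-01.example.com".getBytes(StandardCharsets.UTF_8));
record.headers().add("trace-id", traceId.getBytes(StandardCharsets.UTF_8));

// Consumer side: read them back from the ConsumerRecord
for (Header header : consumerRecord.headers()) {
    System.out.println(header.key() + " = "
            + new String(header.value(), StandardCharsets.UTF_8));
}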
Producer Guarantees

acks=0: the producer (P) writes to the leader for Topic1 partition1 and does not wait
for any acknowledgement. Fastest, but a failed write goes unnoticed.

Producer Properties
acks=0

Reference
https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
Producer Guarantees

acks=1: the leader sends the ack as soon as the record is written locally, before it
has been replicated to the followers.

Producer Properties
acks=1

Reference
https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
Producer Guarantees

acks=all with min.insync.replicas=2: the leader sends the ack only after the record
has been replicated to the in-sync followers, so an acknowledged write survives the
loss of the leader.

Producer Properties
acks=all
min.insync.replicas=2
Producer Guarantees - without exactly-once guarantees

The write itself succeeds ({key: 1234, data: abcd} lands at offset 3345), but the ack
back to the producer fails.

Producer Properties
acks=all
min.insync.replicas=2

Reference
https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
The producer never saw the ack, so it retries, and the broker appends the same record
a second time:

{key: 1234, data: abcd} - offset 3345
{key: 1234, data: abcd} - offset 3346 (dupe!)

Reference
https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
Producer Guarantees - with exactly-once guarantees

With the idempotent producer, each record carries a producer id and sequence number
(pid, seq); on a retry the broker rejects the duplicate and simply re-sends the ack:

(100, 1) {key: 1234, data: abcd} - offset 3345
(100, 1) {key: 1234, data: abcd} - rejected, ack re-sent (no dupe!)
(100, 2) {key: 5678, data: efgh} - offset 3346

Producer Properties
enable.idempotence=true
max.in.flight.requests.per.connection=5
acks=all
retries > 0 (preferably MAX_INT)

Reference
https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
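As a minimal sketch in code (the broker address is illustrative; recent Java clients
apply most of these settings automatically once idempotence is enabled):

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("enable.idempotence", "true");  // broker de-duplicates on (pid, seq)
props.put("acks", "all");                 // required for idempotence
props.put("retries", Integer.toString(Integer.MAX_VALUE));
props.put("max.in.flight.requests.per.connection", "5");  // must be 5 or fewer
KafkaProducer<String, String> producer = new KafkaProducer<>(props);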
Transactional Producer

Producer

KafkaProducer producer = createKafkaProducer(
    "bootstrap.servers", "broker:9092",
    "transactional.id", "my-transactional-id");

producer.initTransactions();
producer.beginTransaction();
-- send some records --
producer.commitTransaction();

Consumer

KafkaConsumer consumer = createKafkaConsumer(
    "bootstrap.servers", "broker:9092",
    "group.id", "my-group-id",
    "isolation.level", "read_committed");

Reference
https://www.confluent.io/blog/transactions-apache-kafka/
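A fuller sketch of the transactional send loop with error handling, following the
pattern in the KafkaProducer javadoc (batch is an assumed collection of records):

import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;

producer.initTransactions();
try {
    producer.beginTransaction();
    for (ProducerRecord<String, String> record : batch) {
        producer.send(record);
    }
    producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    // fatal: another producer with the same transactional.id took over, or we
    // cannot recover, so close and create a new producer instance
    producer.close();
} catch (KafkaException e) {
    // transient: abort and retry the whole transaction
    producer.abortTransaction();
}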
Consuming from Kafka
A basic Java Consumer

final Consumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList(topic));

try {
    while (true) {
        // poll(Duration) supersedes the deprecated poll(long) seen in older examples
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        for (ConsumerRecord<String, String> record : records) {
            -- Do Some Work --
        }
    }
} finally {
    consumer.close();
}
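A common consumer best practice is to disable auto-commit and commit offsets only
after the work is done; a sketch building on the consumer above (process() is a
hypothetical handler):

props.put("enable.auto.commit", "false");  // set before creating the consumer

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        process(record);  // hypothetical: do the actual work first
    }
    // commit the offsets from the last poll only after processing succeeds,
    // giving at-least-once delivery
    consumer.commitSync();
}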
Consuming From Kafka - Single Consumer

Diagram: a single consumer (C) reads from every partition of the topic.
Consuming From Kafka - Grouped Consumers

Diagram: two consumer groups (C1 and C2) each receive every record; within a group,
the topic’s partitions are divided among the members. With four partitions (0-3) and
four group members, each consumer owns one partition. When a member leaves, its
partitions are reassigned to the survivors, e.g. the consumer that owned partition 0
also takes over partition 3.
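When that reassignment happens, the application can react through a
ConsumerRebalanceListener, as in this minimal sketch built on the consumer from the
earlier example:

import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

consumer.subscribe(Arrays.asList(topic), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // about to lose these partitions: commit progress before they move
        consumer.commitSync();
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // new partitions arrived (e.g. picking up partition 3 above):
        // restore any state, then carry on polling
    }
});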
Kafka’s Interceptors

ProducerInterceptor

onSend(ProducerRecord<K, V> record)
Returns ProducerRecord<K, V>. Called from send() before the key and value are
serialized and the partition is assigned. This method is allowed to modify the record.

onAcknowledgement(RecordMetadata metadata, java.lang.Exception exception)
Called when the record sent to the server has been acknowledged, or when sending the
record fails before it gets sent to the server. Used for observability and reporting.

ConsumerInterceptor

onConsume(ConsumerRecords<K, V> records)
Called just before the records are returned by KafkaConsumer.poll(). This method is
allowed to modify the consumer records, in which case the new records are returned.

onCommit(Map<TopicPartition, OffsetAndMetadata> offsets)
Called when offsets get committed. Used for observability and reporting.

References
https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/ProducerInterceptor.html
https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/consumer/ConsumerInterceptor.html
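A minimal ProducerInterceptor sketch that stamps a header on every outgoing record
(the header name and value are illustrative); it would be enabled through the
producer’s interceptor.classes property:

import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class StampInterceptor implements ProducerInterceptor<String, String> {
    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // runs inside send(), before serialization and partition assignment
        record.headers().add("produced-by",
                "orders-service".getBytes(StandardCharsets.UTF_8));
        return record;
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        // observability hook: count successes and failures here
    }

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}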
Should I pool connections?

NO! Kafka clients are long-lived, so there is no reason to pool them. The producer is
thread-safe and is typically shared across an application; the consumer is not, so
it’s common to keep one consumer per thread.
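In practice that means one long-lived producer shared by the whole application and one
consumer per consuming thread; a sketch (producerProps()/consumerProps() are
hypothetical config helpers):

// The producer is thread-safe: create one long-lived instance and share it.
static final KafkaProducer<String, String> PRODUCER =
        new KafkaProducer<>(producerProps());

// The consumer is not thread-safe: give each consuming thread its own instance.
Runnable worker = () -> {
    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps())) {
        consumer.subscribe(Arrays.asList("my-topic"));
        while (true) {
            consumer.poll(Duration.ofMillis(100)).forEach(r -> process(r));
        }
    }
};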
Use a good client!

Clients
● Java/Scala - default clients, come with Kafka
● C/C++ - https://github.com/edenhill/librdkafka
● C#/.NET - https://github.com/confluentinc/confluent-kafka-dotnet
● Python - https://github.com/confluentinc/confluent-kafka-python
● Golang - https://github.com/confluentinc/confluent-kafka-go
● Node/JavaScript - https://github.com/Blizzard/node-rdkafka (not supported by Confluent!)

New Kafka features will only be available to modern, updated clients!
Resources

Free E-Books from Confluent!

I Heart Logs:
https://www.confluent.io/ebook/i-heart-logs-event-data-stream-processing-and-data-integration/
Kafka: The Definitive Guide:
https://www.confluent.io/resources/kafka-the-definitive-guide/
Designing Event Driven Systems:
https://www.confluent.io/designing-event-driven-systems/
Confluent Blog: https://www.confluent.io/blog

Thank you!
pascal@confluent.io