Kafka is an open-source distributed commit log offering low latency, high throughput, scalability, fault tolerance, and disk-based retention. It can be used to build tracking systems, messaging systems, high-performance streaming platforms, real-time analytics, audit logs… you name it. In our case, it's been used to build a scalable event store and messaging platform that stores billions of messages.
In this talk, we take a closer look at essential Kafka concepts such as partition rebalancing, offset management, replication, producer/broker request fetching, file segments, etc., to understand what makes Kafka so scalable, resilient, performant and fault tolerant. We will also touch upon Kafka transactions to see what they are and how to leverage them. Last but not least, we will highlight some potential pitfalls to watch out for when going to production with Kafka.
14. #DevoxxPL @nklmish#DevoxxBE @nklmish
Kafka - more than a message queue
• Supports both point-to-point & publish-subscribe (the consumer group generalises this concept).
15.
Each topic in Kafka has its own journal
[Diagram: a producer writing to Partition-0, Partition-1 and Partition-2 of a topic - supporting both point-to-point & publish-subscribe]
18.
Kafka - more than a message queue
• Supports both point-to-point & publish-subscribe (the consumer group generalises this concept).
• Highly scalable, available & durable.
• Kafka Connect interface - pull & push for data.
• Architecture inherits more from storage systems like HBase, Cassandra and HDFS than from a traditional messaging system.
• Stronger ordering guarantees than a traditional messaging system.
[1]: New Relic 15 million messages/sec
[2]: LinkedIn 1.1 Trillion messages/day
[3]: Netflix 2 Trillion messages/day at Peak
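The "consumer group generalises this concept" bullet can be made concrete: within one group, each partition goes to exactly one member (point-to-point/queue semantics), while every group independently receives all partitions (publish-subscribe). A minimal plain-Java sketch of that assignment idea (not Kafka's real assignor):

```java
import java.util.*;

// Sketch: how consumer groups generalise point-to-point and publish-subscribe.
// Partitions are spread across the members of ONE group (queue semantics),
// while EVERY group independently gets all partitions (pub-sub semantics).
public class GroupAssignment {
    // Round-robin the partitions over the members of a single group.
    public static Map<String, List<Integer>> assign(List<String> members, int numPartitions) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        for (String m : members) out.put(m, new ArrayList<>());
        for (int p = 0; p < numPartitions; p++) {
            out.get(members.get(p % members.size())).add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        // Group "billing" with two members shares 3 partitions (point-to-point)...
        System.out.println(assign(List.of("c0", "c1"), 3)); // {c0=[0, 2], c1=[1]}
        // ...while group "audit" independently receives all 3 (publish-subscribe).
        System.out.println(assign(List.of("a0"), 3));       // {a0=[0, 1, 2]}
    }
}
```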
19.
Ordering guarantees, traditional messaging system
[Diagram: a queue holding records R0, R1, R2, R3, … consumed in parallel by consumers C0, C1, C2]
Server side: the queue retains records in order on the server, but parallel consumption with async delivery can reorder them on arrival (e.g. R1 at t=0, R0 at t=1, R2 at t=2).
A traditional messaging system solves this using an "exclusive consumer".
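Kafka avoids the exclusive-consumer workaround: ordering is guaranteed per partition, and records with the same key always land in the same partition, so consumers can scale out while each key's records stay in send order. A sketch of the idea (the real producer hashes key bytes with murmur2; `hashCode()` here is an illustrative stand-in):

```java
import java.util.*;

// Sketch: Kafka keeps order per PARTITION, and a given key always maps to the
// same partition, so per-key ordering survives parallel consumption.
// (Kafka's default partitioner uses murmur2 on the key bytes; hashCode() is
// only a stand-in for illustration.)
public class KeyedPartitioning {
    public static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        List<String> events = List.of("bob:deposit", "lee:deposit", "bob:withdraw");
        Map<Integer, List<String>> partitions = new TreeMap<>();
        for (String e : events) {
            String key = e.split(":")[0];
            partitions.computeIfAbsent(partitionFor(key, 3), p -> new ArrayList<>()).add(e);
        }
        // All of bob's events sit in one partition, in the order they were sent.
        System.out.println(partitions);
    }
}
```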
20. Kafka is better than MOM
21. I didn’t mean my Mom ;)
22.
MOM vs. Kafka
• Broker-centric approach vs. client-centric approach
• Index structures (B-trees or hash tables) vs. log-structured storage
• Retention impacts performance vs. designed for retention
• Outage: significant slowdown vs. outage won't cause the infrastructure to slow down significantly
[1]: Large Queue Depth & Performance Problem
32.
Massive throughput - comes from the Log
Kafka doesn't:
• even import messages into the JVM
• buffer messages in user space
Instead it uses kernel-level I/O, copying data directly from the disk buffer (page cache) to the socket via FileChannel#transferTo.
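The zero-copy call the slide names is `java.nio.channels.FileChannel#transferTo`: the kernel moves bytes from the source file's page cache to the destination channel without ever surfacing them in a user-space buffer. A self-contained, JDK-only sketch (the file-to-file target stands in for a socket):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.*;

// Sketch: zero-copy with FileChannel#transferTo. The kernel copies the bytes
// from the source file's page cache straight to the target channel; the JVM
// never pulls them into a user-space buffer. This is the mechanism Kafka uses
// when serving fetch requests from log segments.
public class ZeroCopyDemo {
    // Writes content to a temp "segment", transferTo()s it into a second file
    // (standing in for a socket), and returns what arrived.
    public static String transferRoundTrip(String content) {
        try {
            Path src = Files.createTempFile("segment", ".log");
            Path dst = Files.createTempFile("socket", ".out");
            Files.writeString(src, content);
            try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
                 FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
                long done = 0, size = in.size();
                while (done < size) {              // transferTo may move fewer bytes
                    done += in.transferTo(done, size - done, out);
                }
            }
            return Files.readString(dst);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transferRoundTrip("m0m1m2m3")); // m0m1m2m3
    }
}
```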
34.
Exactly-once processing - comes from Kafka Transactions
[Diagram: a BankTransferService consumes from topic X and produces to topic Y]
Exactly-once delivery, definitive design: KIP-98 (exactly-once delivery & transactional messaging) and KIP-101 (leader-epoch-based log truncation in the replication protocol).
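Exactly-once in a consume-transform-produce service like the BankTransferService above is switched on through client configuration. A sketch of the relevant settings (these are real Kafka client property names; the `transactional.id` value is illustrative):

```properties
# Producer: idempotent, transactional writes
enable.idempotence=true
transactional.id=bank-transfer-service-1
acks=all

# Consumer: only see messages from committed transactions
isolation.level=read_committed
enable.auto.commit=false
```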
45.
Kafka Transaction (Chandy & Lamport, marker messages)
[Diagram: data logs on brokers A, B, C holding messages m0…m6, plus a Tx-log - an internal Kafka topic with RF = 3. The producer calls init(t.id) and the Transaction Coordinator records t.id -> ongoing in the Tx-log.]
46.
Kafka Transaction (Chandy & Lamport, marker messages)
[Diagram: data logs on brokers A, B, C, D with messages m0…m6 interleaved with commit markers (C), plus the Tx-log (internal Kafka topic, RF = 3). On commit, the coordinator moves t.id through ongoing -> prepare -> committed and writes commit markers into the data logs.]
47.
Kafka Transaction (Chandy & Lamport, marker messages)
[Diagram: as before - data logs with commit markers (C) and the Tx-log (internal Kafka topic, RF = 3); coordinator states t.id -> ongoing, t.id -> prepare, t.id -> committed.]
Consumer: isolation.level = read_committed
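The coordinator's state machine shown on these slides (ongoing -> prepare -> committed, with commit markers written into the data logs, in the spirit of Chandy-Lamport markers) can be sketched as a small simulation. This is plain Java mimicking the protocol, not Kafka code; all names are illustrative:

```java
import java.util.*;

// Sketch: the two-phase dance from the slides. The coordinator logs state
// transitions to the Tx-log; after "prepare" it writes commit markers (C)
// into the data logs, then records "committed". A read_committed consumer
// only sees messages that a commit marker has sealed.
public class TxCoordinatorSim {
    public final List<String> txLog = new ArrayList<>();    // internal Kafka topic stand-in
    public final List<String> dataLog = new ArrayList<>();  // one partition's data log

    public void send(String msg)   { dataLog.add(msg); }
    public void begin(String tid)  { txLog.add(tid + " -> ongoing"); }
    public void commit(String tid) {
        txLog.add(tid + " -> prepare");
        dataLog.add("C");                                   // commit marker
        txLog.add(tid + " -> committed");
    }

    // read_committed: only messages before the last commit marker are visible.
    public List<String> readCommitted() {
        int lastMarker = dataLog.lastIndexOf("C");
        List<String> visible = new ArrayList<>(dataLog.subList(0, Math.max(lastMarker, 0)));
        visible.removeIf("C"::equals);
        return visible;
    }

    public static void main(String[] args) {
        TxCoordinatorSim sim = new TxCoordinatorSim();
        sim.begin("t.id");
        sim.send("m0"); sim.send("m1");
        System.out.println(sim.readCommitted()); // [] - nothing committed yet
        sim.commit("t.id");
        System.out.println(sim.readCommitted()); // [m0, m1]
    }
}
```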
48.
Resiliency - comes from Replication
• Rock-solid replication protocol and leader-election process.
• Relies on replication to avoid synchronous (fsync) calls!
[1]: PacificA
52-54.
Write path
[Diagram, built over three slides: the filesystem writes "Foo" to disk. The write first lands in the OS buffer cache, then moves to the disk drive controller cache (a hardware-level cache, write-through or write-back), then to the disk's own caches, and is finally written to the hard disk.]
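Kafka leans on exactly this write path: it writes to the filesystem and lets the OS flush the buffer cache lazily, relying on replication rather than a per-message fsync for durability. With the plain JDK you can see both options: `write()` hands the bytes to the buffer cache, while `FileChannel#force` (fsync) drives them through the caches to stable storage:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

// Sketch: the two durability choices on the write path. write() hands "Foo"
// to the OS buffer cache and returns; force(true) (i.e. fsync) asks the
// kernel to push it through the controller and disk caches to the platter.
// Kafka takes the cheap first path per message and relies on replication.
public class WritePathDemo {
    public static String writeDurably(String content) {
        try {
            Path p = Files.createTempFile("segment", ".log");
            try (FileChannel ch = FileChannel.open(p, StandardOpenOption.WRITE)) {
                ch.write(ByteBuffer.wrap(content.getBytes())); // lands in the buffer cache
                ch.force(true);                                // fsync to stable storage
            }
            return Files.readString(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeDurably("Foo")); // Foo
    }
}
```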
55-58.
Replication
Kafka partition == replicated log
[Diagram, built over four slides: a Kafka cluster with brokers A, B and C, each holding a replica of partition P0. A is the leader and the producer writes P0 to it. Metadata: Partition: P0, Leader: A, ISR: [B, C]. When broker C becomes slow and falls behind, it is dropped from the in-sync replica set: ISR: [B].]
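The ISR shrink shown on the replication slides follows a simple rule: the leader tracks when each follower last caught up to its log end, and a follower that stays behind longer than the lag deadline drops out of the ISR (compare Kafka's `replica.lag.time.max.ms`, default 30 s). A plain-Java simulation of that rule, with illustrative numbers:

```java
import java.util.*;

// Sketch: how a leader shrinks the in-sync replica set (ISR). A follower that
// has not caught up to the leader's log end within the lag deadline is dropped.
// (Kafka's knob is replica.lag.time.max.ms; 30 ms here just for the demo.)
public class IsrShrinkSim {
    public static final long MAX_LAG_MS = 30;

    // lastCaughtUpMs: the last time each follower matched the leader's log end.
    public static List<String> currentIsr(Map<String, Long> lastCaughtUpMs, long nowMs) {
        List<String> isr = new ArrayList<>();
        for (Map.Entry<String, Long> f : lastCaughtUpMs.entrySet()) {
            if (nowMs - f.getValue() <= MAX_LAG_MS) isr.add(f.getKey());
        }
        return isr;
    }

    public static void main(String[] args) {
        Map<String, Long> followers = new LinkedHashMap<>();
        followers.put("B", 95L);  // fetched recently - stays in sync
        followers.put("C", 40L);  // slow - last caught up long ago
        System.out.println(currentIsr(followers, 100)); // [B] - C dropped from the ISR
    }
}
```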
78.
Kafka Streams
[Diagram: a WalletService running business logic on a Kafka Streams topology, consuming the event stream Bob:+100, Doe:-100, Bob:+200, Bob:+400, Lee:+400, Bob:-200, Jen:+200, Doe:+200, Jen:+100.]
State store:
1. Default state store: RocksDB.
2. Local writes to the state store are pushed back to Kafka (as a changelog).
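The two points above (a local store, with every write also pushed back to Kafka) can be sketched as a map plus an append-only changelog; replaying the changelog rebuilds the store, which is how a task recovers on another instance. A plain-Java simulation, not the Streams API:

```java
import java.util.*;

// Sketch: a Kafka Streams-style state store. Every local write also goes to a
// changelog (in real Streams, a compacted Kafka topic); replaying the
// changelog rebuilds the store, e.g. after moving a task to another instance.
public class StateStoreSim {
    public final Map<String, Long> store = new TreeMap<>();                   // RocksDB stand-in
    public final List<Map.Entry<String, Long>> changelog = new ArrayList<>(); // Kafka topic stand-in

    public void apply(String account, long delta) {
        long balance = store.merge(account, delta, Long::sum);
        changelog.add(Map.entry(account, balance));  // push the new value back to "Kafka"
    }

    public static Map<String, Long> restore(List<Map.Entry<String, Long>> changelog) {
        Map<String, Long> rebuilt = new TreeMap<>();
        for (Map.Entry<String, Long> e : changelog) {
            rebuilt.put(e.getKey(), e.getValue());   // last write per key wins (compaction)
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        StateStoreSim s = new StateStoreSim();
        s.apply("Bob", 100); s.apply("Doe", -100); s.apply("Bob", 200);
        System.out.println(s.store);              // {Bob=300, Doe=-100}
        System.out.println(restore(s.changelog)); // {Bob=300, Doe=-100}
    }
}
```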
79.
What we learned in the process
• Be conscious when setting the deletion policy - Integer.MAX_VALUE is not good ;)
• Shared storage: remember, neighbours can be chatty.
• Avoid network throttling & instability - enforce quotas.
• Adding a new broker != automatic partition assignment for existing topics; use bin/kafka-reassign-partitions.sh.
• Aggregates (stateful processing): use a compacted topic to avoid loading the whole versioned history.
• Slow controlled shutdown? Don't use the kill command - upgrade to Kafka 1.1.0 :)
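The retention and compaction advice above maps to topic-level configuration. A sketch of the two settings involved (these are real Kafka topic config names; the values are illustrative, not recommendations):

```properties
# Changelog/aggregate topics: compact instead of delete, so only the latest
# value per key is retained and consumers never replay the whole history.
cleanup.policy=compact

# Regular event topics: set time-based retention deliberately,
# not to "effectively infinite".
retention.ms=604800000
```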
80.
How Kafka helped us through the 3 challenges
• Kafka is always on.
• Designed from the ground up for a distributed world.
• Kafka - a database inside out.
• It's fun to play with Lego.