When choosing an event streaming platform, Kafka shouldn’t be the only technology you look at. There is a plethora of others in the messaging space today, including open source and proprietary software as well as a range of cloud services. So how do you know you are choosing the right one? A great way to deepen our understanding of event streaming and Kafka is to explore the trade-offs in distributed system design and to learn about the choices made by the Kafka project. We’ll look at how Kafka stacks up against other technologies in the space, including traditional messaging systems like Apache ActiveMQ and RabbitMQ as well as more contemporary ones, such as BookKeeper derivatives like Apache Pulsar or Pravega. This talk focuses on technical details such as differences in messaging models, how data is stored locally as well as across machines in a cluster, when (not) to add tiers to your system, and more. By the end of the talk, you should have a good high-level understanding of how these systems compare and which you should choose for different types of use cases.
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and Michael Noll, Confluent) Kafka Summit 2020
1. Trade-offs in Distributed Systems Design:
Is Kafka The Best?
Ben Stopford, Michael G. Noll
Office of the CTO, Confluent Inc
Kafka Summit Austin 2020 @ August 24-25, 2020
2. Trade-offs in Infrastructure Design:
‘Better’ is always subjective
(Illustration: impressing your friends vs. taking the kids to school)
3. Impact of trade-offs is tangible
Benchmark comparison of maximum steady-state throughput, using the Open Messaging Benchmark on identical 3-node clusters with an equal produce/consume workload:
● Kafka (Log): 605 MB/s
● Pulsar (BookKeeper derivative): 305 MB/s
● RabbitMQ (Classical Messaging): 38 MB/s
Full details available at https://www.confluent.io/blog/kafka-fastest-messaging-system/
4. Genesis of Messaging Systems
Early Messaging Era → Event Streaming Era (2000 – 2010 – 2020)

Early Messaging, JMS & later AMQP (1990s onwards)
● Design: message / channel / single machine.
● ActiveMQ was the first open source broker; HornetQ was added later.
● RabbitMQ built for AMQP.
● Many others (NATS, Aeron, ZeroMQ, …)

BookKeeper (2011)
● Goal: a write-ahead log for the Hadoop HDFS NameNode (ultimately not used there).
● Released in 2011 as part of ZooKeeper.

Kafka (2012)
● 1st event streaming system (distributed in all layers).
● Designed primarily for ‘events’ using the log abstraction.
● Departs from the messaging world by including scalable storage and processing.

AWS Kinesis (2013)
● Kafka-like design.
● Limited relative performance.
● Novel shared-service design.

Azure Event Hubs (2014)
● Kafka-like design.
● Implements AMQP (difficult, as it is a pre-streaming protocol).

BookKeeper derivatives (2016+)
● DistributedLog (2016), Pulsar (2018), Pravega (2018).
● All are “caching tiers” built over BookKeeper.
5. Messaging Model Basics
● Point-to-Point Channel, 1 consumer (ordered)
● Point-to-Point, many competing consumers (unordered): unordered, most-recent delivery, message-level acks
● Publish-Subscribe, many individual consumers (ordered, same dataset to each)
● Event Streaming, many partitioned consumers (ordered, partitioned consumption): ordered + partitioned delivery for parallelism (consumer group)
6. Messaging Model Basics
● Suits Classical Messaging: single machine, message-oriented.
● Suits Event Streaming: distributed, data-oriented (events).
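The event streaming model above can be sketched in a few lines. This is an illustrative simulation, not Kafka code: the names `Topic`, `partition_for`, and `assign` are made up for this example, but the mechanics (key-based partitioning into ordered logs, one group member per partition) follow the model described on the slide.

```python
# Sketch of the event-streaming model: keys map to partitions, each
# partition is an ordered log, and a "consumer group" divides the
# partitions among its members. Names are illustrative, not Kafka APIs.

def partition_for(key: str, num_partitions: int) -> int:
    """Key-based partitioning: same key -> same partition -> per-key order."""
    return hash(key) % num_partitions

class Topic:
    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key: str, value: str) -> None:
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))  # append-only log

def assign(num_partitions: int, consumers: list) -> dict:
    """Round-robin assignment: each partition has exactly one consumer
    in the group, so ordering is preserved within each partition."""
    return {p: consumers[p % len(consumers)] for p in range(num_partitions)}

topic = Topic(num_partitions=4)
for i in range(8):
    topic.produce(key=f"user-{i % 2}", value=f"event-{i}")

assignment = assign(4, ["consumer-a", "consumer-b"])
# Each consumer reads only its own partitions, in order; the partitions
# are consumed in parallel across the group.
```

Note the contrast with the competing-consumers model: there, any consumer may take any message, maximizing parallelism but giving up ordering; here, parallelism is bounded by the partition count in exchange for per-partition order.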
9. Contiguous Streams vs. Fragmented Streams
Trade-off: Little vs. Lots of Metadata // Navigational Simplicity vs. Even Storage Distribution

Log-based Approach (Kafka): partition data is contiguous, on 1 node
● Pros:
  ○ Fast reads and writes (quick navigation, data locality).
  ○ Little metadata (what is where?): p1[r0,r1,r2]
  ○ Makes it easier to remove ZK, where metadata is stored.
● Cons:
  ○ Storage unevenly distributed, if using key-based partitioning.
  ○ A partition must fit on one machine (without tiered storage).

BookKeeper derivatives (DistributedLog, Pulsar, etc.): partition data is fragmented, spread across N nodes
● Pros:
  ○ Storage distributed more evenly.
  ○ A partition can span multiple machines.
  ○ Also useful to let new machines accept writes immediately.
● Cons:
  ○ Network indirection.
  ○ Lots of metadata (what is where?) everywhere to keep consistent, cache locally, etc.:
    p1[r0[0,10], r1[11,22], r2[23,45], r1[46,47], r3[48,50], r0[51,54], r2[55,58], …]
  ○ Slow recovery of lost data.
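The metadata trade-off on this slide can be made concrete with a small sketch. This is not real Kafka or BookKeeper code; the dictionaries and the `locate` helper are hypothetical, but they mirror the `p1[r0,r1,r2]` vs. `p1[r0[0,10], r1[11,22], …]` notation above.

```python
# Contiguous (Kafka-style): one replica list per partition. The metadata
# is tiny and never grows as the partition grows.
contiguous_metadata = {
    "p1": ["node0", "node1", "node2"],  # the whole partition lives here
}

# Fragmented (BookKeeper-style): one entry per fragment, each with its
# own node set and offset range: (nodes, first_offset, last_offset).
# This list grows with every new fragment written.
fragmented_metadata = {
    "p1": [
        (["node0", "node1"], 0, 10),
        (["node1", "node2"], 11, 22),
        (["node2", "node3"], 23, 45),
    ],
}

def locate(offset: int, fragments) -> list:
    """Reading a fragmented partition needs a metadata lookup per
    offset range to find which nodes hold the data."""
    for nodes, first, last in fragments:
        if first <= offset <= last:
            return nodes
    raise KeyError(offset)
```

With contiguous storage, a reader needs only the replica list; with fragmented storage, every read must first resolve an offset range to a node set, and that mapping must be kept consistent (and cached) everywhere.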
11. Sequential Access vs. Random Access
Trade-off: Log-based storage vs. Index-based storage

Log-based Approach (Kafka): contiguous storage per partition.
Fetching messages for partition 2 reads one file sequentially.
● Pros:
  ○ Uses contiguous operations that allow fast reads and writes.
● Cons:
  ○ Number of partitions limited by file handles.

Classical Approach (RabbitMQ, ActiveMQ, BookKeeper derivatives): interleaved entries for many partitions in one file, plus an index (KahaDB, LevelDB, RocksDB, etc.).
Fetching messages for partition 2 requires index lookups.
● Pros:
  ○ Good write performance.
  ○ Number of partitions not limited by file handles.
● Cons:
  ○ Slower read performance.
  ○ Indexing overhead.
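The two layouts can be sketched side by side. This is a deliberately simplified simulation (in-memory lists instead of files and page cache), and all names are made up for the example; it only illustrates where the indexing overhead and the random access come from.

```python
# Log-based layout: one append-only log per partition. A read for a
# partition is a contiguous slice starting at an offset.
per_partition_logs = {1: [], 2: []}

def append_log(partition: int, msg: str) -> int:
    log = per_partition_logs[partition]
    log.append(msg)
    return len(log) - 1  # the offset is simply the position in the log

def read_log(partition: int, offset: int, n: int) -> list:
    return per_partition_logs[partition][offset:offset + n]  # sequential

# Index-based layout: all partitions interleaved in one shared log, plus
# an index mapping partition -> positions. Reads hop around the file.
shared_log = []
index = {1: [], 2: []}

def append_shared(partition: int, msg: str) -> None:
    index[partition].append(len(shared_log))  # extra work: maintain index
    shared_log.append((partition, msg))

def read_shared(partition: int, n: int) -> list:
    # One lookup per message: random access into the shared log.
    return [shared_log[pos][1] for pos in index[partition][:n]]

# Interleave writes to two partitions, as a broker with many queues would.
for i in range(6):
    p = 1 + i % 2
    append_log(p, f"m{i}")
    append_shared(p, f"m{i}")
```

Both reads return the same messages; the difference is that the log-based read is one contiguous slice per partition, while the index-based read pays an index maintenance cost on every write and scatters its reads across the shared file.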
13. Single Tier vs. Multiple Tiers
Trade-off: Efficiency of a Single Tier vs. Independence of Separate Tiers
● A single tier is great, as it makes our systems simpler, more efficient, and easier to build and use.
● Adding a tier is no free lunch: the upsides should outweigh the downsides.
(Diagram: Kafka Core alone — ☺ simple is beautiful)
14. Single Tier vs. Multiple Tiers
Trade-off: Efficiency of a Single Tier vs. Independence of Separate Tiers
● Such simplicity is great. That’s why many of us look forward to Kafka without ZooKeeper!
● But Kafka’s relation to ZooKeeper is not really about tiering, so we cover it in the next section.
(Diagram: Kafka Core + ZooKeeper — ☹ but not really a ‘tier’!)
15. Single Tier vs. Multiple Tiers
Trade-off: Efficiency of a Single Tier vs. Independence of Separate Tiers
● Tiering can make sense, e.g. as you enhance your system with other systems.
● For example, when the tiers should be scaled independently.
(Diagram: Kafka Core, IO/network bound, alongside ksqlDB, CPU bound — ☺)
16. Single Tier vs. Multiple Tiers
Trade-off: Efficiency of a Single Tier vs. Independence of Separate Tiers
● Pulsar is ‘caching’ over BookKeeper (read performance, read elasticity).
● Much like memcached can add caching to PostgreSQL.
(Diagram: Pulsar (caching) over BookKeeper (storage), next to memcached (caching) over PostgreSQL (storage))
Would you add memcached over Kafka?
17. Single Tier vs. Multiple Tiers
Trade-off: Efficiency of a Single Tier vs. Independence of Separate Tiers
● Adding a caching tier to Kafka?
● Probably not, because of cost ($$$) — layers aren’t free, and Kafka is already faster!
(Diagram: four Kafka brokers compared with two Pulsar brokers over two BookKeeper bookies)
18. Better: add Tiered Storage (KIP-405)
Kafka Core (hot data) + Tiered Storage (cold data)
● Already elastic (e.g. AWS S3)
● Unlimited storage
● Cheaper storage
● Scale-in/out requires movement of active segments only
● The biggest challenge for elasticity is moving large quantities of cold data.
● Tiered storage eliminates the expensive, data-intensive move operations needed for scale-in/out.
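The idea on this slide can be sketched as follows. This is an illustrative simulation, not KIP-405’s actual implementation: `SEGMENT_SIZE`, `tier_cold`, and the in-memory “object store” are all made up for the example, but the mechanics (roll full segments, move closed segments to cheap remote storage, keep only the active segment broker-local) follow the slide.

```python
# Sketch of tiered storage: only the active segment stays broker-local,
# closed segments move to object storage, reads fall through hot -> cold.

SEGMENT_SIZE = 3  # messages per segment; tiny, for illustration only

hot_segments = []   # broker-local disk: closed but not-yet-tiered segments
cold_segments = []  # object store (e.g. S3): tiered segments
active = []         # the segment currently being written

def append(msg: str) -> None:
    global active
    active.append(msg)
    if len(active) == SEGMENT_SIZE:  # segment is full: roll it
        hot_segments.append(active)
        active = []

def tier_cold() -> None:
    """Move closed segments to the object store. The broker then only
    has to relocate the small active segment when scaling in/out."""
    cold_segments.extend(hot_segments)
    hot_segments.clear()

def read(offset: int) -> str:
    """Read path: the consumer sees one logical log spanning both tiers."""
    all_msgs = [m for seg in cold_segments for m in seg]
    all_msgs += [m for seg in hot_segments for m in seg]
    all_msgs += active
    return all_msgs[offset]

for i in range(7):
    append(f"m{i}")
tier_cold()
```

After tiering, the broker holds only the active segment, yet old offsets remain readable — which is exactly why scale-in/out no longer requires moving large quantities of cold data.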
19. Single Tier vs. Multiple Tiers
Trade-off: Efficiency of a Single Tier vs. Independence of Separate Tiers
● Kafka is already faster for hot data (cf. benchmark).
● Tiered storage adds elasticity, with cold data tiered off.
● In a cloud-native architecture, the BookKeeper layer becomes redundant.
(Diagram: Kafka Core (hot data) over Tiered Storage (cold data), vs. Pulsar (hot data) over BookKeeper (cold data — redundant?) over Tiered Storage (cold data))
20. Single Tier vs. Multiple Tiers
Trade-off: Efficiency of a Single Tier vs. Independence of Separate Tiers
● Confluent Cloud provides a great example of Kafka’s elasticity.
● Scales from 0 to 100 MB/s and back down near-instantaneously.
● Unlimited data storage.
● The user never has to ‘resize’ a cluster, because there are no brokers or servers to manage.
22. Integrated vs. Portfolio Solution
Trade-off: ‘It Just Works’ vs. Flexibility of a Multi-part Setup
● Integrated: just works, but expensive to build.
● Portfolio: faster time-to-market, but integration issues.
(Image credits: Apple; Confluent #gamers channel)
23. Integrated vs. Portfolio Solution
Trade-off: ‘It Just Works’ vs. Flexibility of a Multi-part Setup
● A portfolio makes sense when there are separate concerns, and you want to deploy them independently.
(Diagram: Kafka Core with four independent Kafka Connect deployments — Finance team, InfoSec team, Ops team, your team — ☺)
24. Integrated vs. Portfolio Solution
Trade-off: ‘It Just Works’ vs. Flexibility of a Multi-part Setup
● The portfolio of ‘Kafka + ZooKeeper’ gave the Kafka project fast time-to-market in 2012.
● By 2020, however, Kafka and the needs of its users have changed.
(Diagram: Kafka Core + ZooKeeper — ☹ ZK is always required by Kafka until KIP-500)
25. Integrated vs. Portfolio Solution
Trade-off: ‘It Just Works’ vs. Flexibility of a Multi-part Setup
● The portfolio of ‘Kafka + ZooKeeper’ gave the Kafka project fast time-to-market in 2012.
● By 2020, however, Kafka and the needs of its users have changed.
● KIP-500 replaces ZK with integrated Kafka functionality for ‘It Just Works’.
● Removes scalability limitations, e.g. the maximum number of partitions in a Kafka cluster.
(Diagram: Kafka Core alone — after KIP-500, Kafka is self-sufficient; no ZK needed ☺)
26. Integrated vs. Portfolio Solution
Trade-off: ‘It Just Works’ vs. Flexibility of a Multi-part Setup
● “Integrated” is always better (for the user), but it’s more expensive to build.
28. Summary
“It’s not right or wrong. It’s trade-offs!”
● Kafka
  ○ The log-based approach provides top-of-class performance with low-overhead reads and writes.
  ○ Confluent Cloud is the most complete, cloud-native Kafka offering.
● RabbitMQ, ActiveMQ
  ○ Designed for short-lived messaging, where data is quickly removed after it is consumed.
● BookKeeper derivatives: DistributedLog, Pulsar, Pravega
  ○ Have elements of log-based storage, but inherit some limitations of traditional messaging (e.g., disk and segment fragmentation).
● AWS Kinesis, Azure Event Hubs
  ○ Limited performance compared to Kafka (anecdotal).
  ○ Novel cloud-native designs, but little is known about their internal implementations.
● Lots we didn’t mention: scale up vs. scale out, transactional messaging, …
  ○ See the associated blog post for all the details.

“For event streaming, Kafka remains the best.
The most mature. The largest ecosystem.”