The document discusses Bench, a framework for benchmarking Apache Kafka. It runs repeatable benchmarks on Kafka using Kubernetes and the OpenMessaging Benchmark, is built on the Kafka-native Java client, and is extensible and containerized. It is used to benchmark different Kafka configurations against Reddit use cases including app telemetry, content moderation, and database change data capture. The results show that the broker count and size that optimize throughput vary with message size and rate.
5. Bench is putting it all together
Tools of the trade
OpenMessaging Benchmark
● Uses the Kafka-native Java client
● Extensible and easily configurable
● Supports other systems such as RabbitMQ and Redis
Containerized Environment
● Reddit-native compute environment
● Easily deployable
● Highly portable between cloud environments
Terraform
● Simplifies compute instance deployments for Kafka
● Quickly iterate with different instance types, broker counts, and more
6. How reddit uses Kafka and what we're looking to learn
● Use case #1: App usage and ad telemetry
● Use case #2: Safety moderation of posts and comments
● Use case #3: Change Data Capture of application databases
● What Kafka configurations should we recommend to these clients to optimize the cost/performance trade-offs?
7. Benchmark design in today’s data
● We aim to discover the maximum publish rate and byte throughput in each scenario
● We want to understand end-to-end latency percentiles
● Each benchmark runs with 9 workers
○ 22 virtual cores each
● Increasing producer/consumer parallelism in increments of 10 (see the sketch after this list)
○ 10 partitions, 10 producers, 10 consumers
○ 20 partitions, 20 producers, 20 consumers
○ etc…
● Random payloads
● Bench warms up traffic for us
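To make the scaling pattern concrete, here is a minimal sketch (not Bench's or OpenMessaging Benchmark's actual code) of stepping producer parallelism up in increments of 10 with random, fixed-size payloads, using the Kafka-native Java client. The bootstrap address, topic name, payload size, and per-step message count are illustrative assumptions.

import java.util.Properties;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class ParallelismSweep {
    public static void main(String[] args) throws Exception {
        String topic = "bench-telemetry";   // hypothetical topic name
        int payloadBytes = 1024;            // e.g. 1KB messages, as in use case #1

        // Step producer parallelism in increments of 10, as described above
        for (int producers = 10; producers <= 50; producers += 10) {
            runStep(topic, payloadBytes, producers);
        }
    }

    static void runStep(String topic, int payloadBytes, int producers) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address
        props.put(ProducerConfig.ACKS_CONFIG, "all");                         // producers always use acks=all
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

        ExecutorService pool = Executors.newFixedThreadPool(producers);
        for (int p = 0; p < producers; p++) {
            pool.submit(() -> {
                Random rng = new Random();
                try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
                    for (int i = 0; i < 10_000; i++) {     // illustrative message count per step
                        byte[] payload = new byte[payloadBytes];
                        rng.nextBytes(payload);            // random (incompressible) payload
                        producer.send(new ProducerRecord<>(topic, payload));
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
    }
}

A real run would also step consumer parallelism and partition counts in lockstep, and discard the warm-up traffic that Bench generates before measuring.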
8. Kafka cluster configurations
● KRaft cluster with 3 controllers, separate from the brokers
● Brokers are evenly distributed across 3 availability zones in us-east4
● Topic configuration (sketched in code after this list):
○ replicas=3
○ minISR=2
○ Producers always have acks=all
○ No compression
● ZFS enabled
● Instance types explored:
○ n2-standard-16
○ n2-standard-8
○ c2-standard-16
● 6GB of RAM allocated to JVM heap
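As a concrete illustration of the topic and producer settings above, here is a hedged sketch using the Kafka Java AdminClient and producer configs; the topic name, partition count, and bootstrap address are assumptions for the example.

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.TopicConfig;

public class BenchTopicSetup {
    public static void main(String[] args) throws Exception {
        Properties adminProps = new Properties();
        adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address

        try (AdminClient admin = AdminClient.create(adminProps)) {
            // replicas=3 and min.insync.replicas=2, as in the cluster configuration above;
            // the partition count (10 here) varies per benchmark step
            NewTopic topic = new NewTopic("bench-topic", 10, (short) 3)
                    .configs(Map.of(TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, "2"));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }

        // Producer side: acks=all and no compression on every benchmark run
        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.ACKS_CONFIG, "all");
        producerProps.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "none");
    }
}

With replicas=3 and min.insync.replicas=2, an acks=all write is only acknowledged once it is on at least two replicas, so losing a single broker (or one availability zone's replica) does not lose acknowledged messages.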
9. Use case #1: App usage and ads telemetry
● Telemetry events are generated as people browse reddit
● Millions of tiny messages per second in production
○ Often 1KB in size or less
● Telemetry events include:
○ Upvotes/downvotes
○ Post and comment views
○ Ad delivery metrics
10. Use case #1: App telemetry
Broker count vs cores
● 1KB message size used for simulation
● With 1KB messages, fewer, larger brokers tend to marginally outperform
● This suggests that when handling a high request rate, larger brokers are preferable
11. Use case #1: App telemetry
How broker count and size affect tail latencies
12. Use case #1: App telemetry
Instance types
● Compute-optimized nodes tend to outperform dollar-for-dollar when there is a very high request rate
13. Use case #1: App telemetry
Thread count
● When CPU is not constrained, having more threads available to handle requests performed better
● As request count increases, CPU cycle contention increases, and having fewer threads than the number of cores tended to perform much better
14. Use case #2: Post and comment safety moderation
● Posts and comments are continually analyzed for safety and content policy
○ (think AutoMod)
● Messages vary in size with the size of user content (images, text, video)
○ We will use 64KB random payloads to simulate these messages
● Throughput is correlated with users posting and commenting
○ Not as frequent as viewing and voting events
15. Use case #2: Post/comment analysis
Broker count vs size
● At 64KB messages, more brokers with fewer cores tend to outperform
● This is the opposite of what we observed with the high request rate scenario
● Total available network capacity matters more as requests are larger and take longer to process
16. Use case #2: Post/comment analysis
How broker count and size affect tail latencies
17. Use case #3: Database change data capture
● Some analytical use cases process every change to database tables in real time
○ Known as change data capture
● Messages can vary but tend to be much larger
○ We will use 256KB random payloads for simulation
● Even higher data throughput, lower message rate
18. Use case #3: Database change data capture
● At 256KB messages, we also find more brokers with fewer cores tend to marginally outperform
● We hit the ceiling at a much lower request rate
19. Use case #3: Change data capture
How broker count and size affect tail latencies
21. Key insights
● The cost of each cluster was roughly the same, but the performance of the broker configuration is highly dependent on the use case
○ Workloads with a very high request rate of small payloads tend to benefit from more CPU resources per broker
○ Conversely, workloads with fewer, larger messages benefited from more brokers with half as many cores per broker
● Increasing the number of threads past the number of cores has a deleterious effect on performance as the message rate increases
○ Conversely, having enough cores per broker for each disk and network thread has great performance benefits
22. Impact
● Use-case-based clusters are preferable over team/organization-based clusters
○ Clusters with the same overall cost will perform better or worse depending on the use case
● Partition counts and cluster configurations matter
○ Often ~50% higher performance when partitions could be evenly distributed across the cluster (e.g., when the partition count is a multiple of the broker count)
○ Huge variations in performance across workloads using different cluster settings
● When in doubt, test it!
○ Often our assumptions about what may or may not be relevant for a particular scenario are challenged by the data
23. Check out the reddit engineering blog!
r/RedditEng
We’re hiring!
redditinc.com/careers