"Do you know how your data moves into and out of your Apache Kafka® instance? From the programmer’s point of view, it’s relatively simple. But under the hood, writing to and reading from Kafka is a complex process with a fascinating life cycle that’s worth understanding.
When you call producer.send() or consumer.poll(), those calls are translated into low-level requests, which are sent to the brokers for processing. In this session, we’ll dive into the world of Kafka producers and consumers to follow a request from an initial call to send() or poll(), all the way to disk, and back to the client via the broker’s final response. Along the way, we’ll explore a number of client and broker configurations that affect how these requests are handled and discuss the metrics that you can monitor to help you keep track of every stage of the request life cycle.
By the end of this session, you’ll know the ins and outs of the read and write requests that your Kafka clients make, making your next debugging or performance analysis session a breeze."
12. dfine@confluent.io @TheDanicaFine linkedin.com/in/danica-fine/
Batching Configurations
● batch.size
○ Default 16 KB (no batching when set to 0)
○ Batches may not be full
● linger.ms
○ Default 0 ms (no added delay; batches still fill if records arrive faster than they can be sent)
○ Directly affects latency, e.g. linger.ms=10 adds up to 10 ms of latency
● buffer.memory
○ Default ~32 MB
○ Should be > batch.size
○ Chunked into segments of batch.size
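The batching settings above can be sketched as a producer configuration. This is a minimal illustration, not code from the talk; the bootstrap server address and serializer choices are placeholders.

```java
import java.util.Properties;

public class BatchingConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("batch.size", "16384");       // default 16 KB; 0 disables batching
        props.put("linger.ms", "10");           // wait up to 10 ms to fill a batch
        props.put("buffer.memory", "33554432"); // ~32 MB; should exceed batch.size
        return props;
    }

    public static void main(String[] args) {
        System.out.println("linger.ms=" + build().getProperty("linger.ms"));
    }
}
```

Note the trade-off encoded here: raising linger.ms improves batching (and throughput) at the cost of up to that many milliseconds of added latency per send.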
Request Configurations
● max.request.size
○ Default ~1 MB
○ Caps the total size of a single produce request, which bounds how many record batches it can carry
● acks
○ Default “all”
○ How many replicas should write the data before sending a response back?
● max.in.flight.requests.per.connection
○ Default 5
○ Limit on unacknowledged requests per broker
● enable.idempotence and transactional.id
○ Together enable idempotent and transactional writes
● request.timeout.ms
○ Default 30 seconds
○ Time to wait for a response before retrying or throwing an exception
○ Retry behavior governed by delivery.timeout.ms, retries, and retry.backoff.ms
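The request-level settings above can likewise be sketched as producer properties. Again a hedged illustration, not code from the talk; the values shown are the defaults listed on the slide (delivery.timeout.ms shown at its documented 2-minute default).

```java
import java.util.Properties;

public class RequestConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("max.request.size", "1048576"); // ~1 MB cap per request
        props.put("acks", "all");                 // wait for in-sync replicas to write
        props.put("max.in.flight.requests.per.connection", "5"); // unacked requests per broker
        props.put("enable.idempotence", "true");  // dedupe retried writes
        props.put("request.timeout.ms", "30000"); // 30 s before retry or exception
        props.put("delivery.timeout.ms", "120000"); // overall bound, including retries
        return props;
    }

    public static void main(String[] args) {
        System.out.println("acks=" + build().getProperty("acks"));
    }
}
```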
Produce Request (bound by max.request.size)
● Request metadata
○ transactional ID
○ acks
○ timeoutMs
● Producer data, grouped by topic and partition (each partition entry holds an index and one or more record batches)
○ dwarf_updates / Partition_1: one batch
○ hobbit_updates / Partition_0: two batches
○ hobbit_updates / Partition_4: one batch
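The nested layout of the produce request diagrammed above can be modeled as a small data sketch. These types and names are purely illustrative, not Kafka’s actual protocol classes; the point is the hierarchy of request metadata over per-topic, per-partition record batches, whose combined size must stay under max.request.size.

```java
import java.util.List;
import java.util.Map;

public class ProduceRequestSketch {
    // Hypothetical types mirroring the diagram; not the real wire format.
    public record Batch(int sizeBytes) {}
    public record TopicData(String topic, Map<Integer, List<Batch>> partitions) {}
    public record Request(String transactionalId, String acks, int timeoutMs,
                          List<TopicData> topics) {}

    // Sum the bytes of every batch in the request across topics and partitions.
    public static int totalBytes(Request req) {
        return req.topics().stream()
                .flatMap(t -> t.partitions().values().stream())
                .flatMap(List::stream)
                .mapToInt(Batch::sizeBytes)
                .sum();
    }

    public static void main(String[] args) {
        Request req = new Request(null, "all", 30000,
                List.of(new TopicData("hobbit_updates",
                        Map.of(0, List.of(new Batch(512), new Batch(256))))));
        System.out.println("total batch bytes: " + totalBytes(req));
    }
}
```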
6: Mordor Purgatory (but actually)
● Holds a request until its data has been replicated as per acks
● Based on a hierarchical timing wheel
● Configure:
○ default.replication.factor
○ num.replica.fetchers
○ replica.fetch.wait.max.ms
● Monitor using RemoteTimeMs
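The broker-side settings above live in the broker configuration (server.properties). A hedged sketch; the values shown are illustrative, not recommendations from the talk.

```properties
# Settings that influence how long produce requests wait in purgatory
default.replication.factor=3     # replicas created for new topic partitions
num.replica.fetchers=1           # fetcher threads per source broker
replica.fetch.wait.max.ms=500    # max wait before a follower fetch returns
```

More fetcher threads and a shorter fetch wait can shrink replication lag, and with it the time an acks=all request spends parked in purgatory.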