"You might already use all known strategies to choose the right number of partitions for your newly created Apache Kafka topic. You apply the best recommendations to evenly distribute data across partitions in the topic. You even have metrics to monitor how well that works. You do everything right.
But then reality happens. Despite best efforts, data is published unevenly, making it slow, expensive, and difficult to consume data from a topic. The future is full of unexpected, impossible-to-predict events, and it doesn't care about rules or normal distributions.
This doesn't mean that we can simply disregard good practices. However, we need a plan for when things don't go according to anyone's calculations.
Come to this talk to learn what to do when the data distribution across topic partitions is badly broken and, as a result, significantly hurts the performance of consuming applications, increasing lag and slowing data processing.
We'll talk about existing strategies, including how you can replace a struggling topic with a new one and rebalance the data across new partitions using new rules. What can go wrong, and what should you do when the state of keys is no longer guaranteed? Why is partition scaling considered a dangerous operation? We'll also look at this problem from the consumers' point of view: how to scale them to more partitions and what to keep in mind when using stateful systems.
This talk is for those who have sufficient expertise with Apache Kafka and want to bring their knowledge to the next level. However, we'll use simple language and accessible explanations, so even if you're a Kafka beginner, join this session to understand the challenges of uneven data distribution and strategies to fix it."
4. olena@aiven.io
@OlenaKutsenko aiven.io Olena Babenko:
Recommended strategies for partitioning
➔ Select number of partitions based on how data is consumed
➔ Select a number of partitions that is neither too low nor too high
➔ Use keys with the highest cardinality
➔ Be mindful of data distribution over time
➔ Consider potential edge cases
How uneven partitions affect the system
➔ Brokers:
◆ Heavy load on the file system -> slower brokers
➔ Consumers:
◆ Increased consumer lag
◆ Consumers that are assigned to a hot partition require more resources
◆ Underutilisation of resources when scaling vertically with k8s
◆ Out-of-memory exception cycle
🌶🌶🌶 The advanced techniques will help you
● Rebalance records across partitions
● Scale your topic up or down
● Be effective at disaster recovery
No keys - increase the number of partitions
- This way you can’t scale down, but you can scale up!
- Pay attention to:
  - Data retention period
  - Number of consumers
  - Data distribution over time
  - linger.ms and batch.size for sticky partitioning
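The slide limits this technique to topics without keys. A minimal sketch of why that matters: when the partition is derived from a key hash modulo the partition count, changing the count sends many existing keys to a different partition than before. The class name, the generated keys, and the use of `Arrays.hashCode` as a stand-in for Kafka's murmur2 hash are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class KeyRemapSketch {

    // Stand-in for hash-based key partitioning: hash(key) mod partition count.
    // Arrays.hashCode is an assumption; Kafka's default partitioner uses murmur2.
    static int partition(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Counts how many generated keys land on a different partition
    // once the partition count changes from 'before' to 'after'.
    static int countMoved(int keys, int before, int after) {
        int moved = 0;
        for (int i = 0; i < keys; i++) {
            String key = "key-" + i; // hypothetical keys
            if (partition(key, before) != partition(key, after)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        System.out.println("keys moved when going from 6 to 8 partitions: "
                + countMoved(1000, 6, 8));
    }
}
```

Without keys there is no per-key ordering or state to preserve, which is why adding partitions is the mild option in that case.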
Example
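import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import org.apache.kafka.common.utils.Utils;

public static int partitionForKey(final byte[] serializedKey, final int numPartitions) {
    // Compare the key bytes by content; a byte[] cannot be compared to a String with ==.
    if (Arrays.equals(serializedKey, "bananas🍌🍌".getBytes(StandardCharsets.UTF_8))) {
        // ... do the dirty magic here ... (this branch must return a partition index)
    } else {
        // All other keys are hashed onto the first numPartitions - 1 partitions.
        return Utils.toPositive(Utils.murmur2(serializedKey)) % (numPartitions - 1);
    }
}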
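One possible reading of the example above, as a self-contained sketch: the hot key gets the last partition to itself and every other key is hashed onto the remaining ones. This is an assumption about what the elided branch could do, not necessarily the speaker's choice. `Arrays.hashCode` stands in for `Utils.murmur2` so the sketch runs without the Kafka client library, and the class name and plain `"bananas"` key are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class HotKeyPartitionerSketch {

    // Hypothetical hot key; in practice it would come from observed traffic.
    static final byte[] HOT_KEY = "bananas".getBytes(StandardCharsets.UTF_8);

    static int partitionForKey(byte[] serializedKey, int numPartitions) {
        if (numPartitions < 2) {
            throw new IllegalArgumentException("need at least 2 partitions");
        }
        if (Arrays.equals(serializedKey, HOT_KEY)) {
            // Assumption: reserve the last partition for the hot key.
            return numPartitions - 1;
        }
        // Stand-in hash (not murmur2); mask the sign bit to keep it non-negative.
        int hash = Arrays.hashCode(serializedKey) & 0x7fffffff;
        return hash % (numPartitions - 1);
    }

    public static void main(String[] args) {
        byte[] other = "apples".getBytes(StandardCharsets.UTF_8);
        System.out.println("hot key   -> partition " + partitionForKey(HOT_KEY, 6));
        System.out.println("other key -> partition " + partitionForKey(other, 6));
    }
}
```

With 6 partitions the hot key always lands on partition 5, while all other keys stay within partitions 0 to 4.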
The time will come…
when you need to re-create the topic
➔ Rebalance records across partitions
➔ Scale your topic up or down
➔ Perform disaster recovery
Advantages
➔ No skipped messages
➔ Prevention of duplicates
➔ No need for extra compute to replicate data from old to new topic
Limitations
➔ Downtime
➔ Difficult to test new setup and challenging to roll back
➔ Limited time window for migration
➔ Need for seamless collaboration among teams
➔ All-or-nothing migration
Risks
- Requires too many resources if it is not simple enough
- Cannot keep up if it is too complicated
- Data loss if the application is not reliable
- Data loss or duplicates because records from different partitions get shuffled
WARNING: Records/keys will almost certainly be mixed.
Out of order events
If consumers stopped at a point where the order is not correct, they either:
- Read some records one more time
OR
- Skip some records
Offset Translation
for r in records:
    P = r.metadata.old_partition
    if offsets[P] <= r.metadata.offset:
        return

Consumer Group 1:
    Partition 0: offset 13
    Partition 2: offset 23
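The pseudocode above can be sketched in Java under a few assumptions: each record in the new topic carries its old partition and old offset as metadata, the committed offset of a group is the next old offset it would read, and the scan runs over one new partition at a time. The class, field, and method names are hypothetical and not part of any Kafka API.

```java
import java.util.List;
import java.util.Map;

class OffsetTranslationSketch {

    /** Hypothetical shape: a record in the new topic that remembers its old position. */
    static final class MigratedRecord {
        final long newOffset;
        final int oldPartition;
        final long oldOffset;

        MigratedRecord(long newOffset, int oldPartition, long oldOffset) {
            this.newOffset = newOffset;
            this.oldPartition = oldPartition;
            this.oldOffset = oldOffset;
        }
    }

    /**
     * Scans one new partition in order and returns the new offset of the first
     * record the group had not yet consumed in the old topic.
     */
    static long translate(List<MigratedRecord> records, Map<Integer, Long> committed) {
        for (MigratedRecord r : records) {
            long next = committed.getOrDefault(r.oldPartition, 0L);
            if (next <= r.oldOffset) {
                return r.newOffset; // first unconsumed record: resume here
            }
        }
        // Everything was already consumed: resume at the end of the partition.
        return records.isEmpty() ? 0L : records.get(records.size() - 1).newOffset + 1;
    }

    public static void main(String[] args) {
        // Committed offsets of "Consumer Group 1" from the slide.
        Map<Integer, Long> committed = Map.of(0, 13L, 2, 23L);
        List<MigratedRecord> newPartition = List.of(
                new MigratedRecord(0, 0, 11),
                new MigratedRecord(1, 2, 22),
                new MigratedRecord(2, 0, 12),
                new MigratedRecord(3, 2, 23),
                new MigratedRecord(4, 0, 13));
        System.out.println(translate(newPartition, committed)); // prints 3
    }
}
```

Resuming at the first unconsumed record avoids skipping data, but if an already consumed record from another old partition sits after that position, it is read again. A single offset per new partition cannot fully separate consumed from unconsumed records once old partitions are interleaved.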
Mirror Maker offset translation
Data pump -> MirrorSourceTask
Old + New records metadata -> Records in Offset Sync topics
Offset translation -> MirrorCheckpointTask
Problems:
- The main use case is data transfer between two clusters, not within the same one
- Until version 3.3, offset translation worked by measuring the 'distance' between the MM2 offset sync and the upstream consumer group, and then assuming that the same distance applies in the downstream topic.
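The distance-based estimate described above comes down to simple arithmetic. The sketch below only illustrates that arithmetic; the class and parameter names are hypothetical and this is not MirrorMaker's actual code.

```java
class DistanceTranslationSketch {

    /**
     * Estimates the downstream offset for a consumer group.
     *
     * @param syncUpstream   upstream offset recorded in the latest offset sync
     * @param syncDownstream downstream offset recorded in the same offset sync
     * @param groupUpstream  committed offset of the group in the upstream topic
     */
    static long translateDownstream(long syncUpstream, long syncDownstream, long groupUpstream) {
        // How far the group is from the sync point in the upstream topic.
        long distance = groupUpstream - syncUpstream;
        // Assumption under test: the same distance applies downstream.
        return syncDownstream + distance;
    }

    public static void main(String[] args) {
        // Sync pair: upstream 100 <-> downstream 40; the group committed upstream offset 130.
        System.out.println(translateDownstream(100, 40, 130)); // prints 70
    }
}
```

The estimate is only exact when offsets advance one-to-one in both topics between the sync point and the group position. When records are redistributed across new partitions, that assumption generally does not hold.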
Risks
- Spikes
- Excessive downtime for consumers
- Data loss or duplicates
- Poor offset estimates
- Bad timing for offset translation
Key learnings
● No keys - add partitions 🌶
● A few hot keys - you can still add partitions 🌶🌶
● Workarounds are not sufficient? Migrate the topic 🌶🌶🌶
Migrate the topic 🌶🌶🌶
● Sharp cut - stop the producers first
○ Exactly-once delivery
○ Expect the downtime
○ All-or-nothing migration
● Generic gradual switch
○ Minimal downtime
○ Possibility to test before switching
○ Switch consumer groups gradually
○ Minimize chance of duplicates