This document discusses dynamic scaling in Apache Flink. It describes Flink's approach to dynamically scaling stateful jobs to adapt to changing workloads. Key points include: repartitioning of keyed and non-keyed state when scaling workers, supporting manual rescaling through savepoints and restarts currently, and future work on scaling operators without restarts and implementing automatic scaling policies.
10. Keyed vs. Non-keyed State
10
• State bound to a key
• E.g. Keyed UDF and window state
• State bound to a subtask
• E.g. Source state
Keyed Non-keyed
11. Repartitioning Keyed State
§ Similar to consistent
hashing
§ Split key space into
key groups
§ Assign key groups to
tasks
11
Key space
Key group #1 Key group #2
Key group #3Key group #4
12. Repartitioning Keyed State contd.
§ Rescaling changes
key group
assignment
§ Maximum parallelism
defined by #key
groups
12
13. Repartitioning Non-keyed state
§ User defined merge and
split functions
• Most general approach
§ Breaking non-keyed state
up into finer granularity
• State has to contain
multiple entries
• Automatic repartitioning
wrt granularity
13
#1 #2
#3
14. Repartitioning Non-keyed State contd.
§ Non-keyed state entries gathered at the
job manager
§ Repartitioning schemes
• Repartition & send
• Union & broadcast
14
15. Example: Kafka Source
15
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
• Store offset for each partition
• Individual entries are repartitionable
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
16. Rescaling: Why is That so Hard?
§ Handling of state
§ Repartitioning of keyed & non-keyed
state
§ Unique among open source stream
processors, afaik
16
20. Current State
§ Manual rescaling
1. Take savepoint
2. Stop the job
3. Restart job with adjusted parallelism and
savepoint
20
21. Next Steps
§ Integrate savepoint with stop signal
§ Rescaling individual operators w/o restart
§ Dynamic container de-/allocation
• “Running Flink Everywhere” by Stephan
Ewen, 16:45 at Kesselhaus
21
22. Auto Scaling Policies
22
• Latency
• Throughput
• Resource utilization
• Kubernetes on GCE, EC2 and Mesos (marathon-
autoscale) already support auto-scaling
23. Conclusion
§ Scaling of keyed and non-keyed state
§ Flink supports manual rescaling with
restart
(WIP branch: https://github.com/tillrohrmann/flink/tree/partitionable-op-state)
§ Future versions might support scaling on
the fly and automatic rescaling policies
23