As Apache Flink has pushed the boundaries of stateful stream processing with each release, a growing number of users are realizing its potential as a paradigm for robust, reactive data analytics as well as event-driven applications.
This talk covers the general idea of and motivation for stateful stream processing, and how Flink enables it with its powerful set of state management features and programming APIs. We will also look at recent advancements in Flink's state management and large-state handling, driven by our team at data Artisans, in the latest version 1.3 (expected release by end of May / early June).
2. Original creators of Apache Flink®
Providers of the dA Platform, a supported Flink distribution
3. This will be about ...
▪ Why and what is stateful stream processing?
▪ How Flink enables stateful stream processing: runtime & API
▪ How Flink has progressed w.r.t. stateful stream processing
5. The ol’ traditional batch way
● Continuously ingesting data
● Time-bounded batch files
● Periodic batch jobs
6. The ol’ traditional batch way (II)
● Consider computing a conversion metric (# of A → B per hour)
● What if a conversion crosses time boundaries?
→ carry intermediate results to the next batch
● What if events come out of order?
→ ???
7. The ideal way
A long-running computation that accumulates state:
● View of the “history” of the input stream: counters, in-progress windows, parameters of incrementally trained ML models, etc.
● State influences the output
● Output depends on notions of time
● Outputs when results are complete
8. The ideal way (II)
A stateful stream processor that handles large distributed state and time / order / completeness consistently, robustly, and efficiently:
● Stateful stream processing as a new paradigm to continuously process continuous data
● Produces accurate results
● Real-time results are a natural consequence of the model
13. State fault tolerance
Fault tolerance concerns for a stateful stream processor:
● How to ensure exactly-once semantics for the state?
14. Fault tolerance: simple case
● A single process consumes from an event log, keeping its state in main memory
● Periodically snapshot the memory
15. Fault tolerance: simple case
● The event log persists events (temporarily)
● Recovery: restore the snapshot and replay events since the snapshot
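The simple case above can be sketched in a few lines of plain Scala (a toy model, not Flink's API): a single process counts events, snapshots its in-memory state together with its read offset, and recovers by restoring the snapshot and replaying the events the log retained since then.

```scala
// Snapshot = the state plus the position in the event log at snapshot time.
case class Snapshot(offset: Int, state: Map[String, Long])

final class Counter {
  var state: Map[String, Long] = Map.empty
  var offset: Int = 0

  def process(event: String): Unit = {
    state = state.updated(event, state.getOrElse(event, 0L) + 1L)
    offset += 1
  }

  def snapshot(): Snapshot = Snapshot(offset, state)

  // Restore the snapshot, then replay everything the log holds after its offset.
  def restore(s: Snapshot, log: IndexedSeq[String]): Unit = {
    state = s.state
    offset = s.offset
    log.drop(s.offset).foreach(process)
  }
}

val log = Vector("a", "b", "a", "b", "b")
val live = new Counter
log.take(3).foreach(live.process)
val snap = live.snapshot()             // checkpoint after 3 events
log.slice(3, 5).foreach(live.process)  // 2 more events, then the process dies

val recovered = new Counter            // fresh process after the crash
recovered.restore(snap, log)           // restore + replay yields the same state
```

Because the snapshot and the log offset are captured together, replaying from the offset gives exactly-once effects on the state, even though events after the snapshot are processed twice in wall-clock terms.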
16. State fault tolerance
Fault tolerance concerns for a stateful stream processor:
● How to ensure exactly-once semantics for the state?
● How to create consistent snapshots of distributed embedded state?
● More importantly, how to do it efficiently without interrupting the computation?
17. State fault tolerance (II)
(diagram: parallel instances of user code, each with its embedded state)
● Consistent snapshotting:
18. State fault tolerance (II)
(diagram: each instance writes a checkpoint of its state to a file system)
● Consistent snapshotting: checkpoint state to a file system
19. State fault tolerance (III)
(diagram: instances restore their state from the file system)
● Recover all embedded state
● Reset position in input stream
20. State management
Two important management concerns for a long-running job:
● Can I change / fix bugs in my streaming pipeline?
How do I handle job downtime?
● Can I rescale (change parallelism of) my computation?
21. About time ...
When should results be emitted?
● Control for determining when the computation has fully processed all required events
● It’s mostly about time, e.g. have I received all events for 3 – 4 pm?
● Did event B occur within 5 minutes of event A?
● Wall-clock time is not correct; event-time awareness is required
22. So, in a nutshell ...
● State fault tolerance: exactly-once semantics, distributed state snapshotting
● Extremely large state (TB+ scale): out-of-core state, efficiently snapshotting large state
● State management: handling state w.r.t. job changes & rescales
● Event-time aware: controlling when results are complete and emitted w.r.t. event-time
23. So, in a nutshell ... (cont.)
Apache Flink, as a stateful stream processor, is fundamentally designed to address exactly these concerns ;)
25. Apache Flink Stack
● Libraries
● DataStream API (stream processing) / DataSet API (batch processing)
● Runtime: distributed streaming data flow
Streaming and batch as first-class citizens.
29. Levels of abstraction
● Process Function (events, state, time): low-level stateful stream processing
● DataStream API (streams, windows): stream processing & analytics
● Table API (dynamic tables): declarative DSL
● Stream SQL: high-level language
30. Process Function Blueprint
class MyFunction extends ProcessFunction[MyEvent, Result] {
<STATE DECLARATIONS>
<PROCESS>
<ON TIMER CALLBACK>
}
31. State Declarations
class MyFunction extends ProcessFunction[MyEvent, Result] {
// declare state to use in the program
lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)
<PROCESS>
<ON TIMER CALLBACK>
}
32. Process Element / Fire Timers
class MyFunction extends ProcessFunction[MyEvent, Result] {
// declare state to use in the program
lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)
def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
// work with event and state
(event, state.value) match { … }
out.collect(…) // emit events
state.update(…) // modify state
// schedule a timer callback
ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}
<ON TIMER CALLBACK>
}
33. Timer Callback
class MyFunction extends ProcessFunction[MyEvent, Result] {
// declare state to use in the program
lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)
def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
// work with event and state
(event, state.value) match { … }
out.collect(…) // emit events
state.update(…) // modify state
// schedule a timer callback
ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}
def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[Result]): Unit = {
// handle callback when event-/processing- time instant is reached
}
}
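To make the ProcessFunction contract concrete, here is a toy, self-contained harness in plain Scala (a simulation, not the Flink runtime): processElement updates per-key state and registers an event-time timer, and timers fire once the watermark passes their timestamp, mirroring the `event.timestamp + 500` callback in the slides.

```scala
import scala.collection.mutable

case class Event(key: String, timestamp: Long)

final class ToyProcessFunction {
  // keyed ValueState analogue: one counter per key
  val counts = mutable.Map.empty[String, Long]
  // pending event-time timers, ordered by (time, key)
  private val timers = mutable.SortedSet.empty[(Long, String)]
  // collected results, analogue of Collector.collect
  val output = mutable.Buffer.empty[(String, Long)]

  def processElement(e: Event): Unit = {
    counts(e.key) = counts.getOrElse(e.key, 0L) + 1L
    timers += ((e.timestamp + 500L, e.key))  // register callback 500 ms later
  }

  // Advance the watermark: fire every timer at or below it, like onTimer.
  def advanceWatermark(wm: Long): Unit =
    while (timers.nonEmpty && timers.head._1 <= wm) {
      val (_, key) = timers.head
      timers.remove(timers.head)
      output += key -> counts(key)           // emit the count at timer time
    }
}

val f = new ToyProcessFunction
f.processElement(Event("a", 1000L))
f.processElement(Event("a", 1200L))
f.advanceWatermark(1400L)  // timers at 1500 and 1700 are still pending
f.advanceWatermark(1600L)  // the 1500 timer fires and emits ("a", 2)
```

The key point the harness illustrates: timers fire against event time (the watermark), not wall-clock time, so results are emitted only once the input is known to be complete up to the timer's timestamp.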
39. Time: Watermarks
● Event time is coordinated via special markers, called watermarks
● A watermark with timestamp t indicates that no records with timestamp < t should be expected
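A common way to generate such watermarks is the bounded-out-of-orderness rule: the watermark trails the highest timestamp seen so far by a fixed bound. A minimal sketch in plain Scala (not Flink's API; the 200 ms bound is an arbitrary choice for illustration):

```scala
// Watermark = (highest timestamp seen) - (allowed out-of-orderness).
final class BoundedWatermarks(maxOutOfOrderness: Long) {
  private var maxTimestamp = Long.MinValue
  def onEvent(timestamp: Long): Unit =
    maxTimestamp = math.max(maxTimestamp, timestamp)
  def currentWatermark: Long = maxTimestamp - maxOutOfOrderness
}

val wm = new BoundedWatermarks(200L)
Seq(1000L, 1300L, 1100L).foreach(wm.onEvent)  // the 1100 event is out of order
// watermark = 1300 - 200 = 1100: no event with timestamp < 1100 is expected
```

The bound trades latency for completeness: a larger bound tolerates later events but delays when windows and timers may fire.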
41. State management: Savepoints
▪ A persistent snapshot of all state
▪ When starting an application, state can be initialized from a savepoint
▪ Between savepoint and restore we can update the Flink version or user code
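This workflow maps onto the Flink command-line client roughly as follows (a sketch; exact flags can vary between Flink releases, so consult the CLI docs for the version you run):

```sh
# Trigger a savepoint for a running job (prints the savepoint path)
bin/flink savepoint <jobID> [targetDirectory]

# Or cancel the job while taking a savepoint in one step
bin/flink cancel -s [targetDirectory] <jobID>

# Later: start the (possibly updated) application from that savepoint
bin/flink run -s <savepointPath> my-updated-job.jar
```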
44. State management: Savepoints
▪ Recover the job with its state and catch up with the event stream; event-time awareness allows faster-than-real-time reprocessing
45. Rescaling State / Elasticity
▪ Similar to consistent hashing
▪ Split the key space into key groups
▪ Assign key groups to tasks
(diagram: key space split into key groups #1–#4)
46. Rescaling State / Elasticity
▪ Rescaling changes the key-group assignment
▪ Maximum parallelism is defined by the number of key groups
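The scheme can be sketched in two small functions (a simplification: Flink additionally runs the key's hash code through a murmur hash, but the shape of the assignment is the same). A key maps to a fixed key group out of maxParallelism groups, and each operator instance owns a contiguous range of groups, so rescaling only moves whole groups:

```scala
// Key → key group: stable regardless of the current parallelism.
def keyGroupFor(key: Any, maxParallelism: Int): Int =
  math.floorMod(key.hashCode, maxParallelism)

// Key group → operator index: contiguous ranges of groups per operator.
def operatorFor(keyGroup: Int, maxParallelism: Int, parallelism: Int): Int =
  keyGroup * parallelism / maxParallelism

val maxParallelism = 128
val group = keyGroupFor("user-42", maxParallelism)

// The key's group never changes; only which operator owns the group does:
val opAt2 = operatorFor(group, maxParallelism, 2)  // parallelism 2
val opAt4 = operatorFor(group, maxParallelism, 4)  // after rescaling to 4
```

Because state is snapshotted per key group, a restore at a different parallelism only needs to hand each operator its new range of groups; no per-key reshuffling of state is required.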
51. TL;DR
● Stateful stream processing as a paradigm for continuous data
● Apache Flink is a sophisticated and battle-tested stateful stream processor with a comprehensive feature set
● Efficiency, management, and operational concerns for state are taken very seriously
52. The application hierarchy
● Event-driven applications (event sourcing, CQRS): stateful, event-driven, event-time-aware processing
● Stream Processing / Analytics (streams, windows, …)
● Batch Processing (data sets)
Further reading: “Building a Stream Processor for Event-Driven Applications” by Stephan Ewen