As Apache Flink has pushed the boundaries of stateful stream processing with each release, a growing number of users are realizing its potential as a paradigm for robust, reactive data analytics as well as event-driven applications.
This talk covers the general idea of and motivation for stateful stream processing, and how Flink enables it with its powerful set of state management features and programming APIs. We will also look at recent advancements in Flink's state management and large-state handling, driven by our team at data Artisans, in the latest version 1.3 (expected release by end of May / early June).
2. Original creators of Apache Flink®
Providers of the dA Platform, a supported Flink distribution
3. This will be about ...
▪ Why and what is stateful stream processing?
▪ How Flink enables stateful stream processing: runtime & API
▪ How Flink has progressed w.r.t. stateful stream processing
5. The ol’ traditional batch way
● Continuously ingesting data
● Time-bounded batch files
● Periodic batch jobs
6. The ol’ traditional batch way (II)
● Consider computing a conversion metric (# of A → B per hour)
● What if a conversion crosses time boundaries?
→ carry intermediate results to the next batch
● What if events come out of order?
→ ???
7. The ideal way
A long-running computation that accumulates state:
● View of the “history” of the input stream: counters, in-progress windows, parameters of incrementally trained ML models, etc.
● State influences the output
● Output depends on notions of time
● Outputs when results are complete
8. The ideal way (II)
A stateful stream processor that handles large distributed state and time / order / completeness consistently, robustly, and efficiently:
● Stateful stream processing as a new paradigm to continuously process continuous data
● Produces accurate results
● Real-time results are a natural consequence of the model
13. State fault tolerance
Fault tolerance concerns for a stateful stream processor:
● How to ensure exactly-once semantics for the state?
14. Fault tolerance: simple case
● A single process consumes from an event log, keeping its state in main memory
● Periodically snapshot the memory
15. Fault tolerance: simple case
● The event log persists events (temporarily)
● Recovery: restore the snapshot and replay events since the snapshot
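The simple case above can be sketched in a few lines of plain Scala (a toy model, not Flink's API): a single process counts events, snapshots its in-memory state together with its read offset, and recovers by restoring the snapshot and replaying the events the log retained since then.

```scala
// Snapshot = the state plus the position in the event log at snapshot time.
case class Snapshot(offset: Int, state: Map[String, Long])

final class Counter {
  var state: Map[String, Long] = Map.empty
  var offset: Int = 0

  def process(event: String): Unit = {
    state = state.updated(event, state.getOrElse(event, 0L) + 1L)
    offset += 1
  }

  def snapshot(): Snapshot = Snapshot(offset, state)

  // Restore the snapshot, then replay everything the log holds after its offset.
  def restore(s: Snapshot, log: IndexedSeq[String]): Unit = {
    state = s.state
    offset = s.offset
    log.drop(s.offset).foreach(process)
  }
}

val log = Vector("a", "b", "a", "b", "b")
val live = new Counter
log.take(3).foreach(live.process)
val snap = live.snapshot()             // checkpoint after 3 events
log.slice(3, 5).foreach(live.process)  // 2 more events, then the process dies

val recovered = new Counter            // fresh process after the crash
recovered.restore(snap, log)           // restore + replay yields the same state
```

Because the snapshot and the log offset are captured together, replaying from the offset gives exactly-once effects on the state, even though events after the snapshot are processed twice in wall-clock terms.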
16. State fault tolerance
Fault tolerance concerns for a stateful stream processor:
● How to ensure exactly-once semantics for the state?
● How to create consistent snapshots of distributed embedded state?
● More importantly, how to do it efficiently without interrupting the computation?
17. State fault tolerance (II)
(diagram: parallel instances of user code, each with its embedded state)
● Consistent snapshotting:
18. State fault tolerance (II)
(diagram: each instance writes a checkpoint of its state to a file system)
● Consistent snapshotting: checkpoint state to a file system
19. State fault tolerance (III)
(diagram: instances restore their state from the file system)
● Recover all embedded state
● Reset position in input stream
20. State management
Two important management concerns for a long-running job:
● Can I change / fix bugs in my streaming pipeline?
How do I handle job downtime?
● Can I rescale (change parallelism of) my computation?
21. About time ...
When should results be emitted?
● Control for determining when the computation has fully processed all required events
● It’s mostly about time, e.g. have I received all events for 3 – 4 pm?
● Did event B occur within 5 minutes of event A?
● Wall-clock time is not correct; event-time awareness is required
22. So, in a nutshell ...
● State fault tolerance: exactly-once semantics, distributed state snapshotting
● Extremely large state (TB+ scale): out-of-core state, efficiently snapshotting large state
● State management: handling state w.r.t. job changes & rescales
● Event-time aware: controlling when results are complete and emitted w.r.t. event-time
23. So, in a nutshell ... (cont.)
Apache Flink, as a stateful stream processor, is fundamentally designed to address exactly these concerns ;)
25. Apache Flink Stack
● Libraries
● DataStream API (stream processing) / DataSet API (batch processing)
● Runtime: distributed streaming data flow
Streaming and batch as first-class citizens.
29. Levels of abstraction
● Process Function (events, state, time): low-level stateful stream processing
● DataStream API (streams, windows): stream processing & analytics
● Table API (dynamic tables): declarative DSL
● Stream SQL: high-level language
30. Process Function Blueprint
class MyFunction extends ProcessFunction[MyEvent, Result] {
<STATE DECLARATIONS>
<PROCESS>
<ON TIMER CALLBACK>
}
31. State Declarations
class MyFunction extends ProcessFunction[MyEvent, Result] {
// declare state to use in the program
lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)
<PROCESS>
<ON TIMER CALLBACK>
}
32. Process Element / Fire Timers
class MyFunction extends ProcessFunction[MyEvent, Result] {
// declare state to use in the program
lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)
def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
// work with event and state
(event, state.value) match { … }
out.collect(…) // emit events
state.update(…) // modify state
// schedule a timer callback
ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}
<ON TIMER CALLBACK>
}
33. Timer Callback
class MyFunction extends ProcessFunction[MyEvent, Result] {
// declare state to use in the program
lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)
def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
// work with event and state
(event, state.value) match { … }
out.collect(…) // emit events
state.update(…) // modify state
// schedule a timer callback
ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}
def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[Result]): Unit = {
// handle callback when event-/processing- time instant is reached
}
}
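To make the ProcessFunction contract concrete, here is a toy, self-contained harness in plain Scala (a simulation, not the Flink runtime): processElement updates per-key state and registers an event-time timer, and timers fire once the watermark passes their timestamp, mirroring the `event.timestamp + 500` callback in the slides.

```scala
import scala.collection.mutable

case class Event(key: String, timestamp: Long)

final class ToyProcessFunction {
  // keyed ValueState analogue: one counter per key
  val counts = mutable.Map.empty[String, Long]
  // pending event-time timers, ordered by (time, key)
  private val timers = mutable.SortedSet.empty[(Long, String)]
  // collected results, analogue of Collector.collect
  val output = mutable.Buffer.empty[(String, Long)]

  def processElement(e: Event): Unit = {
    counts(e.key) = counts.getOrElse(e.key, 0L) + 1L
    timers += ((e.timestamp + 500L, e.key))  // register callback 500 ms later
  }

  // Advance the watermark: fire every timer at or below it, like onTimer.
  def advanceWatermark(wm: Long): Unit =
    while (timers.nonEmpty && timers.head._1 <= wm) {
      val (_, key) = timers.head
      timers.remove(timers.head)
      output += key -> counts(key)           // emit the count at timer time
    }
}

val f = new ToyProcessFunction
f.processElement(Event("a", 1000L))
f.processElement(Event("a", 1200L))
f.advanceWatermark(1400L)  // timers at 1500 and 1700 are still pending
f.advanceWatermark(1600L)  // the 1500 timer fires and emits ("a", 2)
```

The key point the harness illustrates: timers fire against event time (the watermark), not wall-clock time, so results are emitted only once the input is known to be complete up to the timer's timestamp.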
39. Time: Watermarks
● Event time is coordinated via special markers, called watermarks
● A watermark with timestamp t indicates that no records with timestamp < t should be expected
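A common way to generate such watermarks is the bounded-out-of-orderness rule: the watermark trails the highest timestamp seen so far by a fixed bound. A minimal sketch in plain Scala (not Flink's API; the 200 ms bound is an arbitrary choice for illustration):

```scala
// Watermark = (highest timestamp seen) - (allowed out-of-orderness).
final class BoundedWatermarks(maxOutOfOrderness: Long) {
  private var maxTimestamp = Long.MinValue
  def onEvent(timestamp: Long): Unit =
    maxTimestamp = math.max(maxTimestamp, timestamp)
  def currentWatermark: Long = maxTimestamp - maxOutOfOrderness
}

val wm = new BoundedWatermarks(200L)
Seq(1000L, 1300L, 1100L).foreach(wm.onEvent)  // the 1100 event is out of order
// watermark = 1300 - 200 = 1100: no event with timestamp < 1100 is expected
```

The bound trades latency for completeness: a larger bound tolerates later events but delays when windows and timers may fire.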
41. State management: Savepoints
▪ A persistent snapshot of all state
▪ When starting an application, state can be initialized from a savepoint
▪ Between savepoint and restore we can update the Flink version or user code
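This workflow maps onto the Flink command-line client roughly as follows (a sketch; exact flags can vary between Flink releases, so consult the CLI docs for the version you run):

```sh
# Trigger a savepoint for a running job (prints the savepoint path)
bin/flink savepoint <jobID> [targetDirectory]

# Or cancel the job while taking a savepoint in one step
bin/flink cancel -s [targetDirectory] <jobID>

# Later: start the (possibly updated) application from that savepoint
bin/flink run -s <savepointPath> my-updated-job.jar
```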
44. State management: Savepoints
▪ Recover the job with its state and catch up with the event stream; event-time awareness allows faster-than-real-time reprocessing
45. Rescaling State / Elasticity
▪ Similar to consistent hashing
▪ Split the key space into key groups
▪ Assign key groups to tasks
(diagram: key space split into key groups #1–#4)
46. Rescaling State / Elasticity
▪ Rescaling changes the key-group assignment
▪ Maximum parallelism is defined by the number of key groups
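The scheme can be sketched in two small functions (a simplification: Flink additionally runs the key's hash code through a murmur hash, but the shape of the assignment is the same). A key maps to a fixed key group out of maxParallelism groups, and each operator instance owns a contiguous range of groups, so rescaling only moves whole groups:

```scala
// Key → key group: stable regardless of the current parallelism.
def keyGroupFor(key: Any, maxParallelism: Int): Int =
  math.floorMod(key.hashCode, maxParallelism)

// Key group → operator index: contiguous ranges of groups per operator.
def operatorFor(keyGroup: Int, maxParallelism: Int, parallelism: Int): Int =
  keyGroup * parallelism / maxParallelism

val maxParallelism = 128
val group = keyGroupFor("user-42", maxParallelism)

// The key's group never changes; only which operator owns the group does:
val opAt2 = operatorFor(group, maxParallelism, 2)  // parallelism 2
val opAt4 = operatorFor(group, maxParallelism, 4)  // after rescaling to 4
```

Because state is snapshotted per key group, a restore at a different parallelism only needs to hand each operator its new range of groups; no per-key reshuffling of state is required.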
51. TL;DR
● Stateful stream processing as a paradigm for continuous data
● Apache Flink is a sophisticated and battle-tested stateful stream processor with a comprehensive feature set
● Efficiency, management, and operational concerns for state are taken very seriously
52. The application hierarchy
● Event-driven applications (event sourcing, CQRS): stateful, event-driven, event-time-aware processing
● Stream Processing / Analytics (streams, windows, …)
● Batch Processing (data sets)
Further reading: “Building a Stream Processor for Event-Driven Applications” by Stephan Ewen