SlideShare a Scribd company logo
1 of 82
Download to read offline
1
Stefan Richter

@stefanrrichter



29.10.2016
A look at Flink 1.2 and beyond
Agenda
▪ Flink 1.2 feature overview & walkthrough
▪ Taking a closer look at two features:
▪ Queryable state
▪ Dynamic scaling
2
Feature Overview
Flink Release 1.2
3
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.1+ ongoing development
4
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Operations
Ecosystem
Application

Features
Broader

Audience
Flink 1.2 Improvements
5
Session Windows(Stream) SQL
Library

enhancements
Metric

System
Operations
Ecosystem
Application

Features
Metrics &

Visualization
Dynamic Scaling
Savepoint

compatibility Checkpoints

to savepoints
Connectors in Flink
Stream SQL

Windows
Large state

Maintenance
Fine grained

recovery
Side in-/outputs
Window DSL
Broader

Audience
Security
Mesos &

others
Dynamic Resource

Management
Authentication
Queryable StateApache Bahir connectors
Security / Authentication - Flink 1.2
6
Authorized data access
Secured clusters with Kerberos-based authentication
• Kafka, ZooKeeper, HDFS, YARN, HBase, …
Encrypted traffic between Flink Processes
• RPC, Data Exchange, Web UI, … - „SSL for all connections“
Largely contributed by
Prevent malicious users to hook into Flink jobs
Cluster Management - Flink 1.1
7
Standalone
Flink on Yarn
Cluster Management - Flink 1.2
8Mesos integration contributed by
Standalone
Flink on Yarn
Flink on Mesos
Cluster Management - Beyond 1.2
9
Efforts to seamlessly interoperate with various
cluster managers.
Generalized abstraction (FLIP-6).
Driven by and
Cluster Management - Beyond (ct’d)
10
TaskManagerJobManager
(1) Register
(2) Deploy Tasks
ResourceManager
(1) Request
slots
TaskManager
JobManager
(2) Start
TaskManager
(3) Register
(4) Deploy Tasks
Dispatcher
(0) Start
JobManager
Cluster Management - Beyond (ct’d)
10
TaskManagerJobManager
(1) Register
(2) Deploy Tasks
ResourceManager
(1) Request
slots
TaskManager
JobManager
(2) Start
TaskManager
(3) Register
(4) Deploy Tasks
Dispatcher
(0) Start
JobManager
Metrics
11
Metrics
▪ Rates
11
Metrics
▪ Rates
▪ Latency (operator)
11
Metrics
▪ Rates
▪ Latency (operator)
▪ Visualization in WebUI
11
Savepoint / Checkpoint Robustness
12
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
12
C S
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
▪ Use older checkpoint
on failed recovery
12
C1 C2 C3
t
✘
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
▪ Use older checkpoint
on failed recovery
▪ Skip failed Checkpoints
12
C1 C2 C3
t
✘
Savepoint / Checkpoint Robustness
▪ Resume job from
checkpoints
▪ Use older checkpoint
on failed recovery
▪ Skip failed Checkpoints
▪ Backwards compatible
12
S
1.1 1.2
Processing Function
13
Stream SQL
Streaming API
Processing Function
Window
Operator
Timer
Handling
?
Problem: Implement custom windowing?
Processing Function
13
Stream SQL
Streaming API
Processing Function
Window
Operator
Timer
Handling
Interface ProcessingFunction:
void flatMap(I value, Context ctx, Collector<O> out) throws Exception;
void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception
Table API & Stream SQL
14
Example:
Table API & Stream SQL
▪ Group-windows
14
Example:
table
.groupBy('user')
.window(Session withGap
10.minutes on 'rowtime')
.select('uid', 'product.count')
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
14
Example:
EXISTS, VALUES, LIMIT
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
▪ More built-in scalar functions
14
Example:
CURRENT_DATE, INITCAP, NULLIF
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
▪ More built-in scalar functions
▪ More datatypes & better
integration
14
Example:
pojo.get('field')
pojo.flatten()
Table API & Stream SQL
▪ Group-windows
▪ More SQL operations
▪ More built-in scalar functions
▪ More datatypes & better
integration
▪ User-defined scalar functions
14
Example:
table.
select('uid',
parseName('userJson'))
Many more improvements…
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
▪ Detached execution: first step in programatically controlled
job
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
▪ Detached execution: first step in programatically controlled
job
▪ Async IO operator: non-blocking queries to external systems
15
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different file w.r.t. user
logic
▪ Detached execution: first step in programatically controlled
job
▪ Async IO operator: non-blocking queries to external systems
▪ Improved scalability, robustness + bugfixes
15
Queryable State
Flink 1.2
16
Queryable State - Motivation
17
Realtime
Queries
Periodically (every second)

flush new aggregates

to Redis
Queryable State - Motivation
18
Number of

Keys
Queryable State - Motivation
19
Realtime
QueriesWhere is the bottleneck?
Queryable State - Motivation
19
Writes to the key/value

store take too long
Realtime
QueriesWhere is the bottleneck?
Queryable State - Idea
20
Realtime
Queries
Archive
Database
Optional +
only at end of windows
“Streamprocessor
as a database“
Queryable State - Performance
21
Number of

Keys
Queryable State - Implementation
22
Query Client
State

Registry
window()
/
sum()
Job Manager Task Manager
ExecutionGraph
State Location Server
deploy
status
Query: /job/operation/state-name/key
State

Registry
Task Manager
(1) Get location of "key-partition"

for "operator" of" job"
(2) Look up

location
(3)

Respond location
(4) Query

state-name and key
local

state
register
window()
/
sum()
Queryable State Enablers
23
Queryable State Enablers
▪ Flink has state as a first class citizen
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the
operators that create/update it
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the
operators that create/update it
▪ State is continuous (not mini batched)
23
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the
operators that create/update it
▪ State is continuous (not mini batched)
▪ State is scalable (e.g., embedded RocksDB state
backend)
23
Dynamic Scaling
Flink 1.2
24
Motivation - Changing Workloads
25
Motivation - Changing Workloads
25
Motivation - Changing Workloads
25
Motivation - Resource Adaption
26
time
Workload
Resources
Motivation - Resource Adaption
26
time
Workload
Resources
time
Workload
Resources
Motivation - Resource Adaption
26
+
time
Workload
Resources
time
Workload
Resources
Basic Idea
27
• Spread work across more workers to decrease workload
Scaling Stateless Jobs
28
Scale Up Scale Down
Source
Mapper
Sink
• Scale up: Deploy new tasks
• Scale down: Cancel running tasks
Scaling Stateful Jobs
29
?
• Problem 1: Which state to assign to new task?
• Problem 2: Read + filter whole state?
Non-keyed vs Keyed State
30
• State bound to an operator + key
• E.g. Keyed UDF and window state
• „SELECT count(*) FROM t GROUP BY t.key“
• State bound only to operator
• E.g. Source state
KeyedNon-keyed
Non-keyed vs Keyed State
30
• State bound to an operator + key
• E.g. Keyed UDF and window state
• „SELECT count(*) FROM t GROUP BY t.key“
• State bound only to operator
• E.g. Source state
KeyedNon-keyed
Repartitioning Non-keyed state
31
#1 #2
#3 #4
#1 #2
#3 #4
Flink 1.1:
T snapshot()
void restore(T)
Flink 1.2:
List<T> snapshot()
void restore(List<T>)
Idea: break up state into finer granules that can be redistributed independently
Example: Kafka Source Flink 1.1
32
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
?
Operator state is black box. How to repartition?
Example: Kafka Source Flink 1.2
33
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
?
?
?
Return a list of sub-states which can be freely repartitioned.
partitionId: 1, offset: 42
partitionId: 6, offset: 27
Example: Kafka Source Flink 1.2
34
partitionId: 3, offset: 10
Scale Out
partitionId: 1, offset: 42
partitionId: 6, offset: 27
Example: Kafka Source Flink 1.2
34
partitionId: 3, offset: 10
Scale Out
Example: Kafka Source Flink 1.2
35
partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10
Scale In
Example: Kafka Source Flink 1.2
35
partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10
Scale In
Non-keyed vs Keyed State
36
• State bound to an operator + key
• E.g. Keyed UDF and window state
• „SELECT count(*) FROM t GROUP BY t.key“
• State bound only to operator
• E.g. Source state
KeyedNon-keyed
Repartitioning Keyed State
▪ Split key space into
key groups
▪ Every key falls into
exactly one key group
▪ Assign key groups to
tasks
37
Key space
Key group #1 Key group #2
Key group #3Key group #4
One key
Repartitioning Keyed State (ct’d)
▪ Rescaling changes
key group assignment
▪ Maximum parallelism
defined by #key
groups
38
Current State in Flink 1.2
▪ Manual rescaling
1. Take savepoint
2. Restart job with adjusted parallelism and
savepoint
39
Next Steps beyond Flink 1.2
▪ Rescaling individual operators w/o restart
▪ Refactor Flink deployment and process
model (previously discussed)
▪ On-the-fly Scaling
40
Autoscaling Policies
41
• Latency
• Throughput
• Resource utilization
• Kubernetes on GCE, EC2 and Mesos (marathon-
autoscale) already support auto-scaling
Conclusion
42
Conclusion
▪ Many great features in Flink 1.2
▪ Walkthrough
▪ Queryable State & Dynamic Scaling
42
Conclusion
▪ Many great features in Flink 1.2
▪ Walkthrough
▪ Queryable State & Dynamic Scaling
▪ Glimpse beyond the 1.2 release
42
43
Thank you!
@stefanrrichter
@ApacheFlink
@dataArtisans
Questions?
44

More Related Content

What's hot

Apache Flink Berlin Meetup May 2016
Apache Flink Berlin Meetup May 2016Apache Flink Berlin Meetup May 2016
Apache Flink Berlin Meetup May 2016Stephan Ewen
 
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4Flink Forward
 
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkFabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkVerverica
 
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...Flink Forward
 
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache FlinkTzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache FlinkVerverica
 
From Apache Flink® 1.3 to 1.4
From Apache Flink® 1.3 to 1.4From Apache Flink® 1.3 to 1.4
From Apache Flink® 1.3 to 1.4Till Rohrmann
 
Flink Streaming @BudapestData
Flink Streaming @BudapestDataFlink Streaming @BudapestData
Flink Streaming @BudapestDataGyula Fóra
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkKostas Tzoumas
 
Flink 1.0-slides
Flink 1.0-slidesFlink 1.0-slides
Flink 1.0-slidesJamie Grier
 
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward
 
Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkTran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkFlink Forward
 
Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016Kostas Tzoumas
 
Marton Balassi – Stateful Stream Processing
Marton Balassi – Stateful Stream ProcessingMarton Balassi – Stateful Stream Processing
Marton Balassi – Stateful Stream ProcessingFlink Forward
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward
 
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming APIFlink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming APIFlink Forward
 
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)Robert Metzger
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System OverviewFlink Forward
 

What's hot (20)

Apache Flink Berlin Meetup May 2016
Apache Flink Berlin Meetup May 2016Apache Flink Berlin Meetup May 2016
Apache Flink Berlin Meetup May 2016
 
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4
 
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkFabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache Flink
 
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
 
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache FlinkTzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
 
From Apache Flink® 1.3 to 1.4
From Apache Flink® 1.3 to 1.4From Apache Flink® 1.3 to 1.4
From Apache Flink® 1.3 to 1.4
 
Flink Streaming @BudapestData
Flink Streaming @BudapestDataFlink Streaming @BudapestData
Flink Streaming @BudapestData
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
 
Flink 1.0-slides
Flink 1.0-slidesFlink 1.0-slides
Flink 1.0-slides
 
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
 
Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkTran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
 
Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016
 
Marton Balassi – Stateful Stream Processing
Marton Balassi – Stateful Stream ProcessingMarton Balassi – Stateful Stream Processing
Marton Balassi – Stateful Stream Processing
 
Zurich Flink Meetup
Zurich Flink MeetupZurich Flink Meetup
Zurich Flink Meetup
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
 
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming APIFlink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
Flink Forward Berlin 2017: Zohar Mizrahi - Python Streaming API
 
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
A Data Streaming Architecture with Apache Flink (berlin Buzzwords 2016)
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
 
Big Data Warsaw
Big Data WarsawBig Data Warsaw
Big Data Warsaw
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System Overview
 

Similar to A look at Flink 1.2

Flink at netflix paypal speaker series
Flink at netflix   paypal speaker seriesFlink at netflix   paypal speaker series
Flink at netflix paypal speaker seriesMonal Daxini
 
Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...
Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...
Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...Flink Forward
 
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...Flink Forward
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Building Stream Processing as a Service
Building Stream Processing as a ServiceBuilding Stream Processing as a Service
Building Stream Processing as a ServiceSteven Wu
 
Flink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasFlink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasMonal Daxini
 
Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...
Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...
Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...Quantyca - Data at Core
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Going Reactive with Relational Databases
Going Reactive with Relational DatabasesGoing Reactive with Relational Databases
Going Reactive with Relational DatabasesIvaylo Pashov
 
Circonus: Design failures - A Case Study
Circonus: Design failures - A Case StudyCirconus: Design failures - A Case Study
Circonus: Design failures - A Case StudyHeinrich Hartmann
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlJiangjie Qin
 
What You Should Know About WebLogic Server 12c (12.2.1.2) #oow2015 #otntour2...
What You Should Know About WebLogic Server 12c (12.2.1.2)  #oow2015 #otntour2...What You Should Know About WebLogic Server 12c (12.2.1.2)  #oow2015 #otntour2...
What You Should Know About WebLogic Server 12c (12.2.1.2) #oow2015 #otntour2...Frank Munz
 
Using Docker EE to Scale Operational Intelligence at Splunk
Using Docker EE to Scale Operational Intelligence at SplunkUsing Docker EE to Scale Operational Intelligence at Splunk
Using Docker EE to Scale Operational Intelligence at SplunkDocker, Inc.
 
Autosys Trainer CV
Autosys Trainer CVAutosys Trainer CV
Autosys Trainer CVDS gupta
 
Data Virtualization: Revolutionizing data cloning
Data Virtualization: Revolutionizing data cloningData Virtualization: Revolutionizing data cloning
Data Virtualization: Revolutionizing data cloning Kyle Hailey
 
Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)Yuval Itzchakov
 
How Scylla Manager Handles Backups
How Scylla Manager Handles BackupsHow Scylla Manager Handles Backups
How Scylla Manager Handles BackupsScyllaDB
 
Bots on guard of sdlc
Bots on guard of sdlcBots on guard of sdlc
Bots on guard of sdlcAlexey Tokar
 
Building Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache GeodeBuilding Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache GeodePivotalOpenSourceHub
 

Similar to A look at Flink 1.2 (20)

Flink at netflix paypal speaker series
Flink at netflix   paypal speaker seriesFlink at netflix   paypal speaker series
Flink at netflix paypal speaker series
 
Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...
Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...
Flink Forward SF 2017: Stephan Ewen - Experiences running Flink at Very Large...
 
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
Flink Forward SF 2017: Feng Wang & Zhijiang Wang - Runtime Improvements in Bl...
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Building Stream Processing as a Service
Building Stream Processing as a ServiceBuilding Stream Processing as a Service
Building Stream Processing as a Service
 
Flink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paasFlink forward-2017-netflix keystones-paas
Flink forward-2017-netflix keystones-paas
 
Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...
Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...
Create a One Click Migration (OCM) process to Automate Repeatable Infrastruct...
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Going Reactive with Relational Databases
Going Reactive with Relational DatabasesGoing Reactive with Relational Databases
Going Reactive with Relational Databases
 
Circonus: Design failures - A Case Study
Circonus: Design failures - A Case StudyCirconus: Design failures - A Case Study
Circonus: Design failures - A Case Study
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
What You Should Know About WebLogic Server 12c (12.2.1.2) #oow2015 #otntour2...
What You Should Know About WebLogic Server 12c (12.2.1.2)  #oow2015 #otntour2...What You Should Know About WebLogic Server 12c (12.2.1.2)  #oow2015 #otntour2...
What You Should Know About WebLogic Server 12c (12.2.1.2) #oow2015 #otntour2...
 
Using Docker EE to Scale Operational Intelligence at Splunk
Using Docker EE to Scale Operational Intelligence at SplunkUsing Docker EE to Scale Operational Intelligence at Splunk
Using Docker EE to Scale Operational Intelligence at Splunk
 
Autosys Trainer CV
Autosys Trainer CVAutosys Trainer CV
Autosys Trainer CV
 
Data Virtualization: Revolutionizing data cloning
Data Virtualization: Revolutionizing data cloningData Virtualization: Revolutionizing data cloning
Data Virtualization: Revolutionizing data cloning
 
Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)
 
How Scylla Manager Handles Backups
How Scylla Manager Handles BackupsHow Scylla Manager Handles Backups
How Scylla Manager Handles Backups
 
Bots on guard of sdlc
Bots on guard of sdlcBots on guard of sdlc
Bots on guard of sdlc
 
Building Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache GeodeBuilding Apps with Distributed In-Memory Computing Using Apache Geode
Building Apps with Distributed In-Memory Computing Using Apache Geode
 
From Code to Kubernetes
From Code to KubernetesFrom Code to Kubernetes
From Code to Kubernetes
 

Recently uploaded

SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

A look at Flink 1.2

  • 2. Agenda ▪ Flink 1.2 feature overview & walkthrough ▪ Taking a closer look at two features: ▪ Queryable state ▪ Dynamic scaling 2
  • 4. Flink 1.1+ ongoing development 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 5. Flink 1.1+ ongoing development 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 6. Flink 1.1+ ongoing development 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 7. Flink 1.1+ ongoing development 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 8. Flink 1.1+ ongoing development 4 Session Windows(Stream) SQL Library
 enhancements Metric
 System Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors Operations Ecosystem Application
 Features Broader
 Audience
  • 9. Flink 1.2 Improvements 5 Session Windows(Stream) SQL Library
 enhancements Metric
 System Operations Ecosystem Application
 Features Metrics &
 Visualization Dynamic Scaling Savepoint
 compatibility Checkpoints
 to savepoints Connectors in Flink Stream SQL
 Windows Large state
 Maintenance Fine grained
 recovery Side in-/outputs Window DSL Broader
 Audience Security Mesos &
 others Dynamic Resource
 Management Authentication Queryable StateApache Bahir connectors
  • 10. Security / Authentication - Flink 1.2 6 Authorized data access Secured clusters with Kerberos-based authentication • Kafka, ZooKeeper, HDFS, YARN, HBase, … Encrypted traffic between Flink Processes • RPC, Data Exchange, Web UI, … - „SSL for all connections“ Largely contributed by Prevent malicious users to hook into Flink jobs
  • 11. Cluster Management - Flink 1.1 7 Standalone Flink on Yarn
  • 12. Cluster Management - Flink 1.2 8Mesos integration contributed by Standalone Flink on Yarn Flink on Mesos
  • 13. Cluster Management - Beyond 1.2 9 Efforts to seamlessly interoperate with various cluster managers. Generalized abstraction (FLIP-6). Driven by and
  • 14. Cluster Management - Beyond (ct’d) 10 TaskManagerJobManager (1) Register (2) Deploy Tasks ResourceManager (1) Request slots TaskManager JobManager (2) Start TaskManager (3) Register (4) Deploy Tasks Dispatcher (0) Start JobManager
  • 15. Cluster Management - Beyond (ct’d) 10 TaskManagerJobManager (1) Register (2) Deploy Tasks ResourceManager (1) Request slots TaskManager JobManager (2) Start TaskManager (3) Register (4) Deploy Tasks Dispatcher (0) Start JobManager
  • 19. Metrics ▪ Rates ▪ Latency (operator) ▪ Visualization in WebUI 11
  • 20. Savepoint / Checkpoint Robustness 12
  • 21. Savepoint / Checkpoint Robustness ▪ Resume job from checkpoints 12 C S
  • 22. Savepoint / Checkpoint Robustness ▪ Resume job from checkpoints ▪ Use older checkpoint on failed recovery 12 C1 C2 C3 t ✘
  • 23. Savepoint / Checkpoint Robustness ▪ Resume job from checkpoints ▪ Use older checkpoint on failed recovery ▪ Skip failed Checkpoints 12 C1 C2 C3 t ✘
  • 24. Savepoint / Checkpoint Robustness ▪ Resume job from checkpoints ▪ Use older checkpoint on failed recovery ▪ Skip failed Checkpoints ▪ Backwards compatible 12 S 1.1 1.2
  • 25. Processing Function 13 Stream SQL Streaming API Processing Function Window Operator Timer Handling ? Problem: Implement custom windowing?
  • 26. Processing Function 13 Stream SQL Streaming API Processing Function Window Operator Timer Handling Interface ProcessingFunction: void flatMap(I value, Context ctx, Collector<O> out) throws Exception; void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception
  • 27. Table API & Stream SQL 14 Example:
  • 28. Table API & Stream SQL ▪ Group-windows 14 Example: table .groupBy('user') .window(Session withGap 10.minutes on 'rowtime') .select('uid', 'product.count')
  • 29. Table API & Stream SQL ▪ Group-windows ▪ More SQL operations 14 Example: EXISTS, VALUES, LIMIT
  • 30. Table API & Stream SQL ▪ Group-windows ▪ More SQL operations ▪ More built-in scalar functions 14 Example: CURRENT_DATE, INITCAP, NULLIF
  • 31. Table API & Stream SQL ▪ Group-windows ▪ More SQL operations ▪ More built-in scalar functions ▪ More datatypes & better integration 14 Example: pojo.get('field') pojo.flatten()
  • 32. Table API & Stream SQL ▪ Group-windows ▪ More SQL operations ▪ More built-in scalar functions ▪ More datatypes & better integration ▪ User-defined scalar functions 14 Example: table. select('uid', parseName('userJson'))
  • 34. Many more improvements… ▪ Kafka 0.10 (with watermarks) 15
  • 35. Many more improvements… ▪ Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic 15
  • 36. Many more improvements… ▪ Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic ▪ Detached execution: first step in programatically controlled job 15
  • 37. Many more improvements… ▪ Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic ▪ Detached execution: first step in programatically controlled job ▪ Async IO operator: non-blocking queries to external systems 15
  • 38. Many more improvements… ▪ Kafka 0.10 (with watermarks) ▪ Bucketing Sink: divides output into different file w.r.t. user logic ▪ Detached execution: first step in programatically controlled job ▪ Async IO operator: non-blocking queries to external systems ▪ Improved scalability, robustness + bugfixes 15
  • 40. Queryable State - Motivation 17 Realtime Queries Periodically (every second)
 flush new aggregates
 to Redis
  • 41. Queryable State - Motivation 18 Number of
 Keys
  • 42. Queryable State - Motivation 19 Realtime QueriesWhere is the bottleneck?
  • 43. Queryable State - Motivation 19 Writes to the key/value
 store take too long Realtime QueriesWhere is the bottleneck?
  • 44. Queryable State - Idea 20 Realtime Queries Archive Database Optional + only at end of windows “Streamprocessor as a database“
  • 45. Queryable State - Performance 21 Number of
 Keys
  • 46. Queryable State - Implementation 22 Query Client State
 Registry window() / sum() Job Manager Task Manager ExecutionGraph State Location Server deploy status Query: /job/operation/state-name/key State
 Registry Task Manager (1) Get location of "key-partition"
 for "operator" of" job" (2) Look up
 location (3)
 Respond location (4) Query
 state-name and key local
 state register window() / sum()
  • 48. Queryable State Enablers ▪ Flink has state as a first class citizen 23
  • 49. Queryable State Enablers ▪ Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) 23
  • 50. Queryable State Enablers ▪ Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) ▪ State is partitioned (sharded) together with the operators that create/update it 23
  • 51. Queryable State Enablers ▪ Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) ▪ State is partitioned (sharded) together with the operators that create/update it ▪ State is continuous (not mini batched) 23
  • 52. Queryable State Enablers ▪ Flink has state as a first class citizen ▪ State is fault tolerant (exactly once semantics) ▪ State is partitioned (sharded) together with the operators that create/update it ▪ State is continuous (not mini batched) ▪ State is scalable (e.g., embedded RocksDB state backend) 23
  • 54. Motivation - Changing Workloads 25
  • 55. Motivation - Changing Workloads 25
  • 56. Motivation - Changing Workloads 25
  • 57. Motivation - Resource Adaption 26 time Workload Resources
  • 58. Motivation - Resource Adaption 26 time Workload Resources time Workload Resources
  • 59. Motivation - Resource Adaption 26 + time Workload Resources time Workload Resources
  • 60. Basic Idea 27 • Spread work across more workers to decrease workload
  • 61. Scaling Stateless Jobs 28 Scale Up Scale Down Source Mapper Sink • Scale up: Deploy new tasks • Scale down: Cancel running tasks
  • 62. Scaling Stateful Jobs 29 ? • Problem 1: Which state to assign to new task? • Problem 2: Read + filter whole state?
  • 63. Non-keyed vs Keyed State 30 • State bound to an operator + key • E.g. Keyed UDF and window state • „SELECT count(*) FROM t GROUP BY t.key“ • State bound only to operator • E.g. Source state KeyedNon-keyed
  • 64. Non-keyed vs Keyed State 30 • State bound to an operator + key • E.g. Keyed UDF and window state • „SELECT count(*) FROM t GROUP BY t.key“ • State bound only to operator • E.g. Source state KeyedNon-keyed
  • 65. Repartitioning Non-keyed state 31 #1 #2 #3 #4 #1 #2 #3 #4 Flink 1.1: T snapshot() void restore(T) Flink 1.2: List<T> snapshot() void restore(List<T>) Idea: break up state into finer granules that can be redistributed independently
  • 66. Example: Kafka Source Flink 1.1 32 partitionId: 1, offset: 42 partitionId: 3, offset: 10 partitionId: 6, offset: 27 ? Operator state is black box. How to repartition?
  • 67. Example: Kafka Source Flink 1.2 33 partitionId: 1, offset: 42 partitionId: 3, offset: 10 partitionId: 6, offset: 27 ? ? ? Return a list of sub-states which can be freely repartitioned.
  • 68. partitionId: 1, offset: 42 partitionId: 6, offset: 27 Example: Kafka Source Flink 1.2 34 partitionId: 3, offset: 10 Scale Out
  • 69. partitionId: 1, offset: 42 partitionId: 6, offset: 27 Example: Kafka Source Flink 1.2 34 partitionId: 3, offset: 10 Scale Out
  • 70. Example: Kafka Source Flink 1.2 35 partitionId: 1, offset: 42 partitionId: 6, offset: 27 partitionId: 3, offset: 10 Scale In
  • 71. Example: Kafka Source Flink 1.2 35 partitionId: 1, offset: 42 partitionId: 6, offset: 27 partitionId: 3, offset: 10 Scale In
  • 72. Non-keyed vs Keyed State 36 • State bound to an operator + key • E.g. Keyed UDF and window state • „SELECT count(*) FROM t GROUP BY t.key“ • State bound only to operator • E.g. Source state KeyedNon-keyed
  • 73. Repartitioning Keyed State ▪ Split key space into key groups ▪ Every key falls into exactly one key group ▪ Assign key groups to tasks 37 Key space Key group #1 Key group #2 Key group #3Key group #4 One key
  • 74. Repartitioning Keyed State (ct’d) ▪ Rescaling changes key group assignment ▪ Maximum parallelism defined by #key groups 38
  • 75. Current State in Flink 1.2 ▪ Manual rescaling 1. Take savepoint 2. Restart job with adjusted parallelism and savepoint 39
  • 76. Next Steps beyond Flink 1.2 ▪ Rescaling individual operators w/o restart ▪ Refactor Flink deployment and process model (previously discussed) ▪ On-the-fly Scaling 40
  • 77. Autoscaling Policies 41 • Latency • Throughput • Resource utilization • Kubernetes on GCE, EC2 and Mesos (marathon- autoscale) already support auto-scaling
  • 79. Conclusion ▪ Many great features in Flink 1.2 ▪ Walkthrough ▪ Queryable State & Dynamic Scaling 42
  • 80. Conclusion ▪ Many great features in Flink 1.2 ▪ Walkthrough ▪ Queryable State & Dynamic Scaling ▪ Glimpse beyond the 1.2 release 42