Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pre-Hadoop Summit Meetups)

Stephan Ewen
Flink committer
co-founder / CTO @ data Artisans
@StephanEwen
Apache
Flink

Stratosphere 0.4
4
Stratosphere Optimizer
Pact API (Java)
Stratosphere Runtime
DataSet API (Scala)
Local Remote
Batch processing on a pipelining engine, with iterations …

Flink
Historic data
Kafka, RabbitMQ, ...
HDFS, JDBC, ...
ETL, Graphs,
Machine Learning
Relational, …
Low latency,
windowing,
aggregations, ...
Event logs
Real-time data
streams
What is Apache Flink?
(master)

What is Apache Flink?
7
Python
Gelly
Table
ML
SAMOA
Flink Optimizer
DataSet (Java/Scala) DataStream (Java/Scala)
Stream Builder
Hadoop
M/R
Local Remote Yarn Tez Embedded
Dataflow
Dataflow
Flink Dataflow Runtime
HDFS
HBase
Kafka
RabbitMQ
Flume
HCatalog
JDBC

Batch / Steaming APIs
8
case class Word (word: String, frequency: Int)
val lines: DataStream[String] = env.fromSocketStream(...)
lines.flatMap {line => line.split(" ")
.map(word => Word(word,1))}
.window(Count.of(1000)).every(Count.of(100))
.groupBy("word").sum("frequency")
.print()
val lines: DataSet[String] = env.readTextFile(...)
lines.flatMap {line => line.split(" ")
.map(word => Word(word,1))}
.groupBy("word").sum("frequency")
.print()
DataSet API (batch):
DataStream API (streaming):

Technology inside Flink
case class Path (from: Long, to:
Long)
val tc = edges.iterate(10) {
paths: DataSet[Path] =>
val next = paths
.join(edges)
.where("to")
.equalTo("from") {
(path, edge) =>
Path(path.from, edge.to)
}
.union(paths)
.distinct()
next
}
Cost-based
optimizer
Type extraction
stack
Task
scheduling
Recovery
metadata
Pre-flight (Client)
Master
Workers
DataSourc
e
orders.tbl
Filter
Map
DataSourc
e
lineitem.tbl
Join
Hybrid Hash
build
HT
probe
hash-part [0] hash-part [0]
GroupRed
sort
forward
Program
Dataflow
Graph
Memory
manager
Out-of-core
algos
Batch &
Streaming
State &
Checkpoints
deploy
operators
track
intermediate
results

Flink by Feature / Use Case
10

Life of data streams
 Create: create streams from event sources
(machines, databases, logs, sensors, …)
 Collect: collect and make streams available for
consumption (e.g., Apache Kafka)
 Process: process streams, possibly generating
derived streams (e.g., Apache Flink)
12

Stream Analysis in Flink
13
More at: http://flink.apache.org/news/2015/02/09/streaming-example.html

Defining windows in Flink
 Trigger policy
• When to trigger the computation on current window
 Eviction policy
• When data points should leave the window
• Defines window width/size
 E.g., count-based policy
• evict when #elements > n
• start a new window every n-th element
 Built-in: Count, Time, Delta policies
14

Checkpointing / Recovery
 Flink acknowledges batches of records
• Less overhead in failure-free case
• Currently tied to fault tolerant data sources
(e.g., Kafka)
 Flink operators can keep state
• State is checkpointed
• Checkpointing and record acks go together
 Exactly one semantics for state
15

Checkpointing / Recovery
16
Chandy-Lamport Algorithm for consistent asynchronous distributed snapshots
Pushes checkpoint barriers
through the data flow
Operator checkpoint
starting
Checkpoint done
Data Stream
barrier
Before barrier =
part of the snapshot
After barrier =
Not in snapshot
Checkpoint done
checkpoint in progress
(backup till next snapshot)

Heavy Data Pipelines
18
Complex ETL programs
Apology: Graph had to be blurred for
online slides, due to confidentiality

Memory Management
public class WC {
public String word;
public int count;
}
empty
page
Pool of Memory Pages
Sorting,
hashing,
caching
Shuffling,
broadcasts
User code
objects
ManagedUnmanaged
19
Flink contains its own memory management stack. Memory is
allocated, de-allocated, and used strictly using an internal buffer pool
implementation. To do that, Flink contains its own type extraction and
serialization components.
More at: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741525

Smooth out-of-core performance
20
More at: http://flink.apache.org/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
Single-core join of 1KB Java objects beyond memory (4 GB)
Blue bars are in-memory, orange bars (partially) out-of-core

Benefits of managed memory
 More reliable and stable performance (less GC
effects, easy to go to disk)
21

Table API
22
val customers = envreadCsvFile(…).as('id, 'mktSegment)
.filter( 'mktSegment === "AUTOMOBILE" )
val orders = env.readCsvFile(…)
.filter( o => dateFormat.parse(o.orderDate).before(date) )
.as('orderId, 'custId, 'orderDate, 'shipPrio)
val items = orders
.join(customers).where('custId === 'id)
.join(lineitems).where('orderId === 'id)
.select('orderId,'orderDate,'shipPrio,
'extdPrice * (Literal(1.0f) - 'discount) as 'revenue)
val result = items
.groupBy('orderId, 'orderDate, 'shipPrio)
.select('orderId, 'revenue.sum, 'orderDate, 'shipPrio)

Iterations in Data Flows
 Machine Learning Algorithms
23

Iterate by looping
 for/while loop in client submits one job per
iteration step
 Data reuse by caching in memory and/or disk
Step Step Step Step Step
Client
24

Iterate in the Dataflow
25
partial
solution
partial
solutionX
other
datasets
Y
initial
solution
iteration
result
Replace
Step function

Large-Scale Machine Learning
26
Factorizing a matrix with
28 billion ratings for
recommendations
(Scale of Netflix
or Spotify)
More at: http://data-artisans.com/computing-recommendations-with-flink.html

State in Iterations
 Graphs and Machine Learning
27

Iterate natively with deltas
28
partial
solution
delta
setX
other
datasets
Y
initial
solution
iteration
result
workset A B workset
Merge deltas
Replace
initial
workset

Effect of delta iterations…
0
5000000
10000000
15000000
20000000
25000000
30000000
35000000
40000000
45000000
1 6 11 16 21 26 31 36 41 46 51 56 61
#ofelementsupdated
iteration

… very fast graph analysis
30
… and mix and match
ETL-style and graph analysis
in one program
Performance competitive
with dedicated graph
analysis systems
More at: http://data-artisans.com/data-analysis-with-flink.html

Flink Roadmap for 2015
 Out-of-core state in Streaming
 Monitoring and scaling for streaming
 Streaming Machine Learning with SAMOA
 More additions to the libraries
• Batch Machine Learning
• Graph library additions (more algorithms)
 SQL on top of expression language
 Master failover
32

Flink community
0
20
40
60
80
100
120
Aug-10 Feb-11 Sep-11 Apr-12 Oct-12 May-13 Nov-13 Jun-14 Dec-14 Jul-15
#unique contributor ids by git commits

Cornerpoints of Flink Design
36
Robust Algorithms on
Managed Memory
Pipelined Execution
of Batch Programs
 Better shuffle performance
 No OutOfMemory Errors
 Scales to very large JVMs
 Efficient an robust processing
Flexible Data
Streaming Engine
 Low Latency Steam Proc.
 Highly flexible windows
Native Iterations
 Very fast Graph Processing
 Stateful Iterations for ML
High-level APIs,
beyond key/value pairs
 Java/Scala/Python (upcoming)
 Relational-style optimizer
 Graphs / Machine Learning
 Streaming ML (coming)
 Scales to very large groups
Active Library Development

A simple program
38
val orders = …
val lineitems = …
val filteredOrders = orders
.filter(o => dataFormat.parse(l.shipDate).after(date))
.filter(o => o.shipPrio > 2)
val lineitemsOfOrders = filteredOrders
.join(lineitems)
.where(“orderId”).equalTo(“orderId”)
.apply((o,l) => new SelectedItem(o.orderDate, l.extdPrice))
val priceSums = lineitemsOfOrders
.groupBy(“orderDate”).sum(“l.extdPrice”);

Two execution plans
39
DataSource
orders.tbl
Filter
Map DataSource
lineitem.tbl
Join
Hybrid Hash
buildHT probe
broadcast forward
Combine
GroupRed
sort
DataSource
orders.tbl
Filter
Map DataSource
lineitem.tbl
Join
Hybrid Hash
buildHT probe
hash-part [0] hash-part [0]
hash-part [0,1]
GroupRed
sort
forwardBest plan
depends on
relative sizes
of input files

Examples of optimization
 Task chaining
• Coalesce map/filter/etc tasks
 Join optimizations
• Broadcast/partition, build/probe side, hash or sort-
merge
 Interesting properties
• Re-use partitioning and sorting for later operations
 Automatic caching
• E.g., for iterations
40

Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pre-Hadoop Summit Meetups)

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pre-Hadoop Summit Meetups)

Similar to Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pre-Hadoop Summit Meetups) (20)

More from Stephan Ewen

More from Stephan Ewen (7)

Recently uploaded

Recently uploaded (20)

Apache Flink - Overview and Use cases of a Distributed Dataflow System (at pre-Hadoop Summit Meetups)

Editor's Notes