SK telecom shares our experience of using Flink to build a solution for Predictive Maintenance (PdM). Our PdM solution, named metatron PdM, consists of (1) a Deep Neural Network (DNN)-based prediction model for precise prediction, and (2) a Flink-based runtime system that applies the model to a sliding window on sensor data streams. Efficient handling of multi-sensor streaming data for real-time prediction of equipment condition is a critical component of our product. In this talk, we first show why we chose Flink as the core engine for our streaming use case, in which we generate real-time predictions using DNNs trained with Keras on top of TensorFlow and Theano. In addition, we present a comparative study of methods for using learned models on the JVM: directly calling Python libraries through CPython embedded in the JVM, using the TensorFlow Java API (including Flink TensorFlow), and making RPC calls to TensorFlow Serving. We then explain how we implement the runtime system using the Flink DataStream API, especially with event time, various window mechanisms, timestamps and watermarks, custom sources and sinks, and checkpointing. Lastly, we present how we use the official Flink Docker image for solution delivery and the Flink metric system for monitoring and management of our solution. We hope our use case sets a good example of building a DNN-based streaming solution using Flink.
2. About me (@eastcirclek)
• Big Data processing engines
  • MapReduce, Tez, Spark, Flink, Hive, Presto
• Recent interests (covered in this talk)
  • Deep learning model serving (TensorFlow Serving)
  • Containerization (Docker, Kubernetes)
  • Time-series data (InfluxDB, Prometheus, Grafana)
• Flink Forward 2015 Berlin
  • A Comparative Performance Evaluation of Flink
3. Refinery and semiconductor companies depend on equipment
An equipment breakdown significantly affects company profit
4. Equipment maintenance to minimize breakdowns
• Planned Maintenance: shut down equipment on a regular basis for parts replacement, cleaning, adjustments, etc.
• Predictive Maintenance (PdM): unplanned maintenance based on prediction of equipment condition
[diagram: a timeline from 2015.07 to 2017.07 showing planned maintenance, predictive maintenance, and breakdown events; equipment sensors feed a predictive maintenance system]
5. Our approach to Predictive Maintenance
Better prediction of equipment condition using Deep Learning
[diagram: two timelines of breakdowns, planned maintenance, and PdM events; one with Machine Learning & Statistics alone, one with Machine Learning & Statistics + Deep Learning]
6. Contents
1. Why we use Flink for our time-series prediction model
2. Flink pipeline design for rendezvous and DNN ensemble
3. Solution packaging and monitoring with Docker and Prometheus
[architecture diagram: the time-series prediction model running on Flink, with TF Serving, MySQL, InfluxDB, Prometheus, and Grafana]
7. My team consists of two groups
• Model developers: train a DNN model on history data
• Data engineers: apply the DNN model to live data for prediction & alarm
* The diagram is based on Ted Dunning’s slides (Flink Forward 2017 SF)
8. Model developers give a Convolutional LSTM to engineers
• Input (X): a 1-week time-series (10080 records, assuming a one-minute sampling rate)
• Output (Ŷ): expected sensor values after 2 days
• Note from the model developer: the model does not return whether the equipment breaks down after 2 days; it predicts sensor values
[diagram: convolutional layers (weights W1, W2) feeding an RNN/LSTM (weights W3, U) with output layer O producing Ŷ]
9. Data engineers apply the Convolutional LSTM to live sensor data
We have multi-sensor data: each piece of equipment produces a vector of sensor measurements [y1, y2, y3, ..., ym].
10. Data engineers apply the Convolutional LSTM to live sensor data
Multi-sensor data arrive at a fixed interval of 1 minute.
[diagram: the equipment's measurement vectors placed on a timeline]
11. Data engineers apply the Convolutional LSTM to live sensor data
We maintain a count window over the latest 10080 records (X); it slides as each new record arrives (a sliding count window).
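The sliding count window described above can be sketched in a few lines (an illustrative Python sketch, not the talk's actual Flink code; the window size of 3 here stands in for the real 10080):

```python
from collections import deque

class SlidingCountWindow:
    """Keep the latest `size` records; slide by one on each new arrival."""
    def __init__(self, size):
        self.size = size
        self.buffer = deque(maxlen=size)  # the oldest record is dropped automatically

    def add(self, record):
        self.buffer.append(record)
        if len(self.buffer) == self.size:  # emit X only once the window is full
            return list(self.buffer)       # X: the input to the DNN
        return None

window = SlidingCountWindow(3)             # 10080 in the real pipeline
assert window.add(1) is None               # window not yet full
assert window.add(2) is None
assert window.add(3) == [1, 2, 3]          # full window emitted
assert window.add(4) == [2, 3, 4]          # slid by one record
```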
12. Data engineers apply the Convolutional LSTM to live sensor data
Whenever the sliding window moves, we apply the Convolutional LSTM to X (10080 records). Given the one-week time-series X, the DNN returns predicted sensor values for 2 days later (Ŷ). We raise an alarm if the distance between the predicted vector Ŷ and the measured vector Y is above a defined threshold.
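The alarm condition can be illustrated as follows (a minimal sketch; the slides do not specify the distance metric, so Euclidean distance and the threshold values are assumptions):

```python
import math

def should_alarm(y, y_hat, threshold):
    """Alarm when the distance between the measured vector Y
    and the predicted vector Ŷ exceeds the threshold."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)))
    return distance > threshold

assert should_alarm([0.0, 0.0], [3.0, 4.0], threshold=1.0)      # distance = 5.0
assert not should_alarm([1.0, 1.0], [1.1, 1.0], threshold=1.0)  # distance = 0.1
```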
13. Desired streaming topology by data engineers
Stream source → outlier filter → sliding count window → apply DNN to X → prediction stream (Ŷ) → join with the measurement stream (Y) → score → sink
• Requirement 1: a count window that maintains 10080 records instead of a 1-week event-time window
• Requirement 2: joining of two streams on event time (rendezvous)
14. Proof-of-Concept with Spark Structured Streaming
• Generate an input Streaming Dataset from local files
• Apply the DNN over a 1-week event-time window rather than 10080 records, because a sliding count window is not supported, to get a prediction Streaming Dataset
• Join the two streams to get a score Streaming Dataset
15. Inner join between two streaming Datasets is not supported in Spark Structured Streaming
16. Unsupported operations in Spark Structured Streaming (v2.2)
• Requirement 1: a count window that maintains 10080 records instead of a 1-week event-time window
• Requirement 2: joining of two streams on event time (rendezvous)
17. That’s why we moved to the Flink DataStream API
Spark Structured Streaming
• Sliding count window: not supported
• Joining of two streams: not supported
• Micro-batch behind the scenes
• Continuous processing proposed in SPARK-20928
Flink DataStream API
• Sliding count window: supported
• Joining of two streams: supported
• Scalability and performance proven by other use cases
* It could be possible to use our Convolutional LSTM model with Spark Structured Streaming in some other way
18. Data processing pipeline with the Flink DataStream API
Measurement stream: addSource → process → countWindowAll (custom evictor) → apply (DNN: X → Ŷ)
Prediction stream: assignTimestampsAndWatermarks (timestamps shifted +2 days, watermark W(t) at t) → window → join with the measurement stream → apply
Sinks: Input Sink, Outlier Sink, Prediction Sink, Score Sink
19. Flink can faithfully implement our streaming topology design
<Topology design> stream source → outlier filter → sliding count window → apply DNN to X → Ŷ → join → score → sink
<Flink implementation> addSource → process → countWindowAll (custom evictor) → apply → assignTimestampsAndWatermarks → window → join → apply
20. Contents: part 2, Flink pipeline design for rendezvous and DNN ensemble
22. Stateful custom source to read from MySQL
• We assume that sensor data arrive in order
• Emit an input record and a watermark of the same time (e.g., the record at 11:16 followed by W(11:16))
• Increase lastTimestamp afterward (11:15 → 11:16)
• Exactly-once semantics
  • Store lastTimestamp when taking a snapshot
  • Restore lastTimestamp when restarted
Query over a JDBC connection:
SELECT timestamp, measured FROM input WHERE timestamp > $lastTimestamp
[diagram: the input table in MySQL holds (timestamp, measured) rows such as (2017-9-13 11:16, [y1, y2, ..., ym]); addSource keeps lastTimestamp as state]
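The source's bookkeeping can be sketched like this (illustrative Python using integer timestamps; `MySQLLikeSource` and `poll_new_rows` are hypothetical names, and the real implementation is a Flink SourceFunction over JDBC):

```python
def poll_new_rows(table, last_timestamp):
    """Return rows strictly newer than last_timestamp, in timestamp order.
    Stands in for: SELECT timestamp, measured FROM input
                   WHERE timestamp > $lastTimestamp"""
    return sorted((ts, v) for ts, v in table if ts > last_timestamp)

class MySQLLikeSource:
    def __init__(self):
        self.last_timestamp = 0                 # the operator state

    def run_once(self, table):
        emitted = []
        for ts, measured in poll_new_rows(table, self.last_timestamp):
            emitted.append(("record", ts, measured))
            emitted.append(("watermark", ts))   # watermark of the same time
            self.last_timestamp = ts            # advance state afterward
        return emitted

    def snapshot(self):
        return self.last_timestamp              # stored when taking a snapshot

    def restore(self, state):
        self.last_timestamp = state             # restored when restarted

src = MySQLLikeSource()
table = [(1, "a"), (2, "b")]
out = src.run_once(table)
assert out == [("record", 1, "a"), ("watermark", 1),
               ("record", 2, "b"), ("watermark", 2)]
assert src.snapshot() == 2
assert src.run_once(table) == []   # nothing new: no duplicates across polls
```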
23. Data processing pipeline with the Flink DataStream API
• process: filter out outliers; emit outliers to a side output
• countWindowAll (custom evictor): maintain the last N elements
An event-time window of 1 week cannot guarantee 10080 records, as data can be missing or filtered out; therefore we define a custom evictor.
24. What if data is absent or filtered out for a long period of time?
With 3 missing days, the windows before and after the gap look totally different! We’d better start a new sliding window for the time-series.
25. How to start a new sliding count window after a long break
• CountEvictor.of(3) evicts elements beyond its capacity
• CustomEvictor.of(3, timeThreshold=4) evicts all but the last element when the last element occurs more than timeThreshold after its predecessor
• CountTrigger.of(1) fires every time a record comes in, and EvictingWindowOperator adds each new input record to InternalListState
Example: records arrive at timestamps 1 to 8, then no records for a while until timestamp 9. A plain sliding count window of size 3 would hold [2 3 4] and then [2 3 4 9] before eviction; since we want to start a new window after missing 4 timestamps, the custom evictor keeps only [9].
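Under the stated assumption that window elements carry (timestamp, value) pairs, the custom evictor's decision rule can be sketched as (illustrative Python; `custom_evict` is a hypothetical name, and the real code implements Flink's Evictor interface):

```python
def custom_evict(elements, size, time_threshold):
    """elements: (timestamp, value) pairs in arrival order.
    If the gap before the newest element exceeds time_threshold,
    start a fresh window: keep only the newest element.
    Otherwise behave like CountEvictor: keep the last `size` elements."""
    if len(elements) >= 2 and elements[-1][0] - elements[-2][0] > time_threshold:
        return elements[-1:]
    return elements[-size:]

# Timestamps 2, 3, 4 arrive normally; then nothing until timestamp 9.
assert custom_evict([(2, 'b'), (3, 'c'), (4, 'd')], 3, 4) \
    == [(2, 'b'), (3, 'c'), (4, 'd')]
assert custom_evict([(2, 'b'), (3, 'c'), (4, 'd'), (9, 'e')], 3, 4) \
    == [(9, 'e')]   # gap of 5 > timeThreshold of 4: start a new window
```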
26. Data processing pipeline with the Flink DataStream API
[pipeline diagram repeated: addSource → process → countWindowAll (custom evictor) → apply → assignTimestampsAndWatermarks (+2 days) → window → join → apply, with Input, Outlier, Prediction, and Score sinks]
27. Working with model developers
Model developers stick to Python; they develop models using a Python library called Keras.
• Data engineer: "I want to develop our solution on the JVM! Why don’t we develop models using Deeplearning4J?"
• Model developer: "I don’t want to use Deeplearning4J because that’s Java… We use Keras on Python!"
28. How to load Keras models in Flink?
• Data engineer: "I don’t know how to do that!"
29. Loading Keras models in the JVM
• Convert Keras models to a TensorFlow SavedModel
  • use tensorflow.python.saved_model.builder.SavedModelBuilder
• TensorFlow Java API (Flink TensorFlow)
  • do inference inside the JVM process
• TensorFlow Serving
  • do inference outside the JVM process
• Execute Keras through CPython inside the JVM
  • do inference inside the JVM process
  • Java Embedded Python (JEP) eases the use of CPython: https://github.com/ninia/jep
• Use KerasModelImport from Deeplearning4J
  • not mature enough
30. Comparison of approaches to using Keras models in the JVM
• TensorFlow Java API (a very thin wrapper over the TensorFlow native library): the Keras model is converted to a SavedModel, and a RichWindowFunction feeds X and gets Ŷ inside the TaskManager process.
• Java Embedded Python (JEP): a JEP Java object drives CPython through JEP native code inside the TaskManager process; the Python commands import keras, load the model & weights, then pass X and get Ŷ.
• TensorFlow Serving: a RichWindowFunction uses a gRPC client to call a separate TFServing process, whose DynamicManager and Loader serve SavedModel versions (v1, v2, ...) converted from the Keras model.
31. Comparison of runtime inference performance
• TensorFlow Java API: 77.7 milliseconds per inference
• TensorFlow Serving: 71.2 milliseconds per inference
• Keras inside CPython w/ TensorFlow backend: 32 milliseconds per inference
(* The Theano backend is extremely slow in our case)
(* We do not batch inference calls)
32. Data processing pipeline with the Flink DataStream API
[pipeline diagram repeated; the join now uses TumblingEventTimeWindows (size = interval)]
33. Joining two streams on event time
• At a certain time t:
  • Y of timestamp t is arriving
  • Ŷ of timestamp t+2d is arriving
  • Ŷ of timestamp t arrived two days ago
• TumblingEventTimeWindows.of(Time.seconds(timeUnit))
  • maintains a window for a single pair of Y and Ŷ
  • a window is triggered when watermarks from both streams have arrived
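The rendezvous can be illustrated with dictionaries keyed by minute-granularity timestamps (a sketch of the join semantics only; `rendezvous` is a hypothetical name, and the real join runs inside Flink's tumbling event-time windows):

```python
TWO_DAYS = 2 * 24 * 60  # minutes, matching the one-minute sampling rate

def rendezvous(measurements, predictions):
    """measurements: {t: Y}; predictions: {t_made: Ŷ}, where a prediction
    made at t_made targets t_made + 2 days. Y at time t therefore joins
    the prediction made at t - 2 days."""
    pairs = {}
    for t, y in measurements.items():
        t_made = t - TWO_DAYS
        if t_made in predictions:
            pairs[t] = (y, predictions[t_made])
    return pairs

m = {5760: [1.0], 5761: [2.0]}
p = {2880: [0.9], 2881: [2.1]}  # made two days (2880 minutes) earlier
assert rendezvous(m, p) == {5760: ([1.0], [0.9]), 5761: ([2.0], [2.1])}
```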
34. Data processing pipeline with the Flink DataStream API
[pipeline diagram repeated]
In the final apply, we raise an alarm if the distance between Ŷ and Y goes beyond a defined threshold.
35. Data processing pipeline with the Flink DataStream API
[complete pipeline diagram: addSource → process → countWindowAll (custom evictor) → apply → assignTimestampsAndWatermarks (+2 days) → window → join → apply, with Input, Outlier, Prediction, and Score sinks]
37. Predicting from a single DNN is not enough!
A prediction from a single DNN is possibly biased relative to the measurement; a prediction from an ensemble of 10 DNNs is more reliable!
38. DNN ensemble for reliable prediction
Different Convolutional LSTMs return slightly different prediction results. Given X (the one-week time-series), we take the mean of the ensemble's predictions (Ŷ) and, 2 days later, raise an alarm if the distance between the mean Ŷ and the measured Y is above a defined threshold.
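The ensemble mean itself is simple element-wise averaging (an illustrative Python sketch):

```python
def ensemble_mean(predictions):
    """Average the Ŷ vectors predicted by N DNNs, element by element."""
    n = len(predictions)
    return [sum(vec[i] for vec in predictions) / n
            for i in range(len(predictions[0]))]

# Three ensemble members predicting a 2-sensor vector:
assert ensemble_mean([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]) == [3.0, 4.0]
```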
39. Data processing pipeline with the Flink DataStream API
How do we implement our ensemble pipeline? The single DNN apply stage is replaced with a DNN ensemble: each member produces its own Ŷ, and the mean of the Ŷs feeds the join.
41. Distribute 10 keys evenly over 10 partitions
Carefully generate keys so that no two keys belong to the same partition:
flatMap (replicate each record 10 times with different keys KEY_0, KEY_1, ..., KEY_9) → keyBy (murmurHash) → PARTITION_0, PARTITION_1, ..., PARTITION_9
PARTITION = murmurHash(KEY) % maxParallelism * parallelism / maxParallelism
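The key search can be sketched as follows (illustrative Python; CRC32 stands in for Flink's MurmurHash, so the actual keys found will differ, but the assignment structure of key to key group to operator index matches Flink's key-group formula above):

```python
import zlib  # CRC32 as a stand-in hash; Flink actually uses MurmurHash

MAX_PARALLELISM = 128  # Flink's default maxParallelism
PARALLELISM = 10

def partition_of(key):
    """Flink-style assignment: key -> key group -> operator index."""
    key_group = zlib.crc32(key.encode()) % MAX_PARALLELISM
    return key_group * PARALLELISM // MAX_PARALLELISM

def pick_keys_one_per_partition(limit=100000):
    """Try KEY_0, KEY_1, ... until every partition 0..PARALLELISM-1 is covered."""
    chosen = {}
    for i in range(limit):
        if len(chosen) == PARALLELISM:
            break
        p = partition_of(f"KEY_{i}")
        chosen.setdefault(p, f"KEY_{i}")  # keep the first key hitting partition p
    return chosen

keys = pick_keys_one_per_partition()
assert sorted(keys) == list(range(PARALLELISM))  # all 10 partitions covered
```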
42. Contents: part 3, Solution packaging and monitoring with Docker and Prometheus
43. A simple software stack on top of Docker
Official images only, all running on the customer machine's Docker engine: Flink, MySQL, Grafana, Prometheus, TensorFlow Serving, and InfluxDB.
No custom Docker image! A single yml file is enough to deploy our software stack.
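A docker-compose file for this stack might look roughly like the following (a hypothetical sketch; the image names and settings are assumptions based on the official images mentioned above, not the actual file used on customer sites):

```yaml
version: "2"
services:
  jobmanager:
    image: flink                  # official Flink image
    command: jobmanager
  taskmanager:
    image: flink
    command: taskmanager
    depends_on: [jobmanager]
  mysql:
    image: mysql                  # input table with sensor measurements
  influxdb:
    image: influxdb               # sink for predictions and scores
  serving:
    image: tensorflow/serving     # DNN inference (if using TFServing)
  prometheus:
    image: prom/prometheus        # scrapes the metrics endpoints
  grafana:
    image: grafana/grafana        # dashboards
```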
44. Launch the JobManager & TaskManagers with some changes to the official Flink Docker image repository
You need to build flink-metrics-prometheus-1.4-SNAPSHOT.jar yourself until Flink 1.4 is officially released.
flink-conf.yaml additions:
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
jobmanager.heap.mb: 10240
taskmanager.heap.mb: 10240
45. Solution deployment and monitoring
* Every process runs inside a Docker container.
• Deployment: a Flink job is submitted to the JobManager to launch our pipeline on the TaskManagers; the pipeline reads from a MySQL source, requests inference from TensorFlow Serving (* if using TFServing), and writes to an InfluxDB sink.
• Monitoring: Prometheus scrapes the HTTP endpoints of the metrics exporters specified in its configuration:
  • JobManager/TaskManagers: Flink runtime metrics & custom metrics (:9249/metrics)
  • MySQLd Exporter: MySQL metrics (:9104/metrics)
  • Node Exporter: system metrics, i.e., CPU/disk/memory/network of the servers (:9100/metrics)
  • cAdvisor: container metrics (:8080/metrics)
• Grafana serves a sensor time-series dashboard (from InfluxDB) and a solution monitoring dashboard (from Prometheus).
46. Solution monitoring dashboard
* This dashboard is based on ”Docker Dashboard” by Brian Christner.
• Server memory/CPU/filesystem usage (by Node Exporter)
• Container CPU usage (by cAdvisor)
• Inference time from each DNN (custom metrics)
• TaskManager JVM memory usage
• Number of records written to sinks (custom metrics)
47. Recap: contents
1. Why we use Flink for our time-series prediction model
2. Flink pipeline design for rendezvous and DNN ensemble
3. Solution packaging and monitoring with Docker and Prometheus
48. Conclusion
• Flink helps us concentrate on the core logic
  • The DataStream API reads like a natural language for expressing streaming topologies
    • flexible windowing mechanisms (count window and evictor)
    • joining of two streams on event time
  • Thanks to it, we can focus on
    • implementing custom sources/sinks to meet customer requirements
    • interacting with DNN ensembles
• Flink has a nice ecosystem that helps in building a solution
  • Docker
  • Prometheus metric reporter
50. You cannot use TFServing with Flink 1.3.2
• Netty binary incompatibility
  • flink-runtime_2.11:1.3.2 depends on io.netty:4.0.27.Final
  • grpc-netty depends on io.netty:4.1.14.Final
• You could use grpc-okhttp instead of grpc-netty
  • but grpc-okhttp conflicts with another library (the InfluxDB client for Java)
• You can use TFServing with Flink 1.4-SNAPSHOT
  • [FLINK-7013] Add shaded netty dependency
  • io.netty.* was recently relocated to org.apache.flink.shaded.netty4.*
From the “New and noteworthy in 4.1” page of the Netty project: “4.1 contains multiple additions which might not be fully backward-compatible with 4.0.”
51. Looking forward to the official release of v1.4
No more 1.4-SNAPSHOT on the customer site!