SlideShare ist ein Scribd-Unternehmen logo
1 von 48
Tzu-Li (Gordon) Tai
Flink Committer / PMC Member
tzulitai@apache.org
@tzulitai
Managing State in Apache Flink
Original creators of Apache
Flink®
Providers of
dA Platform 2, including
open source Apache Flink +
dA Application Manager
Users are placing more and more
of their most valuable assets within
Flink: their application data
@
Complete social network implemented
using event sourcing and
CQRS (Command Query Responsibility Segregation)
Classic tiered architecture Streaming architecture
database
layer
compute
layer
compute
+
stream storage
and
snapshot storage
(backup)
application
state
Streams and Snapshots
6
Version control
Exactly-once guarantees Job reconfiguration
Asynchronous, pipelined
State
Snapshots
… want to ensure that users can
fully entrust Flink with their state,
as well as good quality of life with
state management
State management FAQs ...
● State declaration best practices?
● How is my state serialized and persisted?
● Can I adapt how my state is serialized?
● Can I adapt the schema / data model of
my state?
8
Recap: Flink’s Managed State
9
Flink Managed State
10
...
...
Your
Code
Your
Code
ValueStateDescriptor<MyPojo> desc =
new ValueStateDescriptor<>(
“my-value-state”,
MyPojo.class
);
ValueState<MyPojo> state =
getRuntimeContext().getState(desc);
MyPojo p = state.value();
...
state.update(...)
...
Flink Managed State
11
...
...
Your
Code
Your
Code
ValueStateDescriptor<MyPojo> desc =
new ValueStateDescriptor<>(
“my-value-state”,
MyPojo.class
);
ValueState<MyPojo> state =
getRuntimeContext().getState(desc);
MyPojo p = state.value();
...
state.update(...)
...
unique identifier
for this managed
state
Flink Managed State
12
...
...
Your
Code
Your
Code
ValueStateDescriptor<MyPojo> desc =
new ValueStateDescriptor<>(
“my-value-state”,
MyPojo.class
);
ValueState<MyPojo> state =
getRuntimeContext().getState(desc);
MyPojo p = state.value();
...
state.update(...)
...
type information
for the state
Flink Managed State
13
...
...
Your
Code
Your
Code
ValueStateDescriptor<MyPojo> desc =
new ValueStateDescriptor<>(
“my-value-state”,
MyPojo.class
);
ValueState<MyPojo> state =
getRuntimeContext().getState(desc);
MyPojo p = state.value();
...
state.update(...)
...
Local
State Backend
registers
state
Flink Managed State
14
...
...
Your
Code
Your
Code
ValueStateDescriptor<MyPojo> desc =
new ValueStateDescriptor<>(
“my-value-state”,
MyPojo.class
);
ValueState<MyPojo> state =
getRuntimeContext().getState(desc);
MyPojo p = state.value();
...
state.update(...)
...
Local
State Backend
read
write
Flink Managed State (II)
15
State
State
State
State
State
State
State
DFS
Checkpoints
- Pipelined
checkpoint barriers
flow through the
topology
- Operators
asynchronously
backup their state in
checkpoints on a
DFS
checkpointed state
Flink Managed State (II)
16
State
State
State
State
State
State
State
DFS
Restore
- Each operator is
assigned their
corresponding
file handles, from
which they
restore their state
checkpointed state
State Serialization Behaviours
17
State
State
State
State
State
State
State
DFS
JVM Heap backed
state backends
(MemoryStateBackend,
FsStateBackend)
⇒ lazy serialization +
eager deserialization
checkpointed state
State Serialization Behaviours
18
State
State
State
State
State
State
State
DFS
JVM Heap backed
state backends
(MemoryStateBackend,
FsStateBackend)
⇒ lazy serialization +
eager deserialization
Java object read / writes
checkpointed state
State Serialization Behaviours
19
State
State
State
State
State
State
State
DFS
JVM Heap backed
state backends
(MemoryStateBackend,
FsStateBackend)
⇒ lazy serialization +
eager deserialization
Java object read / writes
serialize
on chckpts /
savepoints
Checkpoint
checkpointed state
State Serialization Behaviours
20
State
State
State
State
State
State
State
DFS
JVM Heap backed
state backends
(MemoryStateBackend,
FsStateBackend)
⇒ lazy serialization +
eager deserialization
deserialize
to objects
on restore
Restore
checkpointed state
State Serialization Behaviours
21
State
State
State
State
State
State
State
DFS
Out-of-core state backends
(RocksDBStateBackend)
⇒ eager serialization +
lazy deserialization
checkpointed state
State Serialization Behaviours
22
State
State
State
State
State
State
State
DFS
Out-of-core state backends
(RocksDBStateBackend)
⇒ eager serialization +
lazy deserialization
De-/serialize on every
local read / write
checkpointed state
State Serialization Behaviours
23
State
State
State
State
State
State
State
DFS
Out-of-core state backends
(RocksDBStateBackend)
⇒ eager serialization +
lazy deserialization
De-/serialize on every
local read / write
exchange
of state bytes
Checkpoint
checkpointed state
State Serialization Behaviours
24
State
State
State
State
State
State
State
DFS
Out-of-core state backends
(RocksDBStateBackend)
⇒ eager serialization +
lazy deserialization
De-/serialize on every
local read / write
exchange
of state bytes
Restore
checkpointed state
State Declaration
25
Matters of State (I)
State Access Optimizations
26
ValueStateDescriptor<Map<String, MyPojo>> desc =
new ValueStateDescriptor<>(“my-value-state”, MyPojo.class);
ValueState<Map<String, MyPojo>> state =
getRuntimeContext().getState(desc);
● Flink supports different state “structures” for
efficient state access for various patterns
Map<String, MyPojo> map = state.value();
map.put(“someKey”, new MyPojo(...));
state.update(map);
X don’t do
State Access Optimizations
27
MapStateDescriptor<String, MyPojo> desc =
new MapStateDescriptor<>(“my-value-state”, String.class, MyPojo.class);
MapState<String, MyPojo> state =
getRuntimeContext().getMapState(desc);
● Flink supports different state “structures” for
efficient state access for various patterns
MyPojo pojoVal = state.get(“someKey”);
state.put(“someKey”, new MyPojo(...));
optimized for
random key access
Declaration Timeliness
28
● Try registering state as soon as possible
• Typically, all state declaration can be done in the “open()”
lifecycle of operators
Declaration Timeliness
29
public class MyStatefulMapFunction extends RichMapFunction<String, String> {
private static ValueStateDescriptor<MyPojo> DESC =
new ValueStateDescriptor<>(“my-pojo-state”, MyPojo.class);
@Override
public String map(String input) {
MyPojo pojo = getRuntimeContext().getState(DESC).value();
…
}
}
Declaration Timeliness
30
public class MyStatefulMapFunction extends RichMapFunction<String, String> {
private static ValueStateDescriptor<MyPojo> DESC =
new ValueStateDescriptor<>(“my-pojo-state”, MyPojo.class);
private ValueState<MyPojo> state;
@Override
public void open(Configuration config) {
state = getRuntimeContext().getState(DESC);
}
@Override
public String map(String input) {
MyPojo pojo = state.value();
…
}
}
Eager State Declaration
31
● Under discussion with FLIP-22
● Allows the JobManager to have knowledge on
declared states of a job
(under discussion,
release TBD)
Eager State Declaration
32
(under discussion,
release TBD)
public class MyStatefulMapFunction extends RichMapFunction<String, String> {
@KeyedState(
stateId = “my-pojo-state”,
queryableStateName = “state-query-handle”
)
private MapState<String, MyPojo> state;
@Override
public String map(String input) {
MyPojo pojo = state.get(“someKey”);
state.put(“someKey”, new MyPojo(...));
}
}
State Serialization
33
Matters of State (II)
How is my state serialized?
34
ValueStateDescriptor<MyPojo> desc =
new ValueStateDescriptor<>(“my-value-state”, MyPojo.class);
● Provided state type class info is analyzed by Flink’s own serialization
stack, producing a tailored, efficient serializer
● Supported types:
• Primitive types
• Tuples, Scala Case Classes
• POJOs (plain old Java objects)
● All non-supported types fallback to using Kryo for serialization
Avoid Kryo for State Serde
35
● Kryo is generally not recommended for use on persisted data
• Unstable binary formats
• Unfriendly for data model changes to the state
● Serialization frameworks with schema evolution support is
recommended: Avro, Thrift, etc.
● Register custom Kryo serializers for your Flink job that uses these
frameworks
Avoid Kryo for State Serde
36
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// register the serializer included with
// Apache Thrift as the standard serializer for your type
env.registerTypeWithKryoSerializer(MyCustomType, TBaseSerializer.class);
● Also see: https://goo.gl/oU9NxJ
Custom State Serialization
37
● Instead of supplying type information to be analyzed, directly
provide a TypeSerializer
public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> {
...
}
ValueStateDescriptor<MyCustomType> desc =
new ValueStateDescriptor<>(“my-value-state”, new MyCustomTypeSerializer());
State Migration
38
Matters of State (III)
State migration / evolution
39
● Upgrading to more efficient serialization schemes for performance
improvements
● Changing the schema / data model of state types, due to evolving
business logic
Upgrading State Serializers
40
● The new upgraded serializer would either be compatible, or not
compatible. If incompatible, state migration is required.
● Disclaimer: as of Flink 1.3, upgraded serializers must be compatible, as
state migration is not yet an available feature.
ValueStateDescriptor<MyCustomType> desc =
new ValueStateDescriptor<>(“my-value-state”, new UpgradedSerializer<MyCustomType>());
ValueStateDescriptor<MyPojo> desc =
new ValueStateDescriptor<>(“my-value-state”, MyPojo.class); // modified MyPojo type
Case #2: New custom serializer
Case #1: Modified state types, resulting in different Flink-generated serializers
Upgrading State Serializers (II)
41
● On restore, new serializers are checked against the metadata of the
previous serializer (stored together with state data in savepoints) for
compatibility.
● All Flink-generated serializers define the metadata to be persisted. Newly
generated serializers at restore time are reconfigured to be compatible.Savepoint
state
data
serializer
config
metadataField “A”
Field “B”
Old serializer at time
of savepoint
Field “B”
Field “A”
New serializer at
restore time
Upgrading State Serializers (II)
42
● On restore, new serializers are checked against the metadata of the
previous serializer (stored together with state data in savepoints) for
compatibility.
● All Flink-generated serializers define the metadata to be persisted. Newly
generated serializers at restore time are reconfigured to be compatible.Savepoint
state
data
serializer
config
metadataField “A”
Field “B”
Old serializer at time
of savepoint
Field “B”
Field “A”
New serializer at
restore time
reconfigure
Upgrading State Serializers (II)
43
● On restore, new serializers are checked against the metadata of the
previous serializer (stored together with state data in savepoints) for
compatibility.
● All Flink-generated serializers define the metadata to be persisted. Newly
generated serializers at restore time are reconfigured to be compatible.Savepoint
state
data
serializer
config
metadataField “A”
Field “B”
Old serializer at time
of savepoint
Field “A”
Field “B”
New serializer at
restore time
Upgrading State Serializers (III)
44
● Custom serializers must define the metadata to write:
TypeSerializerConfigSnapshot
● Also define logic for compatibility checks
public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> {
...
TypeSerializerConfigSnapshot snapshotConfiguration();
CompatibilityResult<MyCustomType> ensureCompatibility(
TypeSerializerConfigSnapshot configSnapshot);
}
Closing
45
TL;DR
46
● The Flink community takes state management in Flink very seriously
● We try to make sure that users will feel comfortable in placing their
application state within Flink and avoid any kind of lock-in.
● Upcoming changes / features to expect related to state management:
Eager State Declaration & State Migration
4
Thank you!
@tzulitai
@ApacheFlink
@dataArtisans
We are hiring!
data-artisans.com/careers

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlJiangjie Qin
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiFlink Forward
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxFlink Forward
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentFlink Forward
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...Flink Forward
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in FlinkMaxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in FlinkFlink Forward
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistentconfluent
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Storesconfluent
 
XStream: stream processing platform at facebook
XStream:  stream processing platform at facebookXStream:  stream processing platform at facebook
XStream: stream processing platform at facebookAniket Mokashi
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberFlink Forward
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberXiang Fu
 
Kafka High Availability in multi data center setup with floating Observers wi...
Kafka High Availability in multi data center setup with floating Observers wi...Kafka High Availability in multi data center setup with floating Observers wi...
Kafka High Availability in multi data center setup with floating Observers wi...HostedbyConfluent
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 

Was ist angesagt? (20)

Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in FlinkMaxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistent
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
 
XStream: stream processing platform at facebook
XStream:  stream processing platform at facebookXStream:  stream processing platform at facebook
XStream: stream processing platform at facebook
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ Uber
 
Kafka High Availability in multi data center setup with floating Observers wi...
Kafka High Availability in multi data center setup with floating Observers wi...Kafka High Availability in multi data center setup with floating Observers wi...
Kafka High Availability in multi data center setup with floating Observers wi...
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 

Ähnlich wie Flink Forward Berlin 2017: Tzu-Li (Gordon) Tai - Managing State in Apache Flink

Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...Flink Forward
 
State schema evolution for Apache Flink Applications
State schema evolution for Apache Flink ApplicationsState schema evolution for Apache Flink Applications
State schema evolution for Apache Flink ApplicationsTzu-Li (Gordon) Tai
 
Realtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQRealtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQXin Wang
 
Migrating To PostgreSQL
Migrating To PostgreSQLMigrating To PostgreSQL
Migrating To PostgreSQLGrant Fritchey
 
Extending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session ExtensionsExtending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session ExtensionsDatabricks
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...confluent
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevAMD Developer Central
 
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...confluent
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondVasia Kalavri
 
Cdcr apachecon-talk
Cdcr apachecon-talkCdcr apachecon-talk
Cdcr apachecon-talkAmrit Sarkar
 
Akka Microservices Architecture And Design
Akka Microservices Architecture And DesignAkka Microservices Architecture And Design
Akka Microservices Architecture And DesignYaroslav Tkachenko
 
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...Paris Carbone
 
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...Codemotion
 
Building Distributed Systems in Scala
Building Distributed Systems in ScalaBuilding Distributed Systems in Scala
Building Distributed Systems in ScalaAlex Payne
 
Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?Andrzej Ludwikowski
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaData Con LA
 
Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)Aljoscha Krettek
 

Ähnlich wie Flink Forward Berlin 2017: Tzu-Li (Gordon) Tai - Managing State in Apache Flink (20)

Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
Flink Forward SF 2017: Joe Olson - Using Flink and Queryable State to Buffer ...
 
State schema evolution for Apache Flink Applications
State schema evolution for Apache Flink ApplicationsState schema evolution for Apache Flink Applications
State schema evolution for Apache Flink Applications
 
Realtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQRealtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQ
 
Migrating To PostgreSQL
Migrating To PostgreSQLMigrating To PostgreSQL
Migrating To PostgreSQL
 
Extending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session ExtensionsExtending Apache Spark – Beyond Spark Session Extensions
Extending Apache Spark – Beyond Spark Session Extensions
 
Spark streaming
Spark streamingSpark streaming
Spark streaming
 
Streamsets and spark
Streamsets and sparkStreamsets and spark
Streamsets and spark
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan Nevraev
 
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the La...
 
From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyond
 
Cdcr apachecon-talk
Cdcr apachecon-talkCdcr apachecon-talk
Cdcr apachecon-talk
 
Akka Microservices Architecture And Design
Akka Microservices Architecture And DesignAkka Microservices Architecture And Design
Akka Microservices Architecture And Design
 
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
State Management in Apache Flink : Consistent Stateful Distributed Stream Pro...
 
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
Andrzej Ludwikowski - Event Sourcing - what could possibly go wrong? - Codemo...
 
Building Distributed Systems in Scala
Building Distributed Systems in ScalaBuilding Distributed Systems in Scala
Building Distributed Systems in Scala
 
Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?Event Sourcing - what could possibly go wrong?
Event Sourcing - what could possibly go wrong?
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
 
Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)
 

Mehr von Flink Forward

Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkFlink Forward
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink Forward
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022Flink Forward
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink Forward
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesFlink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Welcome to the Flink Community!
Welcome to the Flink Community!Welcome to the Flink Community!
Welcome to the Flink Community!Flink Forward
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsFlink Forward
 
Extending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesExtending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesFlink Forward
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleFlink Forward
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitFlink Forward
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkFlink Forward
 
Large Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior DetectionLarge Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior DetectionFlink Forward
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Flink Forward
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeFlink Forward
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Flink Forward
 

Mehr von Flink Forward (20)

Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easy
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Welcome to the Flink Community!
Welcome to the Flink Community!Welcome to the Flink Community!
Welcome to the Flink Community!
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
 
Extending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesExtending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use cases
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scale
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and Profit
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache Flink
 
Large Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior DetectionLarge Scale Real Time Fraudulent Web Behavior Detection
Large Scale Real Time Fraudulent Web Behavior Detection
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!
 

Kürzlich hochgeladen

Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 

Kürzlich hochgeladen (20)

Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 

Flink Forward Berlin 2017: Tzu-Li (Gordon) Tai - Managing State in Apache Flink

  • 1. Tzu-Li (Gordon) Tai Flink Committer / PMC Member tzulitai@apache.org @tzulitai Managing State in Apache Flink
  • 2. Original creators of Apache Flink® Providers of dA Platform 2, including open source Apache Flink + dA Application Manager
  • 3. Users are placing more and more of their most valuable assets within Flink: their application data
  • 4. @ Complete social network implemented using event sourcing and CQRS (Command Query Responsibility Segregation)
  • 5. Classic tiered architecture Streaming architecture database layer compute layer compute + stream storage and snapshot storage (backup) application state
  • 6. Streams and Snapshots 6 Version control Exactly-once guarantees Job reconfiguration Asynchronous, pipelined State Snapshots
  • 7. … want to ensure that users can fully entrust Flink with their state, as well as good quality of life with state management
  • 8. State management FAQs ... ● State declaration best practices? ● How is my state serialized and persisted? ● Can I adapt how my state is serialized? ● Can I adapt the schema / data model of my state? 8
  • 10. Flink Managed State 10 ... ... Your Code Your Code ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>( “my-value-state”, MyPojo.class ); ValueState<MyPojo> state = getRuntimeContext().getState(desc); MyPojo p = state.value(); ... state.update(...) ...
  • 11. Flink Managed State 11 ... ... Your Code Your Code ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>( “my-value-state”, MyPojo.class ); ValueState<MyPojo> state = getRuntimeContext().getState(desc); MyPojo p = state.value(); ... state.update(...) ... unique identifier for this managed state
  • 12. Flink Managed State 12 ... ... Your Code Your Code ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>( “my-value-state”, MyPojo.class ); ValueState<MyPojo> state = getRuntimeContext().getState(desc); MyPojo p = state.value(); ... state.update(...) ... type information for the state
  • 13. Flink Managed State 13 ... ... Your Code Your Code ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>( “my-value-state”, MyPojo.class ); ValueState<MyPojo> state = getRuntimeContext().getState(desc); MyPojo p = state.value(); ... state.update(...) ... Local State Backend registers state
  • 14. Flink Managed State 14 ... ... Your Code Your Code ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>( “my-value-state”, MyPojo.class ); ValueState<MyPojo> state = getRuntimeContext().getState(desc); MyPojo p = state.value(); ... state.update(...) ... Local State Backend read write
  • 15. Flink Managed State (II) 15 State State State State State State State DFS Checkpoints - Pipelined checkpoint barriers flow through the topology - Operators asynchronously backup their state in checkpoints on a DFS checkpointed state
  • 16. Flink Managed State (II) 16 State State State State State State State DFS Restore - Each operator is assigned their corresponding file handles, from which they restore their state checkpointed state
  • 17. State Serialization Behaviours 17 State State State State State State State DFS JVM Heap backed state backends (MemoryStateBackend, FsStateBackend) ⇒ lazy serialization + eager deserialization checkpointed state
  • 18. State Serialization Behaviours 18 State State State State State State State DFS JVM Heap backed state backends (MemoryStateBackend, FsStateBackend) ⇒ lazy serialization + eager deserialization Java object read / writes checkpointed state
  • 19. State Serialization Behaviours 19 State State State State State State State DFS JVM Heap backed state backends (MemoryStateBackend, FsStateBackend) ⇒ lazy serialization + eager deserialization Java object read / writes serialize on chckpts / savepoints Checkpoint checkpointed state
  • 20. State Serialization Behaviours 20 State State State State State State State DFS JVM Heap backed state backends (MemoryStateBackend, FsStateBackend) ⇒ lazy serialization + eager deserialization deserialize to objects on restore Restore checkpointed state
  • 21. State Serialization Behaviours 21 State State State State State State State DFS Out-of-core state backends (RocksDBStateBackend) ⇒ eager serialization + lazy deserialization checkpointed state
  • 22. State Serialization Behaviours 22 State State State State State State State DFS Out-of-core state backends (RocksDBStateBackend) ⇒ eager serialization + lazy deserialization De-/serialize on every local read / write checkpointed state
  • 23. State Serialization Behaviours 23 State State State State State State State DFS Out-of-core state backends (RocksDBStateBackend) ⇒ eager serialization + lazy deserialization De-/serialize on every local read / write exchange of state bytes Checkpoint checkpointed state
  • 24. State Serialization Behaviours 24 State State State State State State State DFS Out-of-core state backends (RocksDBStateBackend) ⇒ eager serialization + lazy deserialization De-/serialize on every local read / write exchange of state bytes Restore checkpointed state
  • 26. State Access Optimizations 26 ValueStateDescriptor<Map<String, MyPojo>> desc = new ValueStateDescriptor<>(“my-value-state”, MyPojo.class); ValueState<Map<String, MyPojo>> state = getRuntimeContext().getState(desc); ● Flink supports different state “structures” for efficient state access for various patterns Map<String, MyPojo> map = state.value(); map.put(“someKey”, new MyPojo(...)); state.update(map); X don’t do
  • 27. State Access Optimizations 27 MapStateDescriptor<String, MyPojo> desc = new MapStateDescriptor<>(“my-value-state”, String.class, MyPojo.class); MapState<String, MyPojo> state = getRuntimeContext().getMapState(desc); ● Flink supports different state “structures” for efficient state access for various patterns MyPojo pojoVal = state.get(“someKey”); state.put(“someKey”, new MyPojo(...)); optimized for random key access
  • 28. Declaration Timeliness 28 ● Try registering state as soon as possible • Typically, all state declaration can be done in the “open()” lifecycle of operators
  • 29. Declaration Timeliness 29 public class MyStatefulMapFunction extends RichMapFunction<String, String> { private static ValueStateDescriptor<MyPojo> DESC = new ValueStateDescriptor<>(“my-pojo-state”, MyPojo.class); @Override public String map(String input) { MyPojo pojo = getRuntimeContext().getState(DESC).value(); … } }
  • 30. Declaration Timeliness 30 public class MyStatefulMapFunction extends RichMapFunction<String, String> { private static ValueStateDescriptor<MyPojo> DESC = new ValueStateDescriptor<>(“my-pojo-state”, MyPojo.class); private ValueState<MyPojo> state; @Override public void open(Configuration config) { state = getRuntimeContext().getState(DESC); } @Override public String map(String input) { MyPojo pojo = state.value(); … } }
  • 31. Eager State Declaration 31 ● Under discussion with FLIP-22 ● Allows the JobManager to have knowledge on declared states of a job (under discussion, release TBD)
  • 32. Eager State Declaration 32 (under discussion, release TBD) public class MyStatefulMapFunction extends RichMapFunction<String, String> { @KeyedState( stateId = “my-pojo-state”, queryableStateName = “state-query-handle” ) private MapState<String, MyPojo> state; @Override public String map(String input) { MyPojo pojo = state.get(“someKey”); state.put(“someKey”, new MyPojo(...)); } }
  • 34. How is my state serialized? 34 ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>(“my-value-state”, MyPojo.class); ● Provided state type class info is analyzed by Flink’s own serialization stack, producing a tailored, efficient serializer ● Supported types: • Primitive types • Tuples, Scala Case Classes • POJOs (plain old Java objects) ● All non-supported types fallback to using Kryo for serialization
  • 35. Avoid Kryo for State Serde 35 ● Kryo is generally not recommended for use on persisted data • Unstable binary formats • Unfriendly for data model changes to the state ● Serialization frameworks with schema evolution support is recommended: Avro, Thrift, etc. ● Register custom Kryo serializers for your Flink job that uses these frameworks
  • 36. Avoid Kryo for State Serde 36 StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); // register the serializer included with // Apache Thrift as the standard serializer for your type env.registerTypeWithKryoSerializer(MyCustomType, TBaseSerializer.class); ● Also see: https://goo.gl/oU9NxJ
  • 37. Custom State Serialization 37 ● Instead of supplying type information to be analyzed, directly provide a TypeSerializer public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> { ... } ValueStateDescriptor<MyCustomType> desc = new ValueStateDescriptor<>(“my-value-state”, new MyCustomTypeSerializer());
  • 39. State migration / evolution 39 ● Upgrading to more efficient serialization schemes for performance improvements ● Changing the schema / data model of state types, due to evolving business logic
  • 40. Upgrading State Serializers 40 ● The new upgraded serializer would either be compatible, or not compatible. If incompatible, state migration is required. ● Disclaimer: as of Flink 1.3, upgraded serializers must be compatible, as state migration is not yet an available feature. ValueStateDescriptor<MyCustomType> desc = new ValueStateDescriptor<>(“my-value-state”, new UpgradedSerializer<MyCustomType>()); ValueStateDescriptor<MyPojo> desc = new ValueStateDescriptor<>(“my-value-state”, MyPojo.class); // modified MyPojo type Case #2: New custom serializer Case #1: Modified state types, resulting in different Flink-generated serializers
  • 41. Upgrading State Serializers (II) 41 ● On restore, new serializers are checked against the metadata of the previous serializer (stored together with state data in savepoints) for compatibility. ● All Flink-generated serializers define the metadata to be persisted. Newly generated serializers at restore time are reconfigured to be compatible.Savepoint state data serializer config metadataField “A” Field “B” Old serializer at time of savepoint Field “B” Field “A” New serializer at restore time
  • 42. Upgrading State Serializers (II) 42 ● On restore, new serializers are checked against the metadata of the previous serializer (stored together with state data in savepoints) for compatibility. ● All Flink-generated serializers define the metadata to be persisted. Newly generated serializers at restore time are reconfigured to be compatible.Savepoint state data serializer config metadataField “A” Field “B” Old serializer at time of savepoint Field “B” Field “A” New serializer at restore time reconfigure
  • 43. Upgrading State Serializers (II) 43 ● On restore, new serializers are checked against the metadata of the previous serializer (stored together with state data in savepoints) for compatibility. ● All Flink-generated serializers define the metadata to be persisted. Newly generated serializers at restore time are reconfigured to be compatible.Savepoint state data serializer config metadataField “A” Field “B” Old serializer at time of savepoint Field “A” Field “B” New serializer at restore time
  • 44. Upgrading State Serializers (III) 44 ● Custom serializers must define the metadata to write: TypeSerializerConfigSnapshot ● Also define logic for compatibility checks public class MyCustomTypeSerializer extends TypeSerializer<MyCustomType> { ... TypeSerializerConfigSnapshot snapshotConfiguration(); CompatibilityResult<MyCustomType> ensureCompatibility( TypeSerializerConfigSnapshot configSnapshot); }
  • 46. TL;DR 46 ● The Flink community takes state management in Flink very seriously ● We try to make sure that users will feel comfortable in placing their application state within Flink and avoid any kind of lock-in. ● Upcoming changes / features to expect related to state management: Eager State Declaration & State Migration