2. Agenda
United and the Airline Industry
How Streaming Model Presents Opportunity
Apache Flink
Q & A
3. 2
About United Airlines
1,348 aircraft (779 mainline, 569 regional) with 250+ on order (supply chain)
158M passengers in 2018
(public facing web site, mobile app, time / geospatial based inventory, loyalty program, surveys, ancillary sales)
4,900 daily departures (scheduling, operations, weather, route planning)
355 airports served, in 48 countries (baggage claim, check-ins)
88,000 employees worldwide (scheduling, pay)
Constantly in motion! Future (and past) always changing.
A data scientist / data engineer dream.
Source: https://hub.united.com/corporate-fact-sheet/
4. 3
Business Goals
Improve Customer Experience
- How can we reduce friction when booking a reservation? Maneuvering through an airport?
- How can we deliver a consistent message across all channels? (mobile app, web site, social media etc)
Improve Employee Experience
- How can we keep employees better informed of the current situation so they can relay it to the customers?
- What are we learning from our surveys about what the customer base says is / isn’t working?
Revenue Generation
- What personalized offers can we make to our customers?
- Are our offers competitive with the rest of the industry?
Improve Operational Reliability
- How can we better prepare for weather or other operational interruptions?
- How can we manage the fleet better and ensure spare parts are where they need to be?
6. 5
Use Case – Improve Customer Experience Via Social Media
Social media represents a unique opportunity for any service company
- Connect with customers in a familiar environment.
- Consistent messaging and brand management.
- Build community and advocacy.
- Direct issues to appropriate channels so they can be handled expediently.
7. 6
Use Case – Customer Experience
Can we use social media as a giant issue tracking database?
Obstacles:
- Who am I talking to?
- Is there an issue? If so, what is the issue?
- What is the current state of the issue? How did it get there?
- Are there any recommendations on how to handle the issue?
- Who is best equipped to handle this issue?
All of these need to be overcome within a few seconds of receiving a notification…
8. 7
Use Case – Customer Experience
Actions
- Identification (Who am I talking to?)
- Classification, prioritization (Is there an issue? What is it? How important is it?)
- State determination (What is the current state of the issue? How did it get there?)
- Recommendation, clustering (Are there any recommendations on how to handle the issue?)
- Routing (Who is best equipped to handle this issue?)
Conclusion: several enrichments + state lookup
Other needs: low latency, fault tolerance, high availability, elasticity…
10. 9
Stream Processing Engine - Enrichment
Enrichment options:
- Option 1: Data lives in an external database or service and is looked up from a map function
- Option 2: Data arrives as a second stream
Option #1:
(Diagram: Social Media Messages -> Source -> Map (keyBy) -> Map, shown as two parallel pipelines)
11. 10
Stream Processing Engine - Enrichment
Option #1 Issues
- Synchronous requests are slow and prone to error, jamming up the pipeline
- Wasted resources while waiting for the service to respond
What about asynchronous?
- AsyncFunction in DataStream API since Flink 1.2
• A queue of promises
• Emitter on a different thread
- Client needs to support async requests
12. 11
Stream Processing Engine - Enrichment
Async call:
DataStream<Tuple2<String, String>> result =
    AsyncDataStream.unorderedWait(    // or orderedWait, same arguments
        stream,
        new MyAsyncFunction(),        // our AsyncFunction
        1000, TimeUnit.MILLISECONDS,  // timeout
        100);                         // capacity
– our AsyncFunction
– a timeout: max time until considered failed
– capacity: max number of queued up requests
– unorderedWait: emit results in order of completion
– orderedWait: emit results in order of arrival
Timeout: an exception is thrown by default. The handling can be overridden (AsyncFunction#timeout).
Capacity exceeded: back pressure.
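The difference between the two emit orders, and what the capacity limit does, can be pictured with a small plain-Java sketch. This is not Flink code and not how Flink implements the operator: the class name AsyncWaitSketch and its methods are made up for illustration, it is single-threaded, and timeouts and failed requests are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical illustration of ordered vs. unordered emission; not thread-safe. */
final class AsyncWaitSketch {
    private final int capacity;
    private final boolean ordered;
    private final Deque<CompletableFuture<String>> inFlight = new ArrayDeque<>();
    private final List<String> emitted = new ArrayList<>();

    AsyncWaitSketch(int capacity, boolean ordered) {
        this.capacity = capacity;
        this.ordered = ordered;
    }

    /** Returns false when the queue is full; a real operator would apply back pressure instead. */
    boolean submit(CompletableFuture<String> request) {
        if (inFlight.size() >= capacity) {
            return false;
        }
        inFlight.addLast(request);
        request.thenRun(this::emitReady); // runs once the request completes normally
        return true;
    }

    private void emitReady() {
        if (ordered) {
            // Only the head of the queue may leave, so results keep arrival order.
            while (!inFlight.isEmpty() && inFlight.peekFirst().isDone()) {
                emitted.add(inFlight.pollFirst().join());
            }
        } else {
            // Any finished request may leave, so results follow completion order.
            Iterator<CompletableFuture<String>> it = inFlight.iterator();
            while (it.hasNext()) {
                CompletableFuture<String> f = it.next();
                if (f.isDone()) {
                    emitted.add(f.join());
                    it.remove();
                }
            }
        }
    }

    List<String> emitted() {
        return emitted;
    }
}
```

If request B finishes before request A, the unordered variant emits B first, while the ordered variant holds B back until A is done.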
13. 12
Stream Processing Engine - Enrichment
Option #2 - joining streams
(Diagram: Social Media Messages -> Source -> Map (keyBy) and Events -> Source -> Map (keyBy), both feeding a Join)
14. 13
Stream Processing Engine - Joining
Window join
- Only elements within the same window can be joined
• Tumbling window
• Sliding window
• Session window
Interval join
- Joins elements that share a key when the event timestamp of an element in stream B lies within a relative time interval of the event timestamp of an element in stream A
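The interval-join condition can be written out as a plain-Java sketch over two in-memory lists. This is not the Flink API: the names IntervalJoinSketch and Event are hypothetical. The only part taken from Flink is the join condition itself, with both bounds inclusive as in Flink's default.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the interval-join condition on in-memory lists. */
final class IntervalJoinSketch {

    static final class Event {
        final String key;
        final long timestamp; // event time in milliseconds
        final String payload;

        Event(String key, long timestamp, String payload) {
            this.key = key;
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    /**
     * Pairs an element of A with an element of B when the keys match and
     * a.timestamp + lowerBound <= b.timestamp <= a.timestamp + upperBound.
     */
    static List<String> join(List<Event> a, List<Event> b, long lowerBound, long upperBound) {
        List<String> out = new ArrayList<>();
        for (Event left : a) {
            for (Event right : b) {
                boolean sameKey = left.key.equals(right.key);
                boolean inInterval = right.timestamp >= left.timestamp + lowerBound
                        && right.timestamp <= left.timestamp + upperBound;
                if (sameKey && inInterval) {
                    out.add(left.payload + "|" + right.payload);
                }
            }
        }
        return out;
    }
}
```

With bounds of -200 ms and +200 ms, a message at t=1000 joins an event for the same key at t=900 but not one at t=5000. A streaming engine would additionally buffer elements in state and discard them once they can no longer match.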
15. 14
Stream Processing Engine - State
Managing state
- Ability to store and retrieve information about a key.
(Diagram: client-server lookup vs. stateful streaming)
16. 15
Stream Processing Engine - State
State is read and written as key-value pairs on a keyed stream
Several possible back ends, all easily configurable at cluster create time:
- Memory (very small state)
- File on disk
- RocksDB (very large state)
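The key-value access pattern can be sketched in plain Java, with a HashMap standing in for whichever back end is configured. This is an illustration only: the class KeyedStateSketch, the customer IDs and the status values are hypothetical, and nothing here is persisted or fault tolerant.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of per-key state; the HashMap plays the role of the state back end. */
final class KeyedStateSketch {
    // One history list per key, similar in spirit to list state scoped to the current key.
    private final Map<String, List<String>> historyByKey = new HashMap<>();

    /** Records a new status for the key and returns how the issue got to its current state. */
    String update(String customerId, String status) {
        List<String> history = historyByKey.computeIfAbsent(customerId, k -> new ArrayList<>());
        history.add(status);
        return String.join(" -> ", history);
    }
}
```

Each key only ever sees its own history, which is what makes the state lookup for "what is the current state of the issue, and how did it get there" a local operation.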
Keyed Stream
<Key> <Value>
17. 16
Stream Processing Engine - State
Types of state
- ValueState<T> - use this when the state is a single value
- ListState<T> - use this when the state is a list of items
- ReducingState<T> - single value that represents an aggregation of all values added to state
- AggregatingState<IN, OUT> - similar to ReducingState, but the aggregated result type (OUT) may differ from the type of the values added (IN)
- MapState<UK, UV> - mapping. Can use put(UK, UV) or get(UK). Also iterable.
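The difference between the reducing and aggregating flavors is easiest to see side by side. The plain-Java sketch below mimics their semantics only. It does not use Flink's state classes, and the names StateTypesSketch, Reducing and Averaging are hypothetical.

```java
import java.util.function.BinaryOperator;

/** Hypothetical illustration of reducing vs. aggregating semantics; not Flink's state API. */
final class StateTypesSketch {

    /** ReducingState-like: the stored value has the same type as the values added. */
    static final class Reducing<T> {
        private final BinaryOperator<T> reduce;
        private T value; // null until the first add

        Reducing(BinaryOperator<T> reduce) {
            this.reduce = reduce;
        }

        void add(T in) {
            value = (value == null) ? in : reduce.apply(value, in);
        }

        T get() {
            return value;
        }
    }

    /** AggregatingState-like: long values go in, a double average comes out. */
    static final class Averaging {
        private long sum;
        private long count;

        void add(long in) {
            sum += in;
            count++;
        }

        double get() {
            return count == 0 ? 0.0 : (double) sum / count;
        }
    }
}
```

A running maximum fits the reducing shape because input and result are both Long. An average does not, since it needs a sum and a count internally and returns a different type.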
18. 17
Stream Processing Engine – Queryable State
Ability to query state from outside a Flink cluster via an API:
Flink Compute Cluster
Keyed Stream <K>
<V>
19. 18
State - Other Issues
Fault tolerance and high availability: savepoints written to durable storage (HDFS, S3, etc.)
20. 19
Stream Processing - Other Issues
Elasticity:
- Active mode (Flink controls resource allocation) vs. reactive mode (an external entity controls resource allocation).
- FLIP-6
- Idea: cluster manager creates and destroys task managers based on demand.
- Flink Forward San Francisco 2019: "Future of Apache Flink Deployments: Containers, Kubernetes and More" - Till Rohrmann
21. 20
Use Case – Customer Experience
Actions
- Identification (Who am I talking to?)
- Classification, prioritization (Is there an issue? What is it?)
- State determination (What is the current state of the issue, and how did it get there?)
- Recommendation, clustering (Are there any recommendations on how to handle the issue?)
- Routing (Who is best equipped to handle this issue?)
Some of these are machine learning / model type applications.
How to switch model versions without interrupting the stream?
- Control Stream!
22. 21
Stream Processing Engine – Interacting With Models
Control Stream:
(Diagram: Social Media Messages Source -> Map (keyBy) and Control Stream Source -> Map (keyBy), joined by Connect -> CoFlatMap. Model A and Model B are kept in List State.)
Make sure the output stream contains which model version was used!
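The control-stream idea can be sketched in plain Java as an object with two inputs, one per connected stream. This is not the Flink CoFlatMap API: the class ModelSwitchSketch, the version labels and the toy classifiers are hypothetical, and the models live in a plain map rather than in Flink state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of switching models via a control input without stopping the data input. */
final class ModelSwitchSketch {
    private final Map<String, Function<String, String>> models = new HashMap<>();
    private String activeVersion;

    ModelSwitchSketch(String initialVersion, Function<String, String> initialModel) {
        models.put(initialVersion, initialModel);
        activeVersion = initialVersion;
    }

    void register(String version, Function<String, String> model) {
        models.put(version, model);
    }

    /** Control-stream side: switch the active model for all later messages. */
    void onControl(String version) {
        if (!models.containsKey(version)) {
            throw new IllegalArgumentException("unknown model version: " + version);
        }
        activeVersion = version;
    }

    /** Data-stream side: classify the message and tag the output with the model version used. */
    String onMessage(String message) {
        return activeVersion + ":" + models.get(activeVersion).apply(message);
    }
}
```

Tagging every output with the version makes it possible to tell afterwards which model produced a given classification, which matters once two versions have been live on the same stream.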