1. XStream - A Planetary Scale Stream Processing Platform at Facebook
Shuyi Chen
Software Engineer
Committer of Apache Flink & Calcite
Aniket Mokashi
Engineering Manager
PMC of Apache Pig & Parquet
2. Agenda
● Stream processing use cases
● Brief history of real-time data processing at Facebook
● Introduction to XStream
● XStream Design Principles
○ Stylus (C++)
○ CoreSQL - Single Dialect across Engines
○ Interpretive execution
○ Vectorized Execution
Stream Processing at Facebook
3. Streaming Data Flow at FB
High Availability and Low Latency are important to our business
Event Producers: Devices, Web, Others
Messaging Pipe = Scribe
Stream Processing Pipelines
Warehouse, for retention and complex queries
Scribe, messaging bus
Publish / Serve
Scuba, for quick analytics and troubleshooting
Services
A typical streaming data flow...
4. Use Cases @ Facebook
Diversity of Use Cases
Stream Analytics
• Time series analytics – calculate metrics over time windows and stream to another Scribe category, dashboard or Scuba
• Real time dashboards/Scuba – aggregate and process Scribe to feed dashboards or another Scribe category
• Real time metrics – custom metrics or triggers for real time monitoring, notifications or alarms
Stream Transform
• Clean, enrich, organize, and transform raw Scribe prior to loading to warehouse, reducing or eliminating batch ETL steps
• Common built-in operators to transform, aggregate, and filter streaming data
Real-Time AI/ML
• Enable various stages of the ML life cycle, e.g., feature engineering
• Enable predictive analytics, fraud detection, real-time personalization, and other advanced analytics use cases
5. Stream Processing Ecosystem
Read more:
Realtime Data Processing at Facebook
LogDevice · Distributed storage for sequential data
A look at the history
Scribe: Persistent, distributed messaging system
Puma: Fully managed stream processing service, authored in a SQL-like language with UDFs written in Java
Swift: Simplistic stream processing library, authored in Python
Stylus: Low-level stream processing framework in C++
Laser: Key-value store on top of RocksDB for lookup joins
Scuba: Slice-and-dice analysis data store
8. Stream Processing Ecosystem Goals
Then..
● Move Fast
● Ease of use, deployment and debugging
● Monitoring and operations
Now..
● Move fast with stable infrastructure
● Consistent use - low cognitive overhead
● Performance at Scale
● Consolidation
● Intelligent monitoring and low operational costs
Evolution from then to now
9. XStream
One Unified “Fully Managed” Stream Processing Platform
Enable customers to build applications that react to events in real time, produce analytics at the source, and shorten the cycle from data to insight to action.
Mission
10. Why not open source?
On-Prem and Cloud Services Comparison
System | Auto Scaling | Load Balance | Privacy & Security | Data Quality | SQL | Relational API | Functional API
Apache Flink | No | No | No | No | Flink SQL | Flink Table API | DataStream
Spark Streaming | Yes | Yes | No | Deequ (AWS) | Spark SQL | Spark Dataset | RDD
Apache Samza | Yes (manual) | Yes | No | No | Samza SQL | No | Samza High-level API
Apache Beam | Yes | Yes | No | No | Beam SQL | No | PTransform API
AWS Kinesis Analytics | Yes | Yes | Yes | No | ANSI 2008 with extensions | Yes (Table API) | No
Google Cloud DataFlow | Yes | Yes | Yes | No (Trifacta for upfront) | SQL-like, non-compliant | Yes | No
XStream | Yes | Yes | Yes | No | CoreSQL (SQL 2016 with extensions) | Fluent API | Stylus functional
11. XStream Overview
• Accelerate Developer Velocity
- Just write business logic and let us manage everything else; this matters at FB scale
- Even more important, developers are not stuck focusing only on KTLO (keeping the lights on)
- Authoring ease: write once, run many with one SQL dialect shared across batch, interactive and streaming use cases
• Efficiency and Performance
- ~2x more efficient than its predecessor, with tighter integration with the native C++ engine
• Fully Managed Service
- No need to worry about scaling, backups, patching, etc.
- Managed platform means low ops load and scaling on demand to tens of thousands of jobs
Planetary Scale Fully Managed Event Processing System
Why XStream?
12. XStream Engine Overview
• Stylus C++ framework
• Design principles
• Performance study
Language: SQL & DataFrame
Query Planner & Optimizer
Query Runtime: Stylus
13. XStream Engine Overview
● Stylus is a distributed fault-tolerant & scalable stream processing engine at FB
○ Sources/sinks
○ Operators
○ Watermark
○ Checkpoint
○ Trigger & timer
○ State backend
● However
○ It has a steep learning curve
○ It has high maintenance and operational costs
Stylus C++ framework
14. XStream Engine Overview
• We built XStream in C++ on top of the Stylus stream processing framework
• We use the same SQL dialect as Spark & Presto
• We adopt interpretation over up-front compilation
• We build the vectorized SQL evaluation engine and share it with Presto & Spark
Principles
15. XStream Engine Overview
• The C++-based Stylus stream processing framework is mature & widely used at FB
• Java support is very limited at FB
• Many service backends and much of the business logic are written in C++
• Efficiency & performance are important factors
Why C++ for XStream?
16. CoreSQL
A single dialect for all SQL at FB
• Offer a single SQL dialect and framework across different tools at FB
• Modernize the Presto SQL language with the SQL 2016 standard
• Make it easy to move between engines depending on the use case
• Bring other extensions, from Streaming and Graph SQL, into this modernized Presto SQL language
• Enable UDF portability across engines
• Open source
XStream integrates CoreSQL by implementing the streaming extension support
• Tumbling window with multi-window support
• Sliding window with multi-slide & multi-window support
• Session window
A common SQL dialect across FB
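The three window types named above can be illustrated with a minimal sketch of how event timestamps map to windows. This is not CoreSQL syntax or XStream code: the function names, the use of integer seconds, and the half-open [start, end) convention are all assumptions made for illustration, and the slides do not define what "multi-window" or "multi-slide" mean, so those are not modeled.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end), in seconds (assumed unit)


def tumbling_window(ts: int, size: int) -> Window:
    """Return the single fixed-size, non-overlapping window that contains ts."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts: int, size: int, slide: int) -> List[Window]:
    """Return every window of length `size`, starting on a multiple of `slide`, that contains ts."""
    windows = []
    start = ts - ts % slide  # latest window start at or before ts
    while start > ts - size:  # the window still covers ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session_windows(timestamps: List[int], gap: int) -> List[Window]:
    """Group timestamps into sessions; a session closes after `gap` seconds of inactivity."""
    sessions: List[Window] = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # Event arrived before the open session expired: extend the session.
            sessions[-1] = (sessions[-1][0], ts + gap)
        else:
            sessions.append((ts, ts + gap))
    return sessions
```

With these definitions, an event at second 125 falls into one 60-second tumbling window, (120, 180), but into two 60-second windows sliding every 30 seconds, (90, 150) and (120, 180).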
17. Query Execution (C++)
Up-front compilation
• The planner generates code for the entire C++ pipeline and compiles it into machine code
- Highly optimized code
• However,
- One binary per pipeline
- Hard to scale operationally
- Reliability concerns
- Long build time → low dev efficiency
SELECT SUM(action), type
FROM action_by_type
GROUP BY type
Main.cpp:
from("action_by_type")
-> map(...)
-> keyBy("type")
-> aggregate("sum(action)")
-> toSink(...)
Main.cpp + XStream/Stylus C++ libraries → Compile & build → Binary → Deploy
18. Query Execution
Interpretation
• The planner generates a distributed execution plan, and each C++ worker node interprets and executes its local plan
- No build process needed
- Single engine binary for all pipelines
- Easier to scale operationally
• However, to achieve high efficiency and performance, interpretation usually uses vector-at-a-time processing on a columnar data representation
- Amortizes interpretation overhead
- Hides cache miss latency
- Leverages SIMD support
SELECT SUM(action), type
FROM action_by_type
GROUP BY type
Execution Plan as JSON:
XStreamSource("action_by_type")
→ XStreamCalc()
→ XStreamWindowAggregate("sum(action)")
→ XStreamSink(...)
Deploy → Turbine
Engine Binary: weekly release managed by XStream
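The interpretation approach on this slide can be sketched as a single generic engine that reads a JSON plan and runs each operator once per columnar batch, rather than once per event. The sketch below is in Python for brevity, not the C++ of the real engine. The operator names echo the slide, but the plan fields (`column`, `min_value`, `key`, `sum`), the filter in `XStreamCalc`, and the running (unwindowed) aggregate are all assumptions invented for illustration; the slides do not show the actual plan schema.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Hypothetical JSON plan: operator names echo the slide, every field is invented.
PLAN_JSON = """
[
  {"op": "XStreamSource", "category": "action_by_type"},
  {"op": "XStreamCalc", "column": "action", "min_value": 0},
  {"op": "XStreamWindowAggregate", "key": "type", "sum": "action"},
  {"op": "XStreamSink"}
]
"""

Batch = Dict[str, List]  # columnar batch: column name -> list of values


def run_calc(step: dict, batch: Batch) -> Batch:
    """Filter a whole batch at once: keep rows whose column value is >= min_value."""
    keep = [v >= step["min_value"] for v in batch[step["column"]]]
    return {name: [v for v, k in zip(col, keep) if k] for name, col in batch.items()}


def run_aggregate(step: dict, batch: Batch, state: Dict[str, int]) -> None:
    """Fold one batch into the running SUM per key."""
    for key, value in zip(batch[step["key"]], batch[step["sum"]]):
        state[key] += value


def execute(plan_json: str, batches: List[Batch]) -> Dict[str, int]:
    """Interpret the plan: one operator dispatch per step per batch, not per event."""
    plan = json.loads(plan_json)
    state: Dict[str, int] = defaultdict(int)
    for batch in batches:
        for step in plan:
            if step["op"] == "XStreamCalc":
                batch = run_calc(step, batch)
            elif step["op"] == "XStreamWindowAggregate":
                run_aggregate(step, batch, state)
            # XStreamSource / XStreamSink are no-ops here: batches are passed in
            # directly and the aggregate state is returned as the result.
    return dict(state)
```

Because the plan is data, changing the query means shipping a new JSON document to the same binary; no compile and build step is involved, which is the operational point the slide makes.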
19. Velox Overview
New C++ vectorized SQL evaluation engine
Provides universal, state-of-the-art building blocks for compute
Why?
• Efficiency and Latency
• Consistency
• Reusability and Engineering Efficiency
Goal is to unify eval engines across FB & beyond
• Presto
• Spark
• XStream
• Etc.
Already Open source!
Layers: Task, driver; Operators; Expression evaluation; Vectors
20. XStream Engine Stack
CoreSQL, Dataframe
Query Planner: Logical plan, Physical plan
XStream local planner
XStream Query Runtime: Velox Expression evaluation, Velox Vectors, XStream Columnar Source/Sinks/Operators
Stylus framework
21. Performance
• Velox (C++ vs Java)
- 2-10x CPU improvement on the initial subset of the Presto interactive workload evaluated
• XStream stateless workload (interpreted vs compiled)
- With micro-batches of ~100 events, interpretation performs about the same as compilation, with 10%-30% memory savings during normal processing
- While catching up on lag, interpretation beats compilation by 30-50% in throughput with the same CPU and slightly less memory usage
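The role of micro-batching in these results can be illustrated with a small sketch: grouping events into batches of about 100 means the interpreter dispatches each operator once per batch instead of once per event. The helper names and the dispatch-count model below are assumptions for illustration only; they do not reproduce the measurements above or XStream's actual batching logic.

```python
from typing import Iterable, Iterator, List


def micro_batches(events: Iterable, batch_size: int = 100) -> Iterator[List]:
    """Group a stream of events into lists of at most batch_size events."""
    batch: List = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def operator_dispatches(num_events: int, num_operators: int, batch_size: int) -> int:
    """Count interpreter dispatches when every operator runs once per batch."""
    num_batches = -(-num_events // batch_size)  # ceiling division
    return num_batches * num_operators
```

Under this simple model, 10,000 events through a 4-operator plan cost 40,000 dispatches event-at-a-time but only 400 with batches of 100, which is one way to picture how the interpretation overhead gets amortized.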
22. Planetary Scale Stream Processing - The Future
We’re just getting started - Two flavors, one for platform and the other for long-tail!
• Fully Managed Service
- Both PaaS and SaaS
- Full SQL support
- Advanced streaming system features, e.g., backfill support
- Cost-based optimizer
• Build a portable UDF ecosystem: common UDFs that run across FB's data infrastructure
• Come join us!