Vectorized Query Execution in
Apache Spark at Facebook
Chen Yang
Spark Summit | 04/24/2019
About me
Chen Yang
• Software Engineer at Facebook (Data Warehouse Team)
• Working on Spark execution engine improvements
• Worked on Hive and ORC in the past
Agenda
• Spark at Facebook
• Row format vs Columnar format
• Row-at-a-time vs Vector-at-a-time processing
• Performance results
• Road ahead
Agenda
• Spark at Facebook
• Row format vs Columnar format
• Row-at-a-time vs Vector-at-a-time processing
• Performance results
• Road ahead
Spark at Facebook
• Largest SQL query engine at Facebook (by CPU usage)
• We use Spark on disaggregated compute/storage clusters
• Scale/upgrade clusters independently
• Efficiency is the top priority for Spark at Facebook given the scale
• Compute efficiency: Optimize CPU and Memory usage
• Storage efficiency: Optimize on-disk size and IOPS
Spark at Facebook
• Compute efficiency: Optimize CPU and Memory usage
• A significant percentage (>40%) of CPU time is spent reading/writing data
• Storage efficiency: Optimize on-disk size and IOPS
• The storage format has a big impact on on-disk size and IOPS
• The Facebook data warehouse uses the ORC format
Agenda
• Spark at Facebook
• Row format vs Columnar format
• Row-at-a-time vs Vector-at-a-time processing
• Performance results
• Road ahead
Row format vs Columnar format
Logical table:
user_id os
1 android
1 ios
3 ios
6 android
Row format:
1 android 1 ios 3 ios 6 android
Columnar format:
1 1 3 6 android ios ios android
Row format vs Columnar format
Logical table:
user_id os
1 android
1 ios
3 ios
6 android
Row format:
1 android 1 ios 3 ios 6 android
• On-disk row format: csv
• In-memory row format: UnsafeRow, OrcLazyRow
Columnar format:
1 1 3 6 android ios ios android
Row format vs Columnar format
Logical table:
user_id os
1 android
1 ios
3 ios
6 android
Row format:
1 android 1 ios 3 ios 6 android
• On-disk row format: csv
• In-memory row format: UnsafeRow, OrcLazyRow
Columnar format:
1 1 3 6 android ios ios android
• On-disk columnar format: Parquet, ORC
• In-memory columnar format: Arrow, VectorizedRowBatch (see the sketch below)
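Below is a minimal sketch of how the example table could land in an in-memory columnar batch, using the open-source Hive VectorizedRowBatch classes. The class and method names here are from Apache Hive, not the Facebook fork; this is illustrative, not the actual reader code:

import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

public class BatchLayoutSketch {
  public static void main(String[] args) {
    // One batch, one ColumnVector per column; a column's values sit contiguously.
    VectorizedRowBatch batch = new VectorizedRowBatch(2);
    LongColumnVector userId = new LongColumnVector();  // user_id column
    BytesColumnVector os = new BytesColumnVector();    // os column
    os.initBuffer();
    batch.cols[0] = userId;
    batch.cols[1] = os;

    long[] ids = {1, 1, 3, 6};
    String[] oses = {"android", "ios", "ios", "android"};
    for (int i = 0; i < ids.length; i++) {
      userId.vector[i] = ids[i];
      byte[] value = oses[i].getBytes();
      os.setVal(i, value, 0, value.length);
    }
    batch.size = ids.length;  // number of valid rows in this batch
  }
}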
Columnar on disk format
• Better compression ratio
• Type specific encoding
• Minimize I/O
• Projection push down (column pruning)
• Predicate push down (filtering based on stats)
Example columnar layout: 1 1 3 6 android ios ios android
Better compression ratio
• Encoding
• Dictionary encoding
• Run Length encoding
• Delta encoding
• Compression
• zstd
• zlib
• snappy
Dictionary encoding example (see the sketch below):
Values: android ios ios android
Encoded: 0 1 1 0
Dictionary: 0 = android, 1 = ios
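As a toy illustration (plain Java, not the ORC writer's actual encoder), dictionary encoding can be sketched like this:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DictionaryEncodeSketch {
  public static void main(String[] args) {
    String[] column = {"android", "ios", "ios", "android"};

    Map<String, Integer> valueToId = new HashMap<>();  // value -> dictionary id
    List<String> dictionary = new ArrayList<>();       // id -> value
    int[] encoded = new int[column.length];

    for (int i = 0; i < column.length; i++) {
      Integer id = valueToId.get(column[i]);
      if (id == null) {                    // first occurrence: add to the dictionary
        id = dictionary.size();
        valueToId.put(column[i], id);
        dictionary.add(column[i]);
      }
      encoded[i] = id;                     // store a small id instead of the string
    }
    // encoded == [0, 1, 1, 0], dictionary == [android, ios]
    System.out.println(java.util.Arrays.toString(encoded) + " " + dictionary);
  }
}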
Minimize I/O
• Projection push down
• Column pruning
• Predicate push down
• Filtering based on stats
Example: for the query below, column pruning reads only the user_id column,
and filtering based on stats skips stripes/row groups whose min/max statistics
rule out user_id = 3 (see the min/max sketch after the query).
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
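A hedged sketch of filtering based on stats. ColumnStats here is a made-up stand-in; the real reader takes min/max values from the ORC stripe and row-group indexes. A stripe or row group is read only if its statistics could contain a matching row:

public class PredicatePushDownSketch {
  // Hypothetical stand-in for the min/max statistics kept per stripe/row group.
  static class ColumnStats {
    final long min;
    final long max;
    ColumnStats(long min, long max) { this.min = min; this.max = max; }
  }

  // True only if some row in the stripe/row group could satisfy user_id = target,
  // so groups that return false are skipped without any read or decode work.
  static boolean mightContain(ColumnStats userIdStats, long target) {
    return target >= userIdStats.min && target <= userIdStats.max;
  }

  public static void main(String[] args) {
    ColumnStats group1 = new ColumnStats(1, 3);   // holds user_id values 1..3
    ColumnStats group2 = new ColumnStats(6, 9);   // holds user_id values 6..9
    System.out.println(mightContain(group1, 3));  // true  -> must be read
    System.out.println(mightContain(group2, 3));  // false -> skipped entirely
  }
}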
ORC in open source vs at Facebook
Open source:
• ORC (Optimized Row Columnar) is a self-describing, type-aware columnar file
format designed for Hadoop workloads.
• It is optimized for large streaming reads, but with integrated support for
finding required rows quickly.
Facebook:
• Fork of Apache ORC, also called DWRF
• Various perf improvements (IOPS, memory, etc.)
• Some special formats for specific use cases at Facebook (FlatMap)
Agenda
• Spark at Facebook
• Row format vs Columnar format
• Row-at-a-time vs Vector-at-a-time processing
• Performance results
• Road ahead
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
Row-at-a-time vs vector-at-a-time
user_id os
1 android
1 ios
3 ios
6 android
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
Row-at-a-time processing
Orc Reader
Row
Orc Writer
Row
user_id os
1 android
1 ios
3 ios
6 android
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
Row-at-a-time processing
Orc Reader
OrcLazyRow
Orc Writer
UnsafeRow
UnsafeRow
OrcRow
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
Row-at-a-time processing
private void agg_doAggregateWithoutKey() {
  while (inputadapter_input.hasNext()) {
    inputadapter_row = inputadapter_input.next();
    value = inputadapter_row.getLong(0);
    // do aggr
  }
}
Orc Reader
OrcLazyRow
Orc Writer
UnsafeRow
UnsafeRow
OrcRow
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
Vector-at-a-time processing
Orc Reader
Vector
Orc Writer
Vector
user_id os
1 android
1 ios
3 ios
6 android
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
Vector-at-a-time processing
1 1 3 6
VectorizedRowBatch
Orc Reader
VectorizedRowBatch
Orc Writer
VectorizedRowBatch
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id = 3
Vector-at-a-time processing
Orc Reader
VectorizedRowBatch
Orc Writer
VectorizedRowBatch
private void agg_doAggregateWithoutKey() {
  while (orc_scan_batch != null) {
    int numRows = orc_scan_batch.size;
    while (orc_scan_batch_idx < numRows) {
      value = orc_scan_col_0.vector[orc_scan_batch_idx];
      orc_scan_batch_idx++;
      // do aggr
    }
    nextBatch();
  }
}
private void nextBatch() {
  if (orc_scan_input.hasNext()) {
    orc_scan_batch = orc_scan_input.next();
    orc_scan_col_0 = orc_scan_batch.cols[0];
    orc_scan_batch_idx = 0;
  } else {
    orc_scan_batch = null;
  }
}
1 1 3 6
VectorizedRowBatch
Row-at-a-time vs vector-at-a-time
private void agg_doAggregateWithoutKey() {
  while (inputadapter_input.hasNext()) {
    inputadapter_row = inputadapter_input.next();
    value = inputadapter_row.getLong(0);
    // do aggr
  }
}
private void agg_doAggregateWithoutKey() {
  while (orc_scan_batch != null) {
    int numRows = orc_scan_batch.size;
    while (orc_scan_batch_idx < numRows) {
      value = orc_scan_col_0.vector[orc_scan_batch_idx];
      orc_scan_batch_idx++;
      // do aggr
    }
    nextBatch();
  }
}
private void nextBatch() {
  if (orc_scan_input.hasNext()) {
    orc_scan_batch = orc_scan_input.next();
    orc_scan_col_0 = orc_scan_batch.cols[0];
    orc_scan_batch_idx = 0;
  } else {
    orc_scan_batch = null;
  }
}
• Lower overhead per row
• Avoid virtual function
dispatch cost per row
• Better cache locality
• Avoid unnecessary
copy/conversion
• No need to convert between
OrcLazyRow and UnsafeRow
Row-at-a-time processing Vector-at-a-time processing
Row-at-a-time vs vector-at-a-time
• Lower overhead per row
• Avoid virtual function
dispatch cost per row
• Better cache locality
• Avoid unnecessary
copy/conversion
• No need to convert between
OrcLazyRow and UnsafeRow
Orc Reader
RowBatch
Orc Reader
OrcLazyRow
UnsafeRow
WholeStage
CodeGen
WholeStage
CodeGen
Row-at-a-time processing Vector-at-a-time processing
Vectorized ORC reader/writer in open source vs at Facebook
Open source:
• Vectorized reader supports only simple types
• Vectorized writer is not supported
Facebook:
• Vectorized reader supports all types, integrated with codegen
• Vectorized writer supports all types; integration with codegen is planned
Vector-at-a-time + Whole Stage CodeGen
• Currently only HiveTableScan and InsertIntoHiveTable understand the ORC
columnar format
• Most Spark operators still process one row at a time
Agenda
• Spark at Facebook
• Row format vs Columnar format
• Row-at-a-time vs Vector-at-a-time processing
• Performance results
• Road ahead
Vectorized reader + vector-at-a-time microbenchmark
Up to 8x speedup when reading 10M rows from a single-column table
Vectorized reader and writer + vector-at-a-time microbenchmark
Up to 3.5x speedup when reading and writing 10M rows from a single-column table
Agenda
• Spark at Facebook
• Row format vs Columnar format
• Row-at-a-time vs Vector-at-a-time processing
• Performance results
• Road ahead
Road ahead
• SIMD / avoid branching / prefetching
• Optimize codegen code to trigger auto-vectorization in the JVM
• Customize the JVM to make better compiler optimization decisions for Spark
SIMD example for the query below (see the sketch after it): compare the whole
user_id vector [1 1 3 6] against the broadcast literal [1 1 1 1] in one step,
producing the result vector [0 0 1 1], then sum it to get the count 2.
SELECT
COUNT(*)
FROM user_os_info
WHERE user_id > 1
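A minimal sketch (plain Java, not Spark's actual generated code) of the kind of tight, branch-light loop over a column vector that the JVM's JIT can auto-vectorize into SIMD instructions for this query:

public class SimdFriendlyCountSketch {
  // Count how many user_id values satisfy user_id > threshold.
  static long countGreaterThan(long[] userIds, int numRows, long threshold) {
    long count = 0;
    for (int i = 0; i < numRows; i++) {
      // The comparison yields 0 or 1 and is simply accumulated,
      // which keeps the loop body free of data-dependent control flow.
      count += (userIds[i] > threshold) ? 1 : 0;
    }
    return count;
  }

  public static void main(String[] args) {
    long[] userIds = {1, 1, 3, 6};
    System.out.println(countGreaterThan(userIds, userIds.length, 1)); // prints 2
  }
}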
Road ahead
• Efficient in-memory data format
• Use Apache Arrow as the in-memory format
• Encoded columnar in-memory format
• Use columnar format for shuffle
• Filter/projection push down to the reader
• Avoid decoding cost
Dictionary encoding example for the query below: the filter can be evaluated
directly on the encoded ids, avoiding the cost of decoding the strings (see the
sketch after the query).
Values: android ios ios android
Encoded: 0 1 1 0
Dictionary: 0 = android, 1 = ios
SELECT
COUNT(*)
FROM user_os_info
WHERE os = 'ios'
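A hypothetical sketch (plain Java, not the DWRF reader API) of evaluating WHERE os = 'ios' directly on a dictionary-encoded column: resolve the target value to its dictionary id once, then compare integer ids per row, never decoding the strings:

public class DictionaryPredicateSketch {
  public static void main(String[] args) {
    String[] dictionary = {"android", "ios"};   // id -> value, from the column metadata
    int[] encodedOs = {0, 1, 1, 0};             // per-row dictionary ids

    // Find the dictionary id for "ios"; if it is absent, no row can match.
    int targetId = -1;
    for (int id = 0; id < dictionary.length; id++) {
      if (dictionary[id].equals("ios")) {
        targetId = id;
        break;
      }
    }

    long count = 0;
    if (targetId >= 0) {
      for (int encodedId : encodedOs) {
        if (encodedId == targetId) {   // integer compare, no string materialization
          count++;
        }
      }
    }
    System.out.println(count);         // prints 2
  }
}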