SlideShare ist ein Scribd-Unternehmen logo
1 von 72
Hive Bucketing in
Apache Spark
Tejas Patil
Facebook
Agenda
• Why bucketing ?
• Why is shuffle bad ?
• How to avoid shuffle ?
• When to use bucketing ?
• Spark’s bucketing support
• Bucketing semantics of Spark vs Hive
• Hive bucketing support in Spark
• SQL Planner improvements
Why bucketing ?
Table A Table B
. . . . . . . . . . . .JOIN
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join
- Ship smaller table to all nodes
- Stream the other table
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join Shuffle hash join
- Ship smaller table to all nodes
- Stream the other table
- Shuffle both tables,
- Hash smaller one, stream the
bigger one
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join Shuffle hash join
- Ship smaller table to all nodes
- Stream the other table
- Shuffle both tables,
- Hash smaller one, stream the
bigger one
Sort merge join
- Shuffle both tables,
- Sort both tables, buffer one,
stream the bigger one
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join Shuffle hash join Sort merge join
- Ship smaller table to all nodes
- Stream the other table
- Shuffle both tables,
- Hash smaller one, stream the
bigger one
- Shuffle both tables,
- Sort both tables, buffer one,
stream the bigger one
Table A Table B
. . . . . . . . . . . .
Sort Merge Join
Table A Table B
. . . . . . . . . . . .
Shuffle ShuffleShuffleShuffle
Sort Merge Join
Table A Table B
. . . . . . . . . . . .
Sort Sort Sort Sort
Shuffle ShuffleShuffleShuffle
Sort Merge Join
Table A Table B
. . . . . . . . . . . .
Sort
Join
Sort
Join
Sort
Join
Sort
Join
Shuffle ShuffleShuffleShuffle
Sort Merge Join
Sort Merge Join
Why is shuffle bad ?
Table A Table B
. . . . . . . . . . . .
Disk Disk
Disk IO for shuffle outputs
Why is shuffle bad ?
Table A Table B
. . . . . . . . . . . .
Shuffle Shuffle Shuffle Shuffle
M * R network connections
Why is shuffle bad ?
How to avoid shuffle ?
student s JOIN attendance a
ON s.id = a.student_id
student s JOIN results r
ON s.id = r.student_id
student s JOIN course_registration c
ON s.id = c.student_id
……..
……..
student s JOIN attendance a
ON s.id = a.student_id
student s JOIN results r
ON s.id = r.student_id
student s JOIN attendance a
ON s.id = a.student_id
student s JOIN results r
ON s.id = r.student_id
Pre-compute at
table creation
Bucketing =
pre-(shuffle + sort) inputs
on join keys
Without
bucketing
Job(s) populating input table
Without
bucketing
With
bucketing
Job(s) populating input table
Performance comparison
0
100
200
300
400
500
600
700
800
900
Before After Bucketing
Reserved CPU time (days)
0
1
2
3
4
5
6
7
8
Before After Bucketing
Latency (hours)
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
…. => Dimension tables
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
• Loading daily cumulative tables
• Both base and delta tables could be bucketed on a common column
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
• Loading daily cumulative tables
• Both base and delta tables could be bucketed on a common column
• Indexing capability
Spark’s bucketing support
Creation of bucketed tables
df.write
.bucketBy(numBuckets, "col1", ...)
.sortBy("col1", ...)
.saveAsTable("bucketed_table")
via Dataframe API
Creation of bucketed tables
CREATE TABLE bucketed_table(
column1 INT,
...
) USING parquet
CLUSTERED BY (column1, …)
SORTED BY (column1, …)
INTO `n` BUCKETS
via SQL statement
Check bucketing spec
scala> sparkContext.sql("DESC FORMATTED student").collect.foreach(println)
[# col_name,data_type,comment]
[student_id,int,null]
[name,int,null]
[# Detailed Table Information,,]
[Database,default,]
[Table,table1,]
[Owner,tejas,]
[Created,Fri May 12 08:06:33 PDT 2017,]
[Type,MANAGED,]
[Num Buckets,64,]
[Bucket Columns,[`student_id`],]
[Sort Columns,[`student_id`],]
[Properties,[serialization.format=1],]
[Serde Library,org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,]
[InputFormat,org.apache.hadoop.mapred.SequenceFileInputFormat,]
Bucketing config
SET spark.sql.sources.bucketing.enabled=true
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
Sort Merge Join
Sort
Exchange
TableScan
Sort
Exchange
TableScan Already bucketed on columns (i, j)
Shuffle on columns (i, j)
Sort on columns (i, j)
Join on [i,j], [i,j] respectively of relations
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
Sort Merge Join
Sort
Exchange
TableScan
Sort
Exchange
TableScan Already bucketed on columns (i, j)
Shuffle on columns (i, j)
Sort on columns (i, j)
Join on [i,j], [i,j] respectively of relations
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
Sort Merge Join
TableScan TableScan
Sort Merge Join
TableScan TableScanSort
Exchange
Sort
Exchange
Bucketing semantics of
Spark vs Hive
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive
my_bucketed_table
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive Spark
bucket 0 bucket 1
bucket
(n-1)
…..
bucket 0
bucket 0
bucket 1
bucket
(n-1)bucket
(n-1)bucket
(n-1)
my_bucketed_tablemy_bucketed_table
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive
M M M M M
R R R
…..
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive Spark
M M M M M
R R R
…..
bucket 0 bucket 1
bucket
(n-1)
…..
M M M M M…..
bucket 0
bucket 0
bucket 1
bucket
(n-1)bucket
(n-1)bucket
(n-1)
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Unit of bucketing A single file represents a single
bucket
A collection of files together
comprise a single bucket
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Unit of bucketing A single file represents a single
bucket
A collection of files together
comprise a single bucket
Sort ordering Each bucket is sorted globally. This
makes it easy to optimize queries
reading this data
Each file is individually sorted but
the “bucket” as a whole is not
globally sorted
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Unit of bucketing A single file represents a single
bucket
A collection of files together
comprise a single bucket
Sort ordering Each bucket is sorted globally. This
makes it easy to optimize queries
reading this data
Each file is individually sorted but
the “bucket” as a whole is not
globally sorted
Hashing function Hive’s inbuilt hash Murmur3Hash
[SPARK-19256] Hive bucketing support
• Introduce Hive’s hashing function [SPARK-17495]
• Enable creating hive bucketed tables [SPARK-17729]
• Support Hive’s `Bucketed file format`
• Propagate Hive bucketing information to planner [SPARK-17654]
• Expressing outputPartitioning and requiredChildDistribution
• Creating empty bucket files when no data for a bucket
• Allow Sort Merge join over tables with number of buckets
multiple of each other
• Support N-way Sort Merge Join
[SPARK-19256] Hive bucketing support
• Introduce Hive’s hashing function [SPARK-17495]
• Enable creating hive bucketed tables [SPARK-17729]
• Support Hive’s `Bucketed file format`
• Propagate Hive bucketing information to planner [SPARK-17654]
• Expressing outputPartitioning and requiredChildDistribution
• Creating empty bucket files when no data for a bucket
• Allow Sort Merge join over tables with number of buckets
multiple of each other
• Support N-way Sort Merge Join
Merged in
upstream
[SPARK-19256] Hive bucketing support
• Introduce Hive’s hashing function [SPARK-17495]
• Enable creating hive bucketed tables [SPARK-17729]
• Support Hive’s `Bucketed file format`
• Propagate Hive bucketing information to planner [SPARK-17654]
• Expressing outputPartitioning and requiredChildDistribution
• Creating empty bucket files when no data for a bucket
• Allow Sort Merge join over tables with number of buckets
multiple of each other
• Support N-way Sort Merge Join
FB-prod (6 months)
SQL Planner improvements
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
Exchange
TableScan
Sort
Exchange
TableScan
Sort
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
Exchange
TableScan
Sort
Exchange
TableScan
Sort
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
Sort Merge Join
TableScan TableScan
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
Sort Merge Join
Sort
TableScan
Sort
TableScan Already bucketed on column `col1`
Sort on columns (a.col1 and b.col1)
Join on [col1], [col1] respectively of relations
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
Sort Merge Join
TableScan TableScan Already bucketed on column `col1`
Sort on columns (a.col1 and b.col1)Sort Sort
Join on [col1], [col1] respectively of relations
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
Sort Merge Join
TableScan TableScan
Sort Sort
Sort Merge Join
TableScan TableScan
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort
Exchange
Sort
Exchange
Already bucketed on column [id]
Shuffle on columns [id, 1, id] and [id, id, 1]
Sort on columns [id, id, 1] and [id, 1, id]
Join on [id, id, 1], [id, 1, id] respectively of relations
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort
Exchange
Sort
Exchange
Already bucketed on column [id]
Shuffle on columns [id, 1, id] and [id, id, 1]
Sort on columns [id, id, 1] and [id, 1, id]
Join on [id, id, 1], [id, 1, id] respectively of relations
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort
Exchange
Sort
Exchange
Already bucketed on column [id]
Shuffle on columns [id, 1, id] and [id, id, 1]
Sort on columns [id, id, 1] and [id, 1, id]
Join on [id, id, 1], [id, 1, id] respectively of relations
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort Merge Join
TableScan TableScanSort
Exchange
Sort
Exchange
[SPARK-13450] Introduce
ExternalAppendOnlyUnsafeRowArray
[SPARK-13450] Introduce
ExternalAppendOnlyUnsafeRowArray
Buffered
relation
Streamed
relation
If there are a lot of rows to be
buffered (eg. Skew),
the query would OOM
row with
key=`foo`
rows with
key=`foo`
[SPARK-13450] Introduce
ExternalAppendOnlyUnsafeRowArray
Buffered
relation
Streamed
relation
row with
key=`foo`
rows with
key=`foo`
Disk
buffer would spill to disk
after reaching a threshold
spill
Summary
• Shuffle and sort is costly due to disk and network IO
• Bucketing will pre-(shuffle and sort) the inputs
• Expect at least 2-5x gains after bucketing input tables for joins
• Candidates for bucketing:
• Tables used frequently in JOINs with same key
• Loading of data in cumulative tables
• Spark’s bucketing is not compatible with Hive’s bucketing
Questions ?

Weitere ähnliche Inhalte

Was ist angesagt?

Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013
Jun Rao
 

Was ist angesagt? (20)

Understanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIs
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
Understanding InfluxDB’s New Storage Engine
Understanding InfluxDB’s New Storage EngineUnderstanding InfluxDB’s New Storage Engine
Understanding InfluxDB’s New Storage Engine
 
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
 
Spark - Alexis Seigneurin (Français)
Spark - Alexis Seigneurin (Français)Spark - Alexis Seigneurin (Français)
Spark - Alexis Seigneurin (Français)
 
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache Spark
 
Serializabilityとは何か
Serializabilityとは何かSerializabilityとは何か
Serializabilityとは何か
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 
Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013
 
[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅
 
How to Actually Tune Your Spark Jobs So They Work
How to Actually Tune Your Spark Jobs So They WorkHow to Actually Tune Your Spark Jobs So They Work
How to Actually Tune Your Spark Jobs So They Work
 
Cassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflix
 

Ähnlich wie Hive Bucketing in Apache Spark

Ähnlich wie Hive Bucketing in Apache Spark (20)

Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleBucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
 
Spark SQL Bucketing at Facebook
 Spark SQL Bucketing at Facebook Spark SQL Bucketing at Facebook
Spark SQL Bucketing at Facebook
 
Scala in practice - 3 years later
Scala in practice - 3 years laterScala in practice - 3 years later
Scala in practice - 3 years later
 
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
 
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
 
Table Retrieval and Generation
Table Retrieval and GenerationTable Retrieval and Generation
Table Retrieval and Generation
 
Paris jug ksql - 2018-06-28
Paris jug ksql - 2018-06-28Paris jug ksql - 2018-06-28
Paris jug ksql - 2018-06-28
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - Advanced
 
Apache Hive, data segmentation and bucketing
Apache Hive, data segmentation and bucketingApache Hive, data segmentation and bucketing
Apache Hive, data segmentation and bucketing
 
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Database Programming
Database ProgrammingDatabase Programming
Database Programming
 
Mahout Introduction BarCampDC
Mahout Introduction BarCampDCMahout Introduction BarCampDC
Mahout Introduction BarCampDC
 
Data_base.pptx
Data_base.pptxData_base.pptx
Data_base.pptx
 
Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
Scaling Machine Learning Feature Engineering in Apache Spark at FacebookScaling Machine Learning Feature Engineering in Apache Spark at Facebook
Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD Viva
 
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICS
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICSHive_An Brief Introduction to HIVE_BIGDATAANALYTICS
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICS
 
Performing Data Science with HBase
Performing Data Science with HBasePerforming Data Science with HBase
Performing Data Science with HBase
 
Hive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive TeamHive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive Team
 
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSAWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
 

Kürzlich hochgeladen

AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
ankushspencer015
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
Tonystark477637
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
rknatarajan
 

Kürzlich hochgeladen (20)

Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 

Hive Bucketing in Apache Spark

  • 1. Hive Bucketing in Apache Spark Tejas Patil Facebook
  • 2. Agenda • Why bucketing ? • Why is shuffle bad ? • How to avoid shuffle ? • When to use bucketing ? • Spark’s bucketing support • Bucketing semantics of Spark vs Hive • Hive bucketing support in Spark • SQL Planner improvements
  • 4. Table A Table B . . . . . . . . . . . .JOIN
  • 5. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join - Ship smaller table to all nodes - Stream the other table
  • 6. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join Shuffle hash join - Ship smaller table to all nodes - Stream the other table - Shuffle both tables, - Hash smaller one, stream the bigger one
  • 7. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join Shuffle hash join - Ship smaller table to all nodes - Stream the other table - Shuffle both tables, - Hash smaller one, stream the bigger one Sort merge join - Shuffle both tables, - Sort both tables, buffer one, stream the bigger one
  • 8. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join Shuffle hash join Sort merge join - Ship smaller table to all nodes - Stream the other table - Shuffle both tables, - Hash smaller one, stream the bigger one - Shuffle both tables, - Sort both tables, buffer one, stream the bigger one
  • 9. Table A Table B . . . . . . . . . . . . Sort Merge Join
  • 10. Table A Table B . . . . . . . . . . . . Shuffle ShuffleShuffleShuffle Sort Merge Join
  • 11. Table A Table B . . . . . . . . . . . . Sort Sort Sort Sort Shuffle ShuffleShuffleShuffle Sort Merge Join
  • 12. Table A Table B . . . . . . . . . . . . Sort Join Sort Join Sort Join Sort Join Shuffle ShuffleShuffleShuffle Sort Merge Join
  • 14. Why is shuffle bad ?
  • 15. Table A Table B . . . . . . . . . . . . Disk Disk Disk IO for shuffle outputs Why is shuffle bad ?
  • 16. Table A Table B . . . . . . . . . . . . Shuffle Shuffle Shuffle Shuffle M * R network connections Why is shuffle bad ?
  • 17. How to avoid shuffle ?
  • 18.
  • 19. student s JOIN attendance a ON s.id = a.student_id student s JOIN results r ON s.id = r.student_id student s JOIN course_registration c ON s.id = c.student_id …….. ……..
  • 20. student s JOIN attendance a ON s.id = a.student_id student s JOIN results r ON s.id = r.student_id
  • 21. student s JOIN attendance a ON s.id = a.student_id student s JOIN results r ON s.id = r.student_id Pre-compute at table creation
  • 22. Bucketing = pre-(shuffle + sort) inputs on join keys
  • 25. Performance comparison 0 100 200 300 400 500 600 700 800 900 Before After Bucketing Reserved CPU time (days) 0 1 2 3 4 5 6 7 8 Before After Bucketing Latency (hours)
  • 26. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id
  • 27. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id …. => Dimension tables
  • 28. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id • Loading daily cumulative tables • Both base and delta tables could be bucketed on a common column
  • 29. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id • Loading daily cumulative tables • Both base and delta tables could be bucketed on a common column • Indexing capability
  • 31.
  • 32. Creation of bucketed tables df.write .bucketBy(numBuckets, "col1", ...) .sortBy("col1", ...) .saveAsTable("bucketed_table") via Dataframe API
  • 33. Creation of bucketed tables CREATE TABLE bucketed_table( column1 INT, ... ) USING parquet CLUSTERED BY (column1, …) SORTED BY (column1, …) INTO `n` BUCKETS via SQL statement
  • 34. Check bucketing spec scala> sparkContext.sql("DESC FORMATTED student").collect.foreach(println) [# col_name,data_type,comment] [student_id,int,null] [name,int,null] [# Detailed Table Information,,] [Database,default,] [Table,table1,] [Owner,tejas,] [Created,Fri May 12 08:06:33 PDT 2017,] [Type,MANAGED,] [Num Buckets,64,] [Bucket Columns,[`student_id`],] [Sort Columns,[`student_id`],] [Properties,[serialization.format=1],] [Serde Library,org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,] [InputFormat,org.apache.hadoop.mapred.SequenceFileInputFormat,]
  • 36. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
  • 37. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j Sort Merge Join Sort Exchange TableScan Sort Exchange TableScan Already bucketed on columns (i, j) Shuffle on columns (i, j) Sort on columns (i, j) Join on [i,j], [i,j] respectively of relations
  • 38. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j Sort Merge Join Sort Exchange TableScan Sort Exchange TableScan Already bucketed on columns (i, j) Shuffle on columns (i, j) Sort on columns (i, j) Join on [i,j], [i,j] respectively of relations
  • 39. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j Sort Merge Join TableScan TableScan Sort Merge Join TableScan TableScanSort Exchange Sort Exchange
  • 41. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive my_bucketed_table
  • 42. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive Spark bucket 0 bucket 1 bucket (n-1) ….. bucket 0 bucket 0 bucket 1 bucket (n-1)bucket (n-1)bucket (n-1) my_bucketed_tablemy_bucketed_table
  • 43. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive M M M M M R R R …..
  • 44. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive Spark M M M M M R R R ….. bucket 0 bucket 1 bucket (n-1) ….. M M M M M….. bucket 0 bucket 0 bucket 1 bucket (n-1)bucket (n-1)bucket (n-1)
  • 45. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier
  • 46. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier Unit of bucketing A single file represents a single bucket A collection of files together comprise a single bucket
  • 47. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier Unit of bucketing A single file represents a single bucket A collection of files together comprise a single bucket Sort ordering Each bucket is sorted globally. This makes it easy to optimize queries reading this data Each file is individually sorted but the “bucket” as a whole is not globally sorted
  • 48. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier Unit of bucketing A single file represents a single bucket A collection of files together comprise a single bucket Sort ordering Each bucket is sorted globally. This makes it easy to optimize queries reading this data Each file is individually sorted but the “bucket” as a whole is not globally sorted Hashing function Hive’s inbuilt hash Murmur3Hash
  • 49. [SPARK-19256] Hive bucketing support • Introduce Hive’s hashing function [SPARK-17495] • Enable creating hive bucketed tables [SPARK-17729] • Support Hive’s `Bucketed file format` • Propagate Hive bucketing information to planner [SPARK-17654] • Expressing outputPartitioning and requiredChildDistribution • Creating empty bucket files when no data for a bucket • Allow Sort Merge join over tables with number of buckets multiple of each other • Support N-way Sort Merge Join
  • 50. [SPARK-19256] Hive bucketing support • Introduce Hive’s hashing function [SPARK-17495] • Enable creating hive bucketed tables [SPARK-17729] • Support Hive’s `Bucketed file format` • Propagate Hive bucketing information to planner [SPARK-17654] • Expressing outputPartitioning and requiredChildDistribution • Creating empty bucket files when no data for a bucket • Allow Sort Merge join over tables with number of buckets multiple of each other • Support N-way Sort Merge Join Merged in upstream
  • 51. [SPARK-19256] Hive bucketing support • Introduce Hive’s hashing function [SPARK-17495] • Enable creating hive bucketed tables [SPARK-17729] • Support Hive’s `Bucketed file format` • Propagate Hive bucketing information to planner [SPARK-17654] • Expressing outputPartitioning and requiredChildDistribution • Creating empty bucket files when no data for a bucket • Allow Sort Merge join over tables with number of buckets multiple of each other • Support N-way Sort Merge Join FB-prod (6 months)
  • 53. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
  • 54. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
  • 55. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
  • 56. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join Exchange TableScan Sort Exchange TableScan Sort Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
  • 57. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join Exchange TableScan Sort Exchange TableScan Sort Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
  • 58. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x Sort Merge Join TableScan TableScan
  • 59. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
  • 60. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1 Sort Merge Join Sort TableScan Sort TableScan Already bucketed on column `col1` Sort on columns (a.col1 and b.col1) Join on [col1], [col1] respectively of relations
  • 61. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1 Sort Merge Join TableScan TableScan Already bucketed on column `col1` Sort on columns (a.col1 and b.col1)Sort Sort Join on [col1], [col1] respectively of relations
  • 62. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1 Sort Merge Join TableScan TableScan Sort Sort Sort Merge Join TableScan TableScan
  • 63. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
  • 64. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Exchange Sort Exchange Already bucketed on column [id] Shuffle on columns [id, 1, id] and [id, id, 1] Sort on columns [id, id, 1] and [id, 1, id] Join on [id, id, 1], [id, 1, id] respectively of relations
  • 65. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Exchange Sort Exchange Already bucketed on column [id] Shuffle on columns [id, 1, id] and [id, id, 1] Sort on columns [id, id, 1] and [id, 1, id] Join on [id, id, 1], [id, 1, id] respectively of relations
  • 66. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Exchange Sort Exchange Already bucketed on column [id] Shuffle on columns [id, 1, id] and [id, id, 1] Sort on columns [id, id, 1] and [id, 1, id] Join on [id, id, 1], [id, 1, id] respectively of relations
  • 67. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Merge Join TableScan TableScanSort Exchange Sort Exchange
  • 69. [SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray Buffered relation Streamed relation If there are a lot of rows to be buffered (eg. Skew), the query would OOM row with key=`foo` rows with key=`foo`
  • 70. [SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray Buffered relation Streamed relation row with key=`foo` rows with key=`foo` Disk buffer would spill to disk after reaching a threshold spill
  • 71. Summary • Shuffle and sort is costly due to disk and network IO • Bucketing will pre-(shuffle and sort) the inputs • Expect at least 2-5x gains after bucketing input tables for joins • Candidates for bucketing: • Tables used frequently in JOINs with same key • Loading of data in cumulative tables • Spark’s bucketing is not compatible with Hive’s bucketing