SlideShare ist ein Scribd-Unternehmen logo
1 von 14
Dynamic filtering for
Presto join optimisation
Roman Zeyde
Presto Conference Israel 2019
Agenda
Existing join optimization techniques
Dynamic filtering description
Implementation details
Performance analysis
Existing join optimization techniques
Happen during planning phase:
• Join reordering
• Join distribution type (distributed vs. broadcast)
Depend on cost-based optimizer (need column
statistics)
• Should be enabled via session parameters
• Can be estimated using ANALYZE statement
Example: join reordering
SELECT * FROM items JOIN sales ON sales.item_id = items.id;
Prefer keeping the "smaller" table on the right-hand side of the join:
Join
(item_id=id)
Join
(item_id=id)
Scan
sales
Scan
items
Scan
items
Scan
sales
Example: broadcast join
If the right-hand side table is "small", it can be replicated to
all join workers - saving the CPU and network cost of left-
hand side repartitioning:
Join worker
Join worker
Join workerLeft-hand side
Right-hand side
Join worker
Join worker
Example: distributed join
Otherwise, both tables are repartitioned using the join key,
allowing joins with larger right-hand side tables:
Join workerLeft-hand side
Right-hand side
Dynamic filtering - introduction
Consider the following query:
SELECT * FROM sales JOIN items
ON sales.item_id = items.id
WHERE items.price > 1000;
Assumptions:
● sales table is large
● items scan results in a few rows
(due to predicate pushdown)
Most of the scanned sales rows will be
discarded during the join (i.e. high selectivity).
How can we optimize this use-case?
Join
(item_id=id)
Scan
sales
Scan
items
[price>1000]
Dynamic filtering - description
1. Collect relevant id values during items scan
2. Construct dynamic filter F using the collected ids
3. Apply dynamic predicate pushdown using F to
sales scan
Benefits:
• Connector may optimize the scan given F
• Most sales rows are not touched by Presto
• CPU & network savings for large tables
Requirements:
• F cannot be too large (memory-wise)
• F need to "back propagate" into sales scan in runtime
Join
(item_id=id)
Scan
items
[price>1000]
Scan
sales
[item_id∈F]
(3)
(1)
(2)
Construct
dynamic
filter F
Implementation details - Qubole et al.
Supports both distributed and broadcast joins, but
requires significant changes in Presto:
• Add plan nodes and optimizer rules for dynamic
filter collection and application
• New coordinator REST endpoint for dynamic filter
collection from worker nodes.
• Allow connectors to prune partitions during split
generation (when dynamic filter is ready)
More details can be found here:
qubole.com/blog/sql-join-optimizations-qubole-presto
(https://docs.google.com/document/d/1TOlxS8ZAXSIHR5ftHbPsgUkuUA-ky-odwmPdZARJrUQ)
Implementation - Varada
When broadcast join is used, sales'
ScanFilterAndProject and items' HashBuilder
operators run at the same process:
• Add a "pass-through" operator to collect
build-side ids.
• When ready, pushdown the resulting
predicate F into sales page source.
No changes needed at the planner, optimizer and
coordinator!
Implemented as a patch on top of
github.com/prestosql/presto (currently work-in-
progress).
ScanFilterAndProject
sales
[item_id∈F]
ScanFilterAndProject
items
[price>1000]
Exchange
Exchange
Collect
F:=F∪{id}
LookupJoin
[item_id=id]
HashBuilder
[id]
TaskOutput
Performance analysis - benchmark
Consider the following query (based on TPC-DS sf10000 dataset):
SELECT ss_item_sk FROM store_sales JOIN customer
ON ss_customer_sk = c_customer_sk
WHERE c_customer_id = 'AAAAAAAAMCOOKLCA';
• store_sales contains 27.7B rows
• customer contains 65M rows
• Query result contains 334 rows
Performance analysis - results
Regular join Dynamic filtering improvement
Execution time 25 sec 0.9 sec x27 faster
CPU time 57.4 min 7.8 sec x440 lower
Peak total memory 261 MB 2.2 MB x118 lower
Data read (from connector) 258 GB 3.3 kB x78M lower
Tested on Varada cluster (with CBO enabled):
Up next in Presto improvements
• Distributed Joins - extend dynamic filtering
• Aggregation Pushdown
• Coordinator HA
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilDatabricks
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationDatabricks
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsAlluxio, Inc.
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceDatabricks
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Ryan Blue
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesNishith Agarwal
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationDatabricks
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Databricks
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeDatabricks
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path ForwardAlluxio, Inc.
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaDatabricks
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache CalciteJordan Halterman
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...HostedbyConfluent
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache SparkDatabricks
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsDatabricks
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkDatabricks
 

Was ist angesagt? (20)

Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical Optimization
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path Forward
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache Calcite
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark Metrics
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache Spark
 

Ähnlich wie Dynamic filtering for presto join optimisation

Advanced Analytics using Apache Hive
Advanced Analytics using Apache HiveAdvanced Analytics using Apache Hive
Advanced Analytics using Apache HiveMurtaza Doctor
 
James Jara Portfolio 2014 - Enterprise datagrid - Part 3
James Jara Portfolio 2014  - Enterprise datagrid - Part 3James Jara Portfolio 2014  - Enterprise datagrid - Part 3
James Jara Portfolio 2014 - Enterprise datagrid - Part 3James Jara
 
Mutable data @ scale
Mutable data @ scaleMutable data @ scale
Mutable data @ scaleOri Reshef
 
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...NGA Human Resources
 
Presentation v mware roi tco calculator
Presentation   v mware roi tco calculatorPresentation   v mware roi tco calculator
Presentation v mware roi tco calculatorsolarisyourep
 
SplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunk
 
Oracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data TemplateOracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data TemplateEdi Yanto
 
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...ijceronline
 
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...Data Con LA
 
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...Piyush Kumar
 
Sprint 50 review
Sprint 50 reviewSprint 50 review
Sprint 50 reviewManageIQ
 
Cognos framework manager
Cognos framework managerCognos framework manager
Cognos framework managermaxonlinetr
 
Sql query analyzer & maintenance
Sql query analyzer & maintenanceSql query analyzer & maintenance
Sql query analyzer & maintenancenspyrenet
 
Unifying your data management with Hadoop
Unifying your data management with HadoopUnifying your data management with Hadoop
Unifying your data management with HadoopJayant Shekhar
 
Maximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereMaximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereSAP Technology
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksGrega Kespret
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performanceGuy Harrison
 
Summer '23 Tips.pptx
Summer '23 Tips.pptxSummer '23 Tips.pptx
Summer '23 Tips.pptxCarl Brundage
 

Ähnlich wie Dynamic filtering for presto join optimisation (20)

Advanced Analytics using Apache Hive
Advanced Analytics using Apache HiveAdvanced Analytics using Apache Hive
Advanced Analytics using Apache Hive
 
Rough cut connect2-xyz
Rough cut connect2-xyzRough cut connect2-xyz
Rough cut connect2-xyz
 
James Jara Portfolio 2014 - Enterprise datagrid - Part 3
James Jara Portfolio 2014  - Enterprise datagrid - Part 3James Jara Portfolio 2014  - Enterprise datagrid - Part 3
James Jara Portfolio 2014 - Enterprise datagrid - Part 3
 
CIM MODULE 2.pptx
CIM MODULE 2.pptxCIM MODULE 2.pptx
CIM MODULE 2.pptx
 
Mutable data @ scale
Mutable data @ scaleMutable data @ scale
Mutable data @ scale
 
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
Top Tips for Getting the Best from SuccessFactors Q2 2016 Release Universal ...
 
Presentation v mware roi tco calculator
Presentation   v mware roi tco calculatorPresentation   v mware roi tco calculator
Presentation v mware roi tco calculator
 
SplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunkLive! Advanced Session
SplunkLive! Advanced Session
 
Oracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data TemplateOracle BI Publsiher Using Data Template
Oracle BI Publsiher Using Data Template
 
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
 
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
 
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
 
Sprint 50 review
Sprint 50 reviewSprint 50 review
Sprint 50 review
 
Cognos framework manager
Cognos framework managerCognos framework manager
Cognos framework manager
 
Sql query analyzer & maintenance
Sql query analyzer & maintenanceSql query analyzer & maintenance
Sql query analyzer & maintenance
 
Unifying your data management with Hadoop
Unifying your data management with HadoopUnifying your data management with Hadoop
Unifying your data management with Hadoop
 
Maximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL AnywhereMaximizing Database Tuning in SAP SQL Anywhere
Maximizing Database Tuning in SAP SQL Anywhere
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performance
 
Summer '23 Tips.pptx
Summer '23 Tips.pptxSummer '23 Tips.pptx
Summer '23 Tips.pptx
 

Kürzlich hochgeladen

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Dynamic filtering for presto join optimisation

  • 1. Dynamic filtering for Presto join optimisation Roman Zeyde Presto Conference Israel 2019
  • 2. Agenda Existing join optimization techniques Dynamic filtering description Implementation details Performance analysis
  • 3. Existing join optimization techniques Happen during planning phase: • Join reordering • Join distribution type (distributed vs. broadcast) Depend on cost-based optimizer (need column statistics) • Should be enabled via session parameters • Can be estimated using ANALYZE statement
  • 4. Example: join reordering SELECT * FROM items JOIN sales ON sales.item_id = items.id; Prefer keeping the "smaller" table on the right-hand side of the join: Join (item_id=id) Join (item_id=id) Scan sales Scan items Scan items Scan sales
  • 5. Example: broadcast join If the right-hand side table is "small", it can be replicated to all join workers - saving the CPU and network cost of left- hand side repartitioning: Join worker Join worker Join workerLeft-hand side Right-hand side
  • 6. Join worker Join worker Example: distributed join Otherwise, both tables are repartitioned using the join key, allowing joins with larger right-hand side tables: Join workerLeft-hand side Right-hand side
  • 7. Dynamic filtering - introduction Consider the following query: SELECT * FROM sales JOIN items ON sales.item_id = items.id WHERE items.price > 1000; Assumptions: ● sales table is large ● items scan results in a few rows (due to predicate pushdown) Most of the scanned sales rows will be discarded during the join (i.e. high selectivity). How can we optimize this use-case? Join (item_id=id) Scan sales Scan items [price>1000]
  • 8. Dynamic filtering - description 1. Collect relevant id values during items scan 2. Construct dynamic filter F using the collected ids 3. Apply dynamic predicate pushdown using F to sales scan Benefits: • Connector may optimize the scan given F • Most sales rows are not touched by Presto • CPU & network savings for large tables Requirements: • F cannot be too large (memory-wise) • F need to "back propagate" into sales scan in runtime Join (item_id=id) Scan items [price>1000] Scan sales [item_id∈F] (3) (1) (2) Construct dynamic filter F
  • 9. Implementation details - Qubole et al. Supports both distributed and broadcast joins, but requires significant changes in Presto: • Add plan nodes and optimizer rules for dynamic filter collection and application • New coordinator REST endpoint for dynamic filter collection from worker nodes. • Allow connectors to prune partitions during split generation (when dynamic filter is ready) More details can be found here: qubole.com/blog/sql-join-optimizations-qubole-presto (https://docs.google.com/document/d/1TOlxS8ZAXSIHR5ftHbPsgUkuUA-ky-odwmPdZARJrUQ)
  • 10. Implementation - Varada When broadcast join is used, sales' ScanFilterAndProject and items' HashBuilder operators run at the same process: • Add a "pass-through" operator to collect build-side ids. • When ready, pushdown the resulting predicate F into sales page source. No changes needed at the planner, optimizer and coordinator! Implemented as a patch on top of github.com/prestosql/presto (currently work-in- progress). ScanFilterAndProject sales [item_id∈F] ScanFilterAndProject items [price>1000] Exchange Exchange Collect F:=F∪{id} LookupJoin [item_id=id] HashBuilder [id] TaskOutput
  • 11. Performance analysis - benchmark Consider the following query (based on TPC-DS sf10000 dataset): SELECT ss_item_sk FROM store_sales JOIN customer ON ss_customer_sk = c_customer_sk WHERE c_customer_id = 'AAAAAAAAMCOOKLCA'; • store_sales contains 27.7B rows • customer contains 65M rows • Query result contains 334 rows
  • 12. Performance analysis - results Regular join Dynamic filtering improvement Execution time 25 sec 0.9 sec x27 faster CPU time 57.4 min 7.8 sec x440 lower Peak total memory 261 MB 2.2 MB x118 lower Data read (from connector) 258 GB 3.3 kB x78M lower Tested on Varada cluster (with CBO enabled):
  • 13. Up next in Presto improvements • Distributed Joins - extend dynamic filtering • Aggregation Pushdown • Coordinator HA

Hinweis der Redaktion

  1. CBO is supported today by Presto Hive connector (using Hive statistics).
  2. Since hash-join requires reading the right-hand side table into memory, we would like to estimate the expected sizes' and reorder the join accordingly. It can be done manually - or automatically (using CBO) via connector-provided statistics.
  3. Broadcast join optimization allows to save network cost for LHS repartitioning at the expense of RHS replication. Can be set manually, or via CBO (by enumerating the possible join types and choosing the one with lowest cost).
  4. If we knew the item IDs during the planning, we could use predicate pushdown to propagate them into the connector.
  5. So instead, we need to construct the predicate in run-time (before starting the LHS scan). We re-use existing predicate pushdown mechanism, which allows us to skip most of the LHS table (can be done efficiently in our case).
  6. The coordination problem is much simpler in this case.
  7. Note: we don't know the join key during the plan, so regular predicate pushdown doesn't work. There is a single customer that matches RHS, so the results are highly selective.
  8. Same query, same hardware - without / with dynamic filtering. These results show that dynamic filtering may significantly improve the performance of highly-selective queries, by making relatively small changes in Presto.
  9. We are planning to continue the work on dynamic filtering, as well as adding support for aggregation pushdown and coordinator high-availability.