Query Optimization
Succinctly: Making the execution of queries optimally fast
Brandon Latronica - 2017
The pathway of a database command:
Query?
Just a request for information from a database. We can use a language to make it, and the most famous one for a DBMS is SQL!
Parsing and translation:
Translate the query into its internal form, which is then translated into relational algebra. The parser checks syntax and verifies relations.
Relational Algebra:
The query syntax (SQL, etc.) is converted into some type of internal DBMS relational algebra. Why? An algebra, as in a computer algebra system (CAS), is easier for computers to manipulate, while words are better for humans!
Optimization:
Last stop. The best plan is determined and then pushed for execution via database calls, which results in the query output.
Basic Overview of Query Optimization
● Cost difference between evaluation plans for a query can be huge!
● Sometimes seconds vs. days!
● Steps in cost-based query optimization
1. Generate logically equivalent expressions using equivalence rules (like alternative car travel paths to the same destination)
2. Annotate resultant expressions to get alternative query plans
3. Choose the cheapest plan based on estimated cost
● Estimation of plan cost based on:
● Statistical information about relations, e.g. number of tuples, number of distinct values for an attribute, etc.
● Statistics estimation for intermediate results to compute cost of complex expressions
● Cost formulae for algorithms, computed using statistics
Seconds vs days?
Query Optimization - Yannis E. Ioannidis
Algebra on real numbers? We know that.
(x^2 + 2x - 8) / (x - 2) = ?
= (x - 2)(x + 4) / (x - 2)
= 1 * (x + 4)   (for x ≠ 2)
= (x + 4)
And so on...
(8,6) : (2,1)
From many terms and operations to few.
Boolean Algebra? We know that.
AB ∨ (BC(B ∨ C)) = ?
= AB ∨ BBC ∨ BCC [Distrib]
= AB ∨ BC ∨ BC [Idem]
= AB ∨ BC [Idem]
= B(A ∨ C) [reverse Distrib]
(6,5) : (3,2)
From many terms and operations to few.
Relational Algebra in a DB; some operators.
σ : Select operator, which keeps only the tuples that satisfy a specified filter condition.
Ex: σ_{name = "Dan"}(customer) selects all tuples in customer whose name matches "Dan".
Π : Project operator, which keeps only the specified attributes.
Ex: Π_{name, balance}(customer) shows all names and balances from customer.
In a relation table, a PROJECT eliminates columns while a SELECT eliminates rows!
Natural join (⋈) is a binary operator written as (R ⋈ S), where R and S are relations. The result of the natural join is the set of all combinations of tuples (that is, ordered lists) in R and S that are equal on their common attribute names.
Theta join (⋈_θ) is an operation that consists of all combinations of tuples in R and S that satisfy θ. The result of the θ-join is defined only if the headers of S and R are disjoint, that is, do not contain a common attribute.
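The select, project and natural join operators can be sketched in a few lines of Python. This is only an illustration: relations are modelled as lists of dicts, and the helper names (select, project, natural_join) and the sample customer and branch data are assumptions made for this example, not part of any DBMS.

```python
# Relations are modelled as lists of dicts (one dict per tuple).
# The helper names and the sample data are illustrative assumptions.

def select(relation, predicate):
    """σ: keep only the tuples (rows) that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Π: keep only the named attributes (columns), dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def natural_join(r, s):
    """⋈: combine tuples of r and s that agree on all common attribute names."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**a, **b} for a in r for b in s
            if all(a[c] == b[c] for c in common)]

customer = [
    {"name": "Dan", "balance": 120, "branch": "North"},
    {"name": "Eve", "balance": 300, "branch": "South"},
]
branch = [
    {"branch": "North", "city": "Oslo"},
    {"branch": "South", "city": "Rome"},
]

dans = select(customer, lambda row: row["name"] == "Dan")   # eliminates rows
names_balances = project(customer, ["name", "balance"])     # eliminates columns
joined = natural_join(customer, branch)                     # matches on "branch"
```

Here select drops the row for Eve, project drops the branch column, and the natural join pairs each customer with the branch tuple that shares its branch value.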
Algebra Transformations
Two relational algebra expressions are said to be equivalent if the two expressions generate
the same set of tuples on every legal database instance
● Note: order of tuples is irrelevant
● We don't care if they generate different results on databases that violate integrity constraints
An equivalence rule says that expressions of two forms are equivalent
● Can replace expression of first form by second, or vice versa
Equivalence rules - Relational Algebra
1. Conjunctive selection operations can be deconstructed into a sequence of
individual selections.
2. Selection operations are commutative.
3. Only the last in a sequence of projection operations is needed, the others can be
omitted.
4. Selections can be combined with Cartesian products and theta joins.
a. σ_θ(E1 × E2) = E1 ⋈_θ E2
b. σ_{θ1}(E1 ⋈_{θ2} E2) = E1 ⋈_{θ1 ∧ θ2} E2
Equivalence rules - more.
5. Theta-join operations (and natural joins) are commutative.
E1 ⋈_θ E2 = E2 ⋈_θ E1
6. (a) Natural join operations are associative:
(E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
(b) Theta joins are associative in the following manner:
(E1 ⋈_{θ1} E2) ⋈_{θ2 ∧ θ3} E3 = E1 ⋈_{θ1 ∧ θ3} (E2 ⋈_{θ2} E3)
where θ2 involves attributes from only E2 and E3.
And many more...
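A few of these rules can be checked on a small database instance. The sketch below is a spot check on one hypothetical instance, not a proof that holds for every legal instance; the relations E1 and E2, the predicates theta1 and theta2, and the helper names are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical sample relations with disjoint attribute names.
E1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
E2 = [{"c": 1, "d": 5}, {"c": 3, "d": 7}, {"c": 3, "d": 9}]

def select(relation, theta):
    return [t for t in relation if theta(t)]

def cartesian(r, s):
    return [{**x, **y} for x, y in product(r, s)]

def theta_join(r, s, theta):
    return [{**x, **y} for x in r for y in s if theta({**x, **y})]

def as_set(relation):
    """Order of tuples is irrelevant, so compare relations as sets."""
    return {frozenset(t.items()) for t in relation}

theta1 = lambda t: t["a"] == t["c"]
theta2 = lambda t: t["d"] > 5

# Rule 1: a conjunctive selection splits into a sequence of selections.
rule1_lhs = select(cartesian(E1, E2), lambda t: theta1(t) and theta2(t))
rule1_rhs = select(select(cartesian(E1, E2), theta2), theta1)

# Rule 2: selections commute.
rule2_lhs = select(select(cartesian(E1, E2), theta1), theta2)
rule2_rhs = select(select(cartesian(E1, E2), theta2), theta1)

# Rule 4a: a selection over a Cartesian product is a theta join.
rule4a_lhs = select(cartesian(E1, E2), theta1)
rule4a_rhs = theta_join(E1, E2, theta1)

# Rule 5: theta join is commutative (tuples compared as attribute/value sets).
rule5_lhs = theta_join(E1, E2, theta1)
rule5_rhs = theta_join(E2, E1, theta1)
```

Comparing each pair with as_set shows the two sides hold the same tuples on this instance, which is what lets an optimizer swap one form for the other.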
Pictorial Reduction Example
Π_{name, title}(σ_{dept_name = "Music" ∧ year = 2009}(instructor ⋈ (teaches ⋈ Π_{course_id, title}(course))))
Good ways to order the joins?
● For all relations r1, r2, and r3,
(r1 ⋈ r2) ⋈ r3 = r1 ⋈ (r2 ⋈ r3)
(Join Associativity)
● If r2 ⋈ r3 is quite large and r1 ⋈ r2 is small, we choose (r1 ⋈ r2) ⋈ r3
so that we compute and store a smaller temporary relation.
A good rule: avoid Cartesian products when searching for an optimal plan.
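The effect of join order on the temporary relation can be seen with a small experiment. The sketch below assumes hypothetical relations r1, r2 and r3 chosen so that r1 ⋈ r2 is small and r2 ⋈ r3 is large; the natural_join helper is an illustrative nested-loop implementation, not how a real DBMS evaluates joins.

```python
def natural_join(r, s):
    """Nested-loop natural join on all common attribute names."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**a, **b} for a in r for b in s
            if all(a[c] == b[c] for c in common)]

def as_set(relation):
    return {frozenset(t.items()) for t in relation}

# Hypothetical data: r1 matches few r2 tuples, while r2 and r3 match heavily.
r1 = [{"x": 0}]
r2 = [{"id": i, "x": i % 10, "y": 0} for i in range(100)]
r3 = [{"y": 0, "z": k} for k in range(50)]

left_first = natural_join(r1, r2)     # temporary relation of (r1 ⋈ r2) ⋈ r3
right_first = natural_join(r2, r3)    # temporary relation of r1 ⋈ (r2 ⋈ r3)

plan_a = natural_join(left_first, r3)
plan_b = natural_join(r1, right_first)

same_result = as_set(plan_a) == as_set(plan_b)
```

With this sample data the temporary relation holds 10 tuples when r1 ⋈ r2 is computed first and 5000 tuples when r2 ⋈ r3 is computed first, while both plans return the same 500 tuples.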
Counting alternative plans: How many?
● Query optimizers use equivalence rules to systematically generate expressions equivalent to
the given expression
● Can generate all equivalent expressions as follows:
● Repeat until no new equivalent expressions are generated:
*Apply all applicable equivalence rules on every subexpression of every equivalent
expression which is found.
*Add newly generated expressions to the set of equivalent expressions
● The above approach is expensive in terms of memory and compute time
● Two approaches:
-Optimized plan generation based on transformation rules.
-Special case approach for queries with only selections, projections and joins.
● Consider finding the best join order for r1 ⋈ r2 ⋈ . . . ⋈ rn.
● There are (2(n – 1))!/(n – 1)! different join orders (bushy trees included) for the above expression. With
n = 7, the number is 665,280; with n = 10, the number is greater than 17.6 billion!
● No need to generate all the join orders. Using dynamic programming, the least-cost join
order for any subset of {r1, r2, . . . rn} is computed only once and stored for future use.
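The growth of the join-order count is easy to evaluate directly. The short sketch below computes (2(n - 1))!/(n - 1)! for a few values of n; the function name join_order_count is an assumption made for this example.

```python
from math import factorial

def join_order_count(n):
    """Number of join orders for n relations: (2(n - 1))! / (n - 1)!"""
    return factorial(2 * (n - 1)) // factorial(n - 1)

# Evaluate the formula for a few relation counts.
counts = {n: join_order_count(n) for n in (3, 7, 10)}
```

For n = 3 this gives 12, for n = 7 it gives 665,280, and for n = 10 it gives 17,643,225,600.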
Dynamic programming? A method for solving a complex problem by breaking it down into
a collection of simpler subproblems, solving each of those subproblems just once, and
storing their solutions – ideally, using a memory-based data structure. The next time the
same subproblem occurs, instead of recomputing its solution, one simply looks up the
previously computed solution.
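A minimal sketch of dynamic programming over join orders follows. It rests on loudly simplified assumptions: the cardinalities in CARD and the selectivities in SEL are made-up statistics, the size estimate multiplies them naively, and the cost of a plan is taken to be the sum of its estimated intermediate result sizes. Real optimizers use far richer cost models; the point here is only that the best plan for each subset of relations is computed once and then reused.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical statistics: base cardinalities and pairwise join selectivities.
CARD = {"r1": 1000, "r2": 100, "r3": 10, "r4": 5000}
SEL = {frozenset(("r1", "r2")): 0.01,
       frozenset(("r2", "r3")): 0.1,
       frozenset(("r3", "r4")): 0.001}

def size(rels):
    """Estimated tuple count of joining all relations in rels."""
    n = 1.0
    for r in rels:
        n *= CARD[r]
    for pair in combinations(sorted(rels), 2):
        n *= SEL.get(frozenset(pair), 1.0)   # 1.0 means a Cartesian product
    return n

@lru_cache(maxsize=None)
def best_plan(rels):
    """Return (cost, plan) for a frozenset of relations; each subset is solved once."""
    if len(rels) == 1:
        (r,) = rels
        return 0.0, r
    best = None
    members = sorted(rels)
    for k in range(1, len(members)):
        for left in combinations(members, k):
            left = frozenset(left)
            right = rels - left
            lcost, lplan = best_plan(left)      # looked up if already computed
            rcost, rplan = best_plan(right)
            cost = lcost + rcost + size(rels)   # assumed cost: sum of result sizes
            if best is None or cost < best[0]:
                best = (cost, (lplan, rplan))
    return best

cost, plan = best_plan(frozenset(CARD))
```

The lru_cache decorator plays the role of the memory-based data structure: with four relations there are 15 non-empty subsets, and best_plan does real work for each of them exactly once, however many times a subset appears inside larger plans.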
Cost and choice.
Practical query optimizers incorporate elements of the following two broad approaches:
1. Search all the plans and choose the best plan in a cost-based fashion. Use dynamic
programming to store and recall previously found optimal plans and subplans!
2. Use heuristics to choose a plan.
Heuristics: Being Pragmatic
● Cost-based optimization is expensive, even with dynamic programming.
● Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion.
● Heuristic optimization transforms the query-tree by using a set of rules that typically (but not
in all cases) improve execution performance:
● Perform selection early (reduces the number of tuples)
● Perform projection early (reduces the number of attributes)
● Perform most restrictive selection and join operations (i.e. with smallest result size)
before other similar operations.
● Some systems use only heuristics, others combine heuristics with partial cost-based
optimization.
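The "perform selection early" heuristic can be illustrated with a small comparison. The sketch assumes hypothetical instructor and teaches data and uses illustrative nested-loop helpers; it shows only that both plans return the same tuples while the early-selection plan joins much smaller inputs.

```python
def select(relation, predicate):
    return [t for t in relation if predicate(t)]

def natural_join(r, s):
    """Nested-loop natural join on all common attribute names."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**a, **b} for a in r for b in s
            if all(a[c] == b[c] for c in common)]

# Hypothetical sample data loosely modelled on instructor / teaches.
instructor = [{"id": i, "dept_name": "Music" if i % 20 == 0 else "Physics"}
              for i in range(200)]
teaches = [{"id": i, "course_id": c, "year": 2009 if c == 0 else 2010}
           for i in range(200) for c in range(3)]

is_music = lambda t: t["dept_name"] == "Music"
in_2009 = lambda t: t["year"] == 2009

# Plan A: join first, select afterwards.
joined_all = natural_join(instructor, teaches)
plan_a = select(joined_all, lambda t: is_music(t) and in_2009(t))

# Plan B: perform the selections early, then join the smaller inputs.
music = select(instructor, is_music)
taught_2009 = select(teaches, in_2009)
plan_b = natural_join(music, taught_2009)
```

On this data Plan A builds a 600-tuple intermediate result before filtering it down to 10 tuples, whereas Plan B joins a 10-tuple input with a 200-tuple input and reaches the same 10 tuples directly.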
Statistical estimation of data size?
What if, on a particular database relation, we stored information regarding the content of
the table?
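As a sketch of the idea, the example below stores two simple statistics about a relation (the number of tuples and the number of distinct values of one attribute) and uses them to estimate the size of an equality selection without scanning the data. The estimate n_r / V(A, r) is a common textbook formula that assumes values are uniformly distributed; the customer data and the function names are assumptions made for illustration.

```python
# Hypothetical relation with 1000 tuples spread evenly over four cities.
CITIES = ["Oslo", "Rome", "Lima", "Kyiv"]
customer = [{"name": f"c{i}", "city": CITIES[i % 4]} for i in range(1000)]

def collect_stats(relation, attribute):
    """Gather n_r (tuple count) and V(A, r) (distinct values of the attribute)."""
    values = [t[attribute] for t in relation]
    return {"n_r": len(values), "distinct": len(set(values))}

def estimate_equality_selection(stats):
    """Estimated size of σ_{A = v}(r), assuming uniformly distributed values."""
    return stats["n_r"] / stats["distinct"]

stats = collect_stats(customer, "city")
estimate = estimate_equality_selection(stats)

# The true answer requires a full scan; the estimate needs only the statistics.
actual = sum(1 for t in customer if t["city"] == "Rome")
```

Because the sample data really is uniform, the estimate of 250 tuples matches the actual count; on skewed data the same formula can be far off, which is one reason real systems keep richer statistics.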
Database indices?
A database index is a data structure that improves the speed of
data retrieval operations on a database table at the cost of
additional writes and storage space to maintain the index data
structure. Indexes are used to quickly locate data without
having to search every row in a database table every time a
database table is accessed.
A B+ tree is an n-ary tree with a variable but often large
number of children per node. A B+ tree consists of a root,
internal nodes and leaves. The root may be either a leaf or a
node with two or more children.
A hash table is a data structure used to implement an
associative array, a structure that can map keys to values. A
hash table uses a hash function to compute an index into an
array of buckets or slots, from which the desired value can be
found.
B+ tree lookup: O(log n)
Hash table lookup: O(1) on average
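The difference between scanning a table and using an index can be sketched as follows. This is a simplification under stated assumptions: a sorted list searched with binary search stands in for the ordered, logarithmic lookup of a B+ tree (it is not a B+ tree implementation), and a Python dict stands in for a hash index. The table contents and function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical table: a list of records keyed by an even integer id.
table = [{"id": i * 2, "name": f"row{i}"} for i in range(1000)]

# Ordered index: sorted keys searched by binary search, O(log n) comparisons.
# This is a simplified stand-in for a B+ tree, not a B+ tree implementation.
sorted_keys = [row["id"] for row in table]   # already in ascending order

def lookup_ordered(key):
    pos = bisect_left(sorted_keys, key)
    if pos < len(sorted_keys) and sorted_keys[pos] == key:
        return table[pos]
    return None

# Hash index: a dict maps each key to its row position, O(1) on average.
hash_index = {row["id"]: pos for pos, row in enumerate(table)}

def lookup_hash(key):
    pos = hash_index.get(key)
    return table[pos] if pos is not None else None

def lookup_scan(key):
    """No index: examine every row until the key is found, O(n)."""
    for row in table:
        if row["id"] == key:
            return row
    return None
```

All three functions return the same record for a given key; the indexes pay for their speed with the extra structures (sorted_keys, hash_index) that must be stored and kept up to date on every write.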
Other Indices and reductions
An offset B+ tree is a type of index used to park new and updated data on a table set outside the body of the
main table. Offset table << main table! Used with columnar TBAT files.
Why not columnar DBs? TBATs are.

Query Optimization - Brandon Latronica

  • 1. Query Optimization Succinctly: Making the execution of queries optimally fast Brandon Latronica - 2017
  • 2. The pathway of a database command:
  • 3. The pathway of a database command: Query? Just a request information from a database. We can use a language to do it...name the most famous for a DBMS...SQL!
  • 4. The pathway of a database command: Parsing and translation: translate the query into its internal form. This is then translated into relational algebra. Parser checks syntax, verifies relations
  • 5. The pathway of a database command: Relational Algebra: The conversion of query syntax (SQL, etc) into some type of internal, DBMS relational algebra. Why? CAS systems are easier for computers, while words are better for humans!
  • 6. The pathway of a database command: Optimization: Last stop. The best plan is determined and then is pushed for execution via database calls which results in a query output.
  • 7. Basic Overview of Query Optimization ● Cost difference between evaluation plans for a query can be huge! ● Sometimes seconds vs. days! ● Steps in cost-based query optimization 1. Generate logically equivalent expressions using equivalence rules (Car travel paths) 2. Annotate resultant expressions to get alternative query plans 3. Choose the cheapest plan based on estimated cost ● Estimation of plan cost based on: ● Statistical information about relations. Eg: number of tuples, number of distinct values for an attribute,etc. ● Statistics estimation for intermediate results to compute cost of complex expressions ● Cost formulae for algorithms, computed using statistics
  • 8. Seconds vs days? Query Optimization - Yannis E. Ioannidis
  • 10. Algebra on real numbers? We know that. And so on... (x2 + 2x - 8) / (x - 2) = ? = (x - 2)(x + 4) / (x - 2) = 1 * (x + 4) = (x + 4) (8,6) : (2,1) Many terms and operations, to few.
  • 11. Boolean Algebra? We know that. AB V ( BC(B V C) ) = ? = AB V BBC V BCC [Distrib] = AB V BC V BC [Idem] = AB V BC [- Distrib] = B(A V C) (6,5) : (3,2) Many terms and operations, to few.
  • 12. Relational Algebra in a DB; some operators. σ : Select operator that keeps only the tuples satisfying a filter condition -- Ex: σname = "Dan"(customer) selects all tuples in customer whose name matches "Dan". Π : Project operator that keeps only the specified attributes -- Ex: Πname, balance(customer) shows all names and balances from customer. In a relation table, a PROJECT eliminates columns while SELECT eliminates rows! Natural join (⋈) is a binary operator written as (R ⋈ S), where R and S are relations. The result of the natural join is the set of all combinations of tuples (that is, ordered lists) in R and S that are equal on their common attribute names. Theta join (⋈θ) is an operation that consists of all combinations of tuples in R and S that satisfy θ. The result of the θ-join is defined only if the headers of S and R are disjoint, that is, do not contain a common attribute.
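To make the operators concrete, here is a minimal sketch that models a relation as a list of dicts. The helper names and the `orders` relation are assumptions for illustration; a real DBMS does not work this way internally.

```python
def select(relation, predicate):
    """σ: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Π: keep only the named columns, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(r, s):
    """⋈: combine rows that agree on all common attribute names."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**x, **y}
        for x in r
        for y in s
        if all(x[a] == y[a] for a in common)
    ]


customer = [
    {"name": "Dan", "balance": 120},
    {"name": "Ann", "balance": 75},
]
orders = [  # hypothetical second relation sharing the "name" attribute
    {"name": "Dan", "item": "book"},
    {"name": "Dan", "item": "pen"},
    {"name": "Ann", "item": "lamp"},
]

dans = select(customer, lambda row: row["name"] == "Dan")
names = project(customer, ["name"])
joined = natural_join(customer, orders)
print(dans, names, len(joined))
```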
  • 13. Algebra Transformations Two relational algebra expressions are said to be equivalent if the two expressions generate the same set of tuples on every legal database instance ● Note: order of tuples is irrelevant ● we don’t care if they generate different results on databases that violate integrity constraints An equivalence rule says that expressions of two forms are equivalent ● Can replace expression of first form by second, or vice versa
  • 14. Equivalence rules - Relational Algebra 1. Conjunctive selection operations can be deconstructed into a sequence of individual selections. 2. Selection operations are commutative. 3. Only the last in a sequence of projection operations is needed, the others can be omitted. 4. Selections can be combined with Cartesian products and theta joins. a. σθ(E1 × E2) = E1 ⋈θ E2 b. σθ1(E1 ⋈θ2 E2) = E1 ⋈θ1∧θ2 E2
  • 15. Equivalence rules - more. 5. Theta-join operations (and natural joins) are commutative. E1 ⋈θ E2 = E2 ⋈θ E1 6. (a) Natural join operations are associative: (E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3) (b) Theta joins are associative in the following manner: (E1 ⋈θ1 E2) ⋈θ2∧θ3 E3 = E1 ⋈θ1∧θ3 (E2 ⋈θ2 E3) where θ2 involves attributes from only E2 and E3. And many more...
  • 16. Pictorial Reduction Example Πname, title(σdept_name = “Music” ∧ year = 2009 (instructor ⋈ (teaches ⋈ Πcourse_id, title(course))))
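Rules 1, 2 and 4a can be spot-checked on a tiny instance. This is only a sanity check on one made-up database, not a proof; the relations `e1` and `e2`, the predicates, and the helper names are all assumptions for the example.

```python
from itertools import product


def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def cartesian(r, s):
    """E1 × E2, assuming the attribute names of r and s are disjoint."""
    return [{**x, **y} for x, y in product(r, s)]


def theta_join(r, s, theta):
    """Theta join: pairs of tuples from r and s that satisfy theta."""
    return [{**x, **y} for x, y in product(r, s) if theta({**x, **y})]


def as_set(relation):
    # Order of tuples is irrelevant, so compare relations as sets.
    return {tuple(sorted(row.items())) for row in relation}


e1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
e2 = [{"c": 1, "d": "x"}, {"c": 3, "d": "y"}]


def theta1(row):
    return row["a"] >= 2


def theta2(row):
    return row["b"] < 30


def theta(row):
    return row["a"] == row["c"]


# Rule 1: a conjunctive selection equals a sequence of selections.
rule1 = as_set(select(e1, lambda t: theta1(t) and theta2(t))) == as_set(
    select(select(e1, theta2), theta1)
)
# Rule 2: selections commute.
rule2 = as_set(select(select(e1, theta1), theta2)) == as_set(
    select(select(e1, theta2), theta1)
)
# Rule 4a: selection over a Cartesian product equals a theta join.
rule4a = as_set(select(cartesian(e1, e2), theta)) == as_set(
    theta_join(e1, e2, theta)
)
print(rule1, rule2, rule4a)
```

Note that the theta join in rule 4a never has to materialize the full Cartesian product in a real system, which is why the right-hand form is usually the cheaper one.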
  • 17. Good ways to order the joins? ● For all relations r1, r2, and r3, (r1 ⋈ r2) ⋈ r3 = r1 ⋈ (r2 ⋈ r3) (Join Associativity) ● If r2 ⋈ r3 is quite large and r1 ⋈ r2 is small, we choose (r1 ⋈ r2) ⋈ r3 so that we compute and store a smaller temporary relation. A good rule of thumb is to avoid Cartesian products when searching for an optimal plan.
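The effect of join order on the temporary relation can be seen on a small synthetic instance. A sketch under assumptions: the three relations and their attribute names are invented so that r1 ⋈ r2 is small and r2 ⋈ r3 is large.

```python
def natural_join(r, s):
    """Combine rows that agree on all common attribute names."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**x, **y}
        for x in r
        for y in s
        if all(x[a] == y[a] for a in common)
    ]


def as_set(relation):
    # Order of tuples is irrelevant, so compare relations as sets.
    return {tuple(sorted(row.items())) for row in relation}


# Hypothetical relations: r1(a, b), r2(b, c), r3(c, d).
r1 = [{"a": 1, "b": 1}]                                           # 1 tuple
r2 = [{"b": i % 10, "c": i} for i in range(100)]                  # 100 tuples
r3 = [{"c": c, "d": d} for c in range(100) for d in range(5)]     # 500 tuples

r12 = natural_join(r1, r2)   # temporary relation for (r1 join r2) join r3
r23 = natural_join(r2, r3)   # temporary relation for r1 join (r2 join r3)

left_first = natural_join(r12, r3)
right_first = natural_join(r1, r23)

print(len(r12), len(r23), len(left_first))
```

Both orders give the same final result, but the first stores a temporary relation fifty times smaller than the second.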
  • 18. Counting alternative plans: How many? ● Query optimizers use equivalence rules to systematically generate expressions equivalent to the given expression ● Can generate all equivalent expressions as follows: ● Repeat until no new equivalent expressions are generated: *Apply all applicable equivalence rules on every subexpression of every equivalent expression which is found. *Add newly generated expressions to the set of equivalent expressions ● The above approach is expensive in terms of memory and compute time ● Two approaches: -Optimized plan generation based on transformation rules. -Special case approach for queries with only selections, projections and joins.
  • 19. Cost and choice. ● Consider finding the best join order for r1 ⋈ r2 ⋈ . . . ⋈ rn. ● Counting bushy trees, there are (2(n – 1))!/(n – 1)! different join orders for the above expression. With n = 7, the number is 665280; with n = 10, the number is greater than 17.6 billion! ● No need to generate all the join orders. Using dynamic programming, the least-cost join order for any subset of {r1, r2, . . . rn} is computed only once and stored for future use. Dynamic programming? A method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions, ideally in a memory-based data structure. The next time the same subproblem occurs, instead of recomputing its solution, one simply looks up the previously computed solution. Practical query optimizers incorporate elements of the following two broad approaches: 1. Search all the plans and choose the best plan in a cost-based fashion. Use dynamic programming to store and recall previously found optimal plans and subplans! 2. Use heuristics to choose a plan.
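The count formula and the dynamic-programming idea can both be sketched briefly. The cost model here is an assumption made for illustration (cost of a plan = sum of the estimated sizes of all join results it produces), and the `SIZES` and `SELECTIVITY` tables are invented; `lru_cache` plays the role of the stored table of best subplans.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations
from math import factorial


def join_orders(n):
    """Number of join orders for n relations: (2(n - 1))! / (n - 1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


# Hypothetical statistics: tuple counts and pairwise join selectivities.
# A pair with no entry is treated as a Cartesian product (selectivity 1).
SIZES = {"r1": 10, "r2": 1000, "r3": 1000}
SELECTIVITY = {
    frozenset({"r1", "r2"}): Fraction(1, 1000),
    frozenset({"r2", "r3"}): Fraction(1, 100),
}


def result_size(rels):
    """Estimated number of tuples when all relations in rels are joined."""
    size = Fraction(1)
    for name in rels:
        size *= SIZES[name]
    for pair in combinations(sorted(rels), 2):
        size *= SELECTIVITY.get(frozenset(pair), Fraction(1))
    return size


@lru_cache(maxsize=None)
def best_plan(rels):
    """Return (cost, join tree) for a frozenset of relation names.

    Each subset is solved once; repeated requests hit the cache.
    """
    if len(rels) == 1:
        (name,) = rels
        return Fraction(0), name
    members = sorted(rels)
    best = None
    for k in range(1, len(members)):
        for left_names in combinations(members, k):
            left = frozenset(left_names)
            right = rels - left
            left_cost, left_tree = best_plan(left)
            right_cost, right_tree = best_plan(right)
            cost = left_cost + right_cost + result_size(rels)
            if best is None or cost < best[0]:
                best = (cost, (left_tree, right_tree))
    return best


print(join_orders(7), join_orders(10))
best_cost, best_tree = best_plan(frozenset(SIZES))
print(best_cost, best_tree)
```

Under these made-up statistics the chosen tree joins r1 with r2 first, since that pair produces the smallest intermediate result, and it never picks the r1, r3 Cartesian product.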
  • 20. Heuristics: Being Pragmatic ● Cost-based optimization is expensive, even with dynamic programming. ● Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion. ● Heuristic optimization transforms the query-tree by using a set of rules that typically (but not in all cases) improve execution performance: ● Perform selection early (reduces the number of tuples) ● Perform projection early (reduces the number of attributes) ● Perform most restrictive selection and join operations (i.e. with smallest result size) before other similar operations. ● Some systems use only heuristics, others combine heuristics with partial cost-based optimization.
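The "perform selection early" heuristic can be illustrated by counting the tuple comparisons a naive nested-loop join makes. A sketch with invented data: the `instructor` and `teaches` contents and the comparison counter are assumptions for the example.

```python
def nested_loop_join(r, s, key):
    """Join r and s on key, counting how many tuple pairs are compared."""
    comparisons = 0
    out = []
    for x in r:
        for y in s:
            comparisons += 1
            if x[key] == y[key]:
                out.append({**x, **y})
    return out, comparisons


# Hypothetical data: 100 instructors (10 in Music), 300 teaching records.
instructor = [
    {"id": i, "dept_name": "Music" if i % 10 == 0 else "Physics"}
    for i in range(100)
]
teaches = [{"id": i % 100, "course_id": f"C{i}"} for i in range(300)]

# Selection late: join everything, then filter.
joined, late_comparisons = nested_loop_join(instructor, teaches, "id")
late = [row for row in joined if row["dept_name"] == "Music"]

# Selection early: filter first, then join the smaller input.
music = [row for row in instructor if row["dept_name"] == "Music"]
early, early_comparisons = nested_loop_join(music, teaches, "id")

print(late_comparisons, early_comparisons, late == early)
```

Both plans return the same rows, but filtering first cuts the join work by a factor of ten here. As the slide says, such rules typically help but not in every case.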
  • 21. Statistical estimation of data size? What if, on a particular database relation, we stored information regarding the content of the table?
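One way to picture this: store the number of tuples and the number of distinct values per attribute, then estimate the size of an equality selection as tuples divided by distinct values. This is a common textbook estimate that assumes values are uniformly distributed; the `customer` data and function names below are made up for the sketch.

```python
def collect_stats(relation, attribute):
    """Gather the two statistics used by the estimate below."""
    return {
        "num_tuples": len(relation),
        "distinct": len({row[attribute] for row in relation}),
    }


def estimate_equality_selection(stats):
    # Assumes each distinct value is equally frequent.
    return stats["num_tuples"] / stats["distinct"]


cities = ["Oslo", "Rome", "Lima", "Kyiv"]
customer = [{"name": f"c{i}", "city": cities[i % 4]} for i in range(1000)]

stats = collect_stats(customer, "city")
estimate = estimate_equality_selection(stats)
actual = sum(1 for row in customer if row["city"] == "Rome")
print(estimate, actual)
```

The estimate matches exactly here only because the invented data is perfectly uniform; on skewed data it can be far off, which is one reason real systems may keep richer statistics.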
  • 22. Database indices? A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed. A B+ tree is an n-ary tree with a variable but often large number of children per node. A B+ tree consists of a root, internal nodes and leaves. The root may be either a leaf or a node with two or more children. A hash table is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found. Lookup cost: O(log n) for a B+ tree, O(1) on average for a hash table.
  • 23. Other Indices and reductions An Offset B+-tree is a type of index used to park new and updated data in an offset table kept outside the body of the main table. Offset table << Main Table! Used with columnar TBAT files. Why not columnar DBs? TBAT files are.
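The two lookup costs can be illustrated with standard-library stand-ins. Note the substitution: a sorted list searched with `bisect` is used here in place of a B+ tree (both give logarithmic ordered lookup), and a Python `dict` is used as the hash index. The `rows` table is invented for the example.

```python
from bisect import bisect_left

# Hypothetical table, stored in id order.
rows = [{"id": i * 2, "name": f"user{i}"} for i in range(1000)]

# Ordered index: sorted keys, searched by binary search in O(log n).
sorted_keys = [row["id"] for row in rows]


def ordered_lookup(key):
    pos = bisect_left(sorted_keys, key)
    if pos < len(sorted_keys) and sorted_keys[pos] == key:
        return rows[pos]
    return None


# Hash index: key -> row position, O(1) on average.
hash_index = {row["id"]: pos for pos, row in enumerate(rows)}


def hash_lookup(key):
    pos = hash_index.get(key)
    return None if pos is None else rows[pos]


def scan_lookup(key):
    # No index: inspect every row until a match is found, O(n).
    for row in rows:
        if row["id"] == key:
            return row
    return None


print(ordered_lookup(500), hash_lookup(500), scan_lookup(500))
```

All three return the same row; only the amount of work differs. An ordered index can also answer range queries, which a plain hash index cannot.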