SlideShare ist ein Scribd-Unternehmen logo
1 von 34
Query Processing System
QUERY
 Overview
 Measures of Query Cost
 Selection Operation
 Sorting
 Join Operation
 Other Operations
 Evaluation of Expressions
 Catalog Information for Cost Estimation
 Estimation of Statistics
 Transformation of Relational Expressions
 Dynamic Programming for Choosing Evaluation Plans
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Cont…
• Parsing and translation
– Translate the query into its internal form. This is then translated
into relational algebra.
– Parser checks syntax, verifies relations
• Evaluation
– The query-execution engine takes a query-evaluation plan,
executes that plan, and returns the answers to the query.
Query Optimization
 Amongst all equivalent evaluation plans choose the one with
lowest cost.
 Cost is estimated using statistical information from the
database catalog e.g. number of tuples in each relation, size of tuples, etc.
 Cost is generally measured as total elapsed time for
answering query
 Number of seeks * average-seek-cost
 Number of blocks read * average-block-read-cost
 Number of blocks written * average-block-write-cost
Measures of Query Cost
 Costs depends on the size of the buffer in main memory
 Having more memory reduces need for disk access
 Amount of real memory available to buffer depends on other
concurrent OS processes, and hard to determine ahead of actual
execution
 We often use worst case estimates, assuming only the minimum
amount of memory needed for the operation is available
 Real systems take CPU cost into account, differentiate
between sequential and random I/O, and take buffer size
into account
Selection Operation
 File scan – search algorithms that locate and retrieve records
that fulfill a selection condition.
 Algorithm A1 (linear search). Scan each file block and test all
records to see whether they satisfy the selection condition.
 A2 (binary search). Applicable if selection is an equality
comparison on the attribute on which file is ordered.
 Index scan – search algorithms that use an index
 selection condition must be on search-key of index.
Cont…
• A3 (primary index on candidate key, equality). Retrieve a
single record that satisfies the corresponding equality
condition
• A4 (primary index on nonkey, equality) Retrieve multiple
records.
• A5 (equality on search-key of secondary index).
• A6 (primary index, comparison). (Relation is sorted on A)
• A7 (secondary index, comparison).
Cont…
• Conjunction: σθ1∧ θ2∧. . . θn(r)
• A8 (conjunctive selection using one index).
• A9 (conjunctive selection using multiple-key index).
• A10 (conjunctive selection by intersection of identifiers).
• Disjunction:σθ1∨ θ2∨. . . θn(r).
• A11 (disjunctive selection by union of identifiers).
• Negation: σ¬θ(r)
Sorting
 We may build an index on the relation, and then use the index
to read the relation in sorted order. May lead to one disk
block access for each tuple.
 For relations that fit in memory, techniques like quicksort can
be used. For relations that don’t fit in memory, external
sort-merge is a good choice.
External Sorting Using Sort-Merge
 Create sorted
runs.
 Merge the runs
(N-way merge).
Join Operation
Several different algorithms to implement
joins
 Nested-loop join
 Block nested-loop join
 Indexed nested-loop join
 Merge-join
 Hash-join
Nested-Loop Join
• To compute the theta join r θ s
for each tuple tr in r do begin
for each tuple ts in s do begin
test pair (tr,ts) to see if they satisfy the join condition θ
if they do, add tr • ts to the result.
end
end
• r is called the outer relation and s the inner relation of the join.
• Requires no indices and can be used with any kind of join condition.
• Expensive since it examines every pair of tuples in the two relations.
Block Nested-Loop Join
• Variant of nested-loop join in which every block of inner relation is paired
with every block of outer relation.
for each block Br of r do begin
for each block Bs of s do begin
for each tuple tr in Br do begin
for each tuple ts in Bs do begin
Check if (tr,ts) satisfy the join condition
if they do, add tr• ts to the result.
end
end
end
end
Indexed Nested-Loop Join
 Index lookups can replace file scans if
 Join is an equi-join or natural join and
 An index is available on the inner relation’s join attribute
▪ Can construct an index just to compute a join
 For each tuple tr in the outer relation r, use the index to
look up tuples in s that satisfy the join condition with
tuple tr.
 Worst case: buffer has space for only one page of r, and,
for each tuple in r, we perform an index lookup on s.
Merge-Join
1. Sort both relations on their join
attribute (if not already sorted
on the join attributes).
2. Merge the sorted relations to
join them
1. Join step is similar to the merge
stage of the sort-merge algorithm.
2. Main difference is handling of
duplicate values in join attribute
— every pair with same value on
join attribute must be matched
3. Detailed algorithm in book
Hash-Join
 Applicable for equi-
joins and natural
joins.
 A hash function h is
used to partition
tuples of both
relations
Evaluation of Expressions
• Alternatives for evaluating an entire expression
tree
– Materialization: generate results of an expression
whose inputs are relations or are already computed,
materialize (store) it on disk. Repeat.
– Pipelining: pass on tuples to parent operations even
as an operation is being executed
Materialization
 Materialized evaluation: evaluate
one operation at a time, starting at
the lowest-level. Use intermediate
results materialized into temporary
relations to evaluate next-level
operations.
 E.g., in figure below, compute and
store
then compute the store its join with
customer, and finally compute the
projections on customer-name.
)(2500 accountbalance<σ
Pipelining
 Pipelined evaluation : evaluate several operations
simultaneously, passing the results of one operation on to the
next.
 Much cheaper than materialization: no need to store a
temporary relation to disk.
 Pipelining may not always be possible – e.g., sort, hash-join.
 Pipelines can be executed in two ways: demand driven and
producer driven
Demand driven or lazy evaluation
• System repeatedly requests next tuple from top level
operation
• Each operation requests next tuple from children
operations as required, in order to output its next tuple
• In between calls, operation has to maintain “state” so it
knows what to return next
• Each operation is implemented as an iterator
implementing the following operations
Cont…
 open()
▪ E.g. file scan: initialize file scan, store pointer to beginning of file
as state
▪ E.g.merge join: sort relations and store pointers to beginning of
sorted relations as state
 next()
▪ E.g. for file scan: Output next tuple, and advance and store file
pointer
▪ E.g. for merge join: continue with merge from earlier state till
next output tuple is found. Save pointers as iterator state.
 close()
Evaluation Plan
• An evaluation plan defines exactly what algorithm is used for each
operation, and how the execution of the operations is coordinated.
Transformation of Relational
Expressions
Pictorial Depiction of Equivalence
Rules
Cont…
Cont…
Cont…
Cont…
Heuristic Optimization
• Cost-based optimization is expensive, even with
dynamic programming.
• Systems may use heuristics to reduce the number of
choices that must be made in a cost-based fashion.
• Heuristic optimization transforms the query-tree by
using a set of rules that typically (but not in all cases)
improve execution performance
Cont…
 Perform selection early (reduces the number of tuples)
 Perform projection early (reduces the number of
attributes)
 Perform most restrictive selection and join operations
before other similar operations.
 Some systems use only heuristics, others combine
heuristics with partial cost-based optimization.
Steps in Typical Heuristic Optimization
1. Deconstruct conjunctive selections into a sequence of single
selection operations (Equiv. rule 1.).
2. Move selection operations down the query tree for the
earliest possible execution (Equiv. rules 2, 7a, 7b, 11).
3. Execute first those selection and join operations that will
produce the smallest relations (Equiv. rule 6).
Cont…
4. Replace Cartesian product operations that are followed by a
selection condition by join operations (Equiv. rule 4a).
5. Deconstruct and move as far down the tree as possible lists of
projection attributes, creating new projections where needed (Equiv.
rules 3, 8a, 8b, 12).
6. Identify those subtrees whose operations can be pipelined, and
execute them using pipelining).

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluation
 
Query processing and Query Optimization
Query processing and Query OptimizationQuery processing and Query Optimization
Query processing and Query Optimization
 
Query processing-and-optimization
Query processing-and-optimizationQuery processing-and-optimization
Query processing-and-optimization
 
Query compiler
Query compilerQuery compiler
Query compiler
 
Query optimization
Query optimizationQuery optimization
Query optimization
 
SQL: Query optimization in practice
SQL: Query optimization in practiceSQL: Query optimization in practice
SQL: Query optimization in practice
 
Query-porcessing-& Query optimization
Query-porcessing-& Query optimizationQuery-porcessing-& Query optimization
Query-porcessing-& Query optimization
 
Query optimisation
Query optimisationQuery optimisation
Query optimisation
 
14. Query Optimization in DBMS
14. Query Optimization in DBMS14. Query Optimization in DBMS
14. Query Optimization in DBMS
 
Heuristic approch monika sanghani
Heuristic approch  monika sanghaniHeuristic approch  monika sanghani
Heuristic approch monika sanghani
 
Ch13
Ch13Ch13
Ch13
 
U nit i data structure-converted
U nit   i data structure-convertedU nit   i data structure-converted
U nit i data structure-converted
 
Unit ii data structure-converted
Unit  ii data structure-convertedUnit  ii data structure-converted
Unit ii data structure-converted
 
Algorithm analysis in fundamentals of data structure
Algorithm analysis in fundamentals of data structureAlgorithm analysis in fundamentals of data structure
Algorithm analysis in fundamentals of data structure
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query Processing
 
ch13
ch13ch13
ch13
 
ADS Introduction
ADS IntroductionADS Introduction
ADS Introduction
 
Query Optimization - Brandon Latronica
Query Optimization - Brandon LatronicaQuery Optimization - Brandon Latronica
Query Optimization - Brandon Latronica
 
Query processing
Query processingQuery processing
Query processing
 
Data structures and algorithms
Data structures and algorithmsData structures and algorithms
Data structures and algorithms
 

Ähnlich wie Query processing System

RDBMS
RDBMSRDBMS
RDBMSsowfi
 
Implementation of query optimization for reducing run time
Implementation of query optimization for reducing run timeImplementation of query optimization for reducing run time
Implementation of query optimization for reducing run timeAlexander Decker
 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization Hafiz faiz
 
Query Processing, Query Optimization and Transaction
Query Processing, Query Optimization and TransactionQuery Processing, Query Optimization and Transaction
Query Processing, Query Optimization and TransactionPrabu U
 
Algorithm ch13.ppt
Algorithm ch13.pptAlgorithm ch13.ppt
Algorithm ch13.pptDreamless2
 
Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...
Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...
Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...Spark Summit
 
Tech Talk - JPA and Query Optimization - publish
Tech Talk  -  JPA and Query Optimization - publishTech Talk  -  JPA and Query Optimization - publish
Tech Talk - JPA and Query Optimization - publishGleydson Lima
 
Design and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxDesign and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxSyed Zaid Irshad
 
Korth_Query_processing.pptx
Korth_Query_processing.pptxKorth_Query_processing.pptx
Korth_Query_processing.pptxDrSonuMittal
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkSaurav Jha
 
Intro to Data Structure & Algorithms
Intro to Data Structure & AlgorithmsIntro to Data Structure & Algorithms
Intro to Data Structure & AlgorithmsAkhil Kaushik
 

Ähnlich wie Query processing System (20)

Rdbms
RdbmsRdbms
Rdbms
 
28890 lecture10
28890 lecture1028890 lecture10
28890 lecture10
 
RDBMS
RDBMSRDBMS
RDBMS
 
Implementation of query optimization for reducing run time
Implementation of query optimization for reducing run timeImplementation of query optimization for reducing run time
Implementation of query optimization for reducing run time
 
Join operation
Join operationJoin operation
Join operation
 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization
 
Query Processing, Query Optimization and Transaction
Query Processing, Query Optimization and TransactionQuery Processing, Query Optimization and Transaction
Query Processing, Query Optimization and Transaction
 
Algorithm ch13.ppt
Algorithm ch13.pptAlgorithm ch13.ppt
Algorithm ch13.ppt
 
Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...
Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...
Large-Scale Text Processing Pipeline with Spark ML and GraphFrames: Spark Sum...
 
Data structure
Data structureData structure
Data structure
 
UNIT 1.pptx
UNIT 1.pptxUNIT 1.pptx
UNIT 1.pptx
 
Hadoop map reduce concepts
Hadoop map reduce conceptsHadoop map reduce concepts
Hadoop map reduce concepts
 
Tech Talk - JPA and Query Optimization - publish
Tech Talk  -  JPA and Query Optimization - publishTech Talk  -  JPA and Query Optimization - publish
Tech Talk - JPA and Query Optimization - publish
 
Design and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxDesign and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptx
 
Query optimization
Query optimizationQuery optimization
Query optimization
 
Chap5 slides
Chap5 slidesChap5 slides
Chap5 slides
 
Korth_Query_processing.pptx
Korth_Query_processing.pptxKorth_Query_processing.pptx
Korth_Query_processing.pptx
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural Network
 
Intro to Data Structure & Algorithms
Intro to Data Structure & AlgorithmsIntro to Data Structure & Algorithms
Intro to Data Structure & Algorithms
 
Sorting algorithm
Sorting algorithmSorting algorithm
Sorting algorithm
 

Kürzlich hochgeladen

Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...Nguyen Thanh Tu Collection
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsPooky Knightsmith
 
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxMan or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxDhatriParmar
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdfMr Bounab Samir
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationdeepaannamalai16
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDhatriParmar
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseCeline George
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvRicaMaeCastro1
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptxmary850239
 

Kürzlich hochgeladen (20)

Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young minds
 
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxMan or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx
 

Query processing System

  • 2. QUERY  Overview  Measures of Query Cost  Selection Operation  Sorting  Join Operation  Other Operations  Evaluation of Expressions  Catalog Information for Cost Estimation  Estimation of Statistics  Transformation of Relational Expressions  Dynamic Programming for Choosing Evaluation Plans
  • 3. Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation
  • 4. Cont… • Parsing and translation – Translate the query into its internal form. This is then translated into relational algebra. – Parser checks syntax, verifies relations • Evaluation – The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.
  • 5. Query Optimization  Amongst all equivalent evaluation plans choose the one with lowest cost.  Cost is estimated using statistical information from the database catalog e.g. number of tuples in each relation, size of tuples, etc.  Cost is generally measured as total elapsed time for answering query  Number of seeks * average-seek-cost  Number of blocks read * average-block-read-cost  Number of blocks written * average-block-write-cost
  • 6. Measures of Query Cost  Costs depends on the size of the buffer in main memory  Having more memory reduces need for disk access  Amount of real memory available to buffer depends on other concurrent OS processes, and hard to determine ahead of actual execution  We often use worst case estimates, assuming only the minimum amount of memory needed for the operation is available  Real systems take CPU cost into account, differentiate between sequential and random I/O, and take buffer size into account
  • 7. Selection Operation  File scan – search algorithms that locate and retrieve records that fulfill a selection condition.  Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.  A2 (binary search). Applicable if selection is an equality comparison on the attribute on which file is ordered.  Index scan – search algorithms that use an index  selection condition must be on search-key of index.
  • 8. Cont… • A3 (primary index on candidate key, equality). Retrieve a single record that satisfies the corresponding equality condition • A4 (primary index on nonkey, equality) Retrieve multiple records. • A5 (equality on search-key of secondary index). • A6 (primary index, comparison). (Relation is sorted on A) • A7 (secondary index, comparison).
  • 9. Cont… • Conjunction: σθ1∧ θ2∧. . . θn(r) • A8 (conjunctive selection using one index). • A9 (conjunctive selection using multiple-key index). • A10 (conjunctive selection by intersection of identifiers). • Disjunction:σθ1∨ θ2∨. . . θn(r). • A11 (disjunctive selection by union of identifiers). • Negation: σ¬θ(r)
  • 10. Sorting  We may build an index on the relation, and then use the index to read the relation in sorted order. May lead to one disk block access for each tuple.  For relations that fit in memory, techniques like quicksort can be used. For relations that don’t fit in memory, external sort-merge is a good choice.
  • 11. External Sorting Using Sort-Merge  Create sorted runs.  Merge the runs (N-way merge).
  • 12. Join Operation Several different algorithms to implement joins  Nested-loop join  Block nested-loop join  Indexed nested-loop join  Merge-join  Hash-join
  • 13. Nested-Loop Join • To compute the theta join r θ s for each tuple tr in r do begin for each tuple ts in s do begin test pair (tr,ts) to see if they satisfy the join condition θ if they do, add tr • ts to the result. end end • r is called the outer relation and s the inner relation of the join. • Requires no indices and can be used with any kind of join condition. • Expensive since it examines every pair of tuples in the two relations.
  • 14. Block Nested-Loop Join • Variant of nested-loop join in which every block of inner relation is paired with every block of outer relation. for each block Br of r do begin for each block Bs of s do begin for each tuple tr in Br do begin for each tuple ts in Bs do begin Check if (tr,ts) satisfy the join condition if they do, add tr• ts to the result. end end end end
  • 15. Indexed Nested-Loop Join  Index lookups can replace file scans if  Join is an equi-join or natural join and  An index is available on the inner relation’s join attribute ▪ Can construct an index just to compute a join  For each tuple tr in the outer relation r, use the index to look up tuples in s that satisfy the join condition with tuple tr.  Worst case: buffer has space for only one page of r, and, for each tuple in r, we perform an index lookup on s.
  • 16. Merge-Join 1. Sort both relations on their join attribute (if not already sorted on the join attributes). 2. Merge the sorted relations to join them 1. Join step is similar to the merge stage of the sort-merge algorithm. 2. Main difference is handling of duplicate values in join attribute — every pair with same value on join attribute must be matched 3. Detailed algorithm in book
  • 17. Hash-Join  Applicable for equi- joins and natural joins.  A hash function h is used to partition tuples of both relations
  • 18. Evaluation of Expressions • Alternatives for evaluating an entire expression tree – Materialization: generate results of an expression whose inputs are relations or are already computed, materialize (store) it on disk. Repeat. – Pipelining: pass on tuples to parent operations even as an operation is being executed
  • 19. Materialization  Materialized evaluation: evaluate one operation at a time, starting at the lowest-level. Use intermediate results materialized into temporary relations to evaluate next-level operations.  E.g., in figure below, compute and store then compute the store its join with customer, and finally compute the projections on customer-name. )(2500 accountbalance<σ
  • 20. Pipelining  Pipelined evaluation : evaluate several operations simultaneously, passing the results of one operation on to the next.  Much cheaper than materialization: no need to store a temporary relation to disk.  Pipelining may not always be possible – e.g., sort, hash-join.  Pipelines can be executed in two ways: demand driven and producer driven
  • 21. Demand driven or lazy evaluation • System repeatedly requests next tuple from top level operation • Each operation requests next tuple from children operations as required, in order to output its next tuple • In between calls, operation has to maintain “state” so it knows what to return next • Each operation is implemented as an iterator implementing the following operations
  • 22. Cont…  open() ▪ E.g. file scan: initialize file scan, store pointer to beginning of file as state ▪ E.g.merge join: sort relations and store pointers to beginning of sorted relations as state  next() ▪ E.g. for file scan: Output next tuple, and advance and store file pointer ▪ E.g. for merge join: continue with merge from earlier state till next output tuple is found. Save pointers as iterator state.  close()
  • 23. Evaluation Plan • An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.
  • 25. Pictorial Depiction of Equivalence Rules
  • 27.
  • 31. Heuristic Optimization • Cost-based optimization is expensive, even with dynamic programming. • Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion. • Heuristic optimization transforms the query-tree by using a set of rules that typically (but not in all cases) improve execution performance
  • 32. Cont…  Perform selection early (reduces the number of tuples)  Perform projection early (reduces the number of attributes)  Perform most restrictive selection and join operations before other similar operations.  Some systems use only heuristics, others combine heuristics with partial cost-based optimization.
  • 33. Steps in Typical Heuristic Optimization 1. Deconstruct conjunctive selections into a sequence of single selection operations (Equiv. rule 1.). 2. Move selection operations down the query tree for the earliest possible execution (Equiv. rules 2, 7a, 7b, 11). 3. Execute first those selection and join operations that will produce the smallest relations (Equiv. rule 6).
  • 34. Cont… 4. Replace Cartesian product operations that are followed by a selection condition by join operations (Equiv. rule 4a). 5. Deconstruct and move as far down the tree as possible lists of projection attributes, creating new projections where needed (Equiv. rules 3, 8a, 8b, 12). 6. Identify those subtrees whose operations can be pipelined, and execute them using pipelining).

Hinweis der Redaktion

  1. Applicable for equi-joins and natural joins. A hash function h is used to partition tuples of both relations