Indian PUG 2015
11th April
Kumar Rajeev Rastogi
Prasanna
Optimizer Hint
Who am I?
KUMAR RAJEEV RASTOGI
• Senior Technical Leader at Huawei Technology for almost 7 years.
• Have worked on developing various features for PostgreSQL (for internal
projects) as well as for an in-house DB.
• Active PostgreSQL community member; have contributed many patches.
• Hold around 10 patents in my name in various DB technologies.
• This is my second talk at Indian PUG; the first one was last year.
• Prior to this, worked at Aricent Technology for 3 years.
Blog - rajeevrastogi.blogspot.in
LinkedIn - http://in.linkedin.com/in/kumarrajeevrastogi
Agenda
Introduction
Why Optimizer Hint
Query Hint
Statistics Hint
Data Hint
Drawback
Reference
Introduction
The overall query execution is described below:
Introduction Contd…
1. Parser: Parses the query submitted by the client and validates its
syntax. The output of this step is the “parse tree”.
2. Analyzer: Validates the parse tree semantically.
3. Utility Commands: All DDL and other utility commands are executed
by this sub-module.
4. Optimizer: This is the brain of the SQL execution engine. It finds
the best possible plan for executing the query.
5. Executor: The output of the optimizer is the “Plan Tree”, which is
converted to an Execution Tree in which each node denotes the kind of
operation to perform, such as IndexScan, SeqScan, Agg or Join. Each
node is then executed to yield the final result.
Optimizer Zoomed In
It takes the validated query tree from the analyzer, considers the
possible plans and then selects the best plan for execution. Multiple
plans are possible because there are several different methods of
doing the same operation, some of which are:
a. Two scan algorithms (index scan, sequential scan)
b. Three core join algorithms (nested loops, hash join, merge join)
c. Join order
Factors Used to Decide Best Plan
• The planner chooses between plans based on their estimated cost.
• Some of the parameters used to decide cost are sequential I/O cost,
random I/O cost and CPU cost.
• Estimated I/O is calculated from the number of pages to be scanned.
CPU cost is based on the estimated number of records and the
qualifications to be evaluated.
• Random I/O is (much) more expensive than sequential I/O on modern
hardware.
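To make these cost factors concrete, the sketch below estimates the cost of a sequential scan as page I/O plus per-row CPU work. The four constants are the default values of PostgreSQL's planner settings seq_page_cost, random_page_cost, cpu_tuple_cost and cpu_operator_cost; the function names and the table size are illustrative assumptions, and the real planner's formulas contain more terms.

```python
# Default values of PostgreSQL's planner cost settings.
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025


def seq_scan_cost(pages, rows, quals=1):
    """Simplified sequential-scan cost: I/O for every page plus CPU per row."""
    io_cost = pages * SEQ_PAGE_COST
    cpu_cost = rows * (CPU_TUPLE_COST + quals * CPU_OPERATOR_COST)
    return io_cost + cpu_cost


def random_fetch_cost(pages):
    """I/O cost of fetching the same number of pages in random order."""
    return pages * RANDOM_PAGE_COST


# Hypothetical table: 1,000 pages holding 100,000 rows, one WHERE condition.
scan_cost = seq_scan_cost(pages=1000, rows=100_000, quals=1)
print("sequential scan cost:", scan_cost)
print("random I/O for the same pages:", random_fetch_cost(1000))
```

With the default settings, reading a page at random is costed four times higher than reading it sequentially, which is why an index scan only wins when it touches a small fraction of the table.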
Optimizer Accuracy
The optimizer makes a best effort to produce the best plan for
execution using the parameters mentioned, and in the majority of
cases it does produce the correct and optimal plan.
So, “Why Optimizer Hint”?
Why Optimizer Hint
What is an Optimizer Hint:
As the name suggests, it is a hint to the optimizer to change the
resulting plan. Because it is only a hint, the resulting plan may or
may not follow it.
Why Optimizer Hint:
As a DBA you may know things about your data that the optimizer does
not. In that case the DBA should be able to instruct the optimizer to
choose a certain query execution plan based on some criteria.
Why Optimizer Hint: Scenario-1
Suppose that for a query the possible plans for joining two tables are
merge join and nested loop join, and that merge join is selected as
the winning plan based on the total cost estimate. Now consider a
scenario where we need the first row of the result as early as
possible. In that case the plan with the smallest start-up cost is
more useful, even though its total cost is higher.
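The trade-off in Scenario-1 can be sketched with two candidate plans that each carry a start-up cost and a total cost. All cost numbers here are made up for illustration; only the selection logic matters.

```python
from collections import namedtuple

Plan = namedtuple("Plan", ["name", "startup_cost", "total_cost"])

# Made-up cost numbers: a merge join has to sort its inputs before it can
# emit a row, while a nested loop can emit its first row almost immediately.
plans = [
    Plan("merge join", startup_cost=120.0, total_cost=300.0),
    Plan("nested loop", startup_cost=0.5, total_cost=900.0),
]

# Default behaviour described above: minimise the total cost.
best_for_all_rows = min(plans, key=lambda p: p.total_cost)

# What the DBA wants when only the first row matters: minimise start-up cost.
best_for_first_row = min(plans, key=lambda p: p.startup_cost)

print("cheapest overall:", best_for_all_rows.name)
print("fastest first row:", best_for_first_row.name)
```

A hint is one way for the DBA to tell the optimizer which of the two criteria applies to a given query.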
Why Optimizer Hint: Scenario-2
Consider the following query:
SELECT * FROM TBL1, TBL3, TBL2 WHERE ….
Now let us say the selectivity of each table is as below:
TBL1: 0.5
TBL2: 0.5
TBL3: 0.3
Since the planner currently does not maintain selectivity for joins of
tables or columns, it estimates the selectivity of a join by simply
multiplying the selectivities of the individual tables. So in our
example the join selectivities will be as below:
TBL1, TBL2: 0.5*0.5 = 0.25
TBL1, TBL3: 0.5*0.3 = 0.15
But the DBA knows that the estimate for the join of TBL1 and TBL2 is
not correct, as that join will produce at most one record, whereas the
estimate for the join of TBL1 and TBL3 is correct.
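The arithmetic of Scenario-2 can be reproduced in a few lines. The per-table selectivities come from the example above; the row counts, and the choice to treat join selectivity as a fraction of the cross product, are assumptions of this sketch.

```python
table_selectivity = {"TBL1": 0.5, "TBL2": 0.5, "TBL3": 0.3}


def estimated_join_selectivity(left, right):
    """Planner-style estimate: multiply the individual selectivities."""
    return table_selectivity[left] * table_selectivity[right]


est_12 = estimated_join_selectivity("TBL1", "TBL2")
est_13 = estimated_join_selectivity("TBL1", "TBL3")

# Hypothetical sizes: 1,000 rows per table, so the cross product of
# TBL1 and TBL2 holds 1,000,000 row pairs.
cross_product_rows = 1000 * 1000
estimated_rows_12 = est_12 * cross_product_rows

# What the DBA knows: the TBL1/TBL2 join returns at most one record.
actual_rows_12 = 1

print("estimated selectivity TBL1, TBL2:", est_12)
print("estimated selectivity TBL1, TBL3:", est_13)
print("estimated rows:", estimated_rows_12, "actual rows (at most):", actual_rows_12)
```

An estimate that is off by several orders of magnitude like this can push the planner toward the wrong join order or join algorithm.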
Hint Types
1. Query Hints
2. Statistics Hints
3. Data Hints
Query Hint
This kind of hint forces the optimizer to choose the desired plan for a
specific relation or join of relations. E.g. the DBA can specify:
• Sequential scan path for scanning a relation.
• Index scan path for scanning a relation.
• Merge join path for joining two relations.
• Order in which to evaluate the join of relations.
Query Hint Contd…
Following are the ways to provide a query hint for a particular query:
• Hints embedded in the query as a comment, with syntax such as:
SELECT/UPDATE/DELETE/INSERT /*hint*/ ………………;
• Some DBs provide it as a separate command instead of embedding it
in the query.
• Sometimes it is also given as a property along with the table name
in the query.
Query Hint By Other DBs
Many databases use query hints as optimization hints; some of
them are:
1. Oracle
2. Sybase
3. MySQL
4. EnterpriseDB PostgreSQL Plus
5. Recently we have also implemented query hints at Huawei for an
internal database.
Query Hint By Oracle: Example
Some of the hints used by Oracle in queries are:
1. /*+ FULL(tbl) */ Hint to the optimizer to choose a full (sequential)
scan for table ‘tbl’.
2. /*+ ORDERED */ Hint to the optimizer to join the tables in the order
in which they are given in the FROM clause.
3. /*+ USE_NL(tbl1 tbl2) */ Hint to the optimizer to choose a nested
loop join between tables ‘tbl1’ and ‘tbl2’.
Query Hint By Sybase: Example
Some of the hints used by Sybase in queries are:
1. set forceplan [on|off] When on, hints the optimizer to join the
tables in the order in which they are given in the FROM clause.
2. The index to use for a query can be specified with the (index
index_name) clause in select, update and delete statements.
Query Hint By MySQL: Example
Some of the hints used by MySQL in queries are:
1. /*! STRAIGHT_JOIN */ This hint tells the optimizer to join the tables
in the order in which they are specified in the FROM clause. (The MySQL
hint syntax is similar to Oracle's, except that it uses ‘!’ instead of ‘+’.)
2. USE {INDEX|KEY} (index_list) Gives the optimizer information about
how to choose indexes during query processing.
Query Hint By EDB: Example
Hints used in EDB are similar to those used in Oracle:
1. /*+ FULL(tbl) */ Hint to the optimizer to choose a full (sequential)
scan for table ‘tbl’.
2. /*+ ORDERED */ Hint to the optimizer to join the tables in the order
in which they are given in the FROM clause.
3. /*+ USE_NL(tbl1 tbl2) */ Hint to the optimizer to choose a nested
loop join between tables ‘tbl1’ and ‘tbl2’.
Does PostgreSQL have Query Hints?
NO (or indirectly YES)
PostgreSQL does not support hints directly, but it has many GUC
configuration parameters that can be used to disable a particular plan
type. (Note that here we disable a particular plan type, unlike other
DBs, where a hint is given to choose a particular plan.)
e.g.
enable_indexscan set to off: Forces the optimizer to skip index scans
enable_mergejoin set to off: Forces the optimizer to skip merge joins
Also, this setting is session-wide, not per-query.
Like a hint, the setting can be ignored when it cannot be honoured,
e.g. even with enable_seqscan set to off, the optimizer can still
select a sequential scan if there is no index on that table.
These settings are very useful for a developer tuning the planner or
a DBA tuning an application.
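The reason a disabled plan type can still be chosen is easier to see in a small model. In PostgreSQL releases before version 17, turning an enable_* parameter off does not remove that path; it adds a very large penalty (disable_cost, 1.0e10) to its cost. The sketch below imitates that behaviour; the path names and cost values are illustrative assumptions, not planner output.

```python
DISABLE_COST = 1.0e10  # penalty added to a disabled path (PostgreSQL before v17)


def choose_path(paths, disabled=()):
    """Return the name of the cheapest path after penalising disabled ones."""
    def effective_cost(item):
        name, cost = item
        return cost + DISABLE_COST if name in disabled else cost

    return min(paths.items(), key=effective_cost)[0]


# Hypothetical costs for a table that has an index and one that does not.
with_index = {"seqscan": 2250.0, "indexscan": 4000.0}
without_index = {"seqscan": 2250.0}

default_choice = choose_path(with_index)
forced_choice = choose_path(with_index, disabled={"seqscan"})
no_alternative = choose_path(without_index, disabled={"seqscan"})

print(default_choice)   # seqscan
print(forced_choice)    # indexscan
print(no_alternative)   # seqscan
```

Because the penalty only makes a path expensive rather than impossible, the sequential scan is still selected when it is the only path available.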
Statistics Hint
A statistics hint supplies the optimizer with statistics related to
the query, which it can use to produce a better plan than it would
have otherwise.
Most databases store statistics for a particular column or relation
but do not store statistics for a join or combination of columns or
relations. Instead, these databases simply multiply the statistics of
the individual columns/relations to get the combined statistics, which
may not always be correct.
In such cases statistics hints can be used to provide the statistics
for the join of relations or combination of columns directly.
Statistics Hint: Example
Let's say there is a query:
SELECT * FROM EMPLOYEE WHERE GRADE>5 AND
SALARY > 10000;
Suppose we calculate independent statistics for GRADE and SALARY:
sel(GRADE) = .2 and sel(SALARY) = .2;
then sel(GRADE and SALARY) =
sel(GRADE) * sel(SALARY) = .04.
In practice these two columns will be highly dependent, i.e. if a row
satisfies the first condition, it will also satisfy the second.
In that case sel(GRADE and SALARY) should be .2, not .04. The
current optimizer's estimate is incorrect in this case, and it may
choose a suboptimal plan.
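The gap between the independence assumption and correlated data can be checked on a toy data set. The EMPLOYEE rows below are invented so that salary is fully determined by grade and each condition alone has a selectivity of 0.2, matching the numbers in the example.

```python
# Hypothetical head count per grade; salary is always grade * 2000.
headcount = {1: 160, 2: 160, 3: 160, 4: 160, 5: 160, 6: 100, 7: 100}
employees = [
    {"grade": grade, "salary": grade * 2000}
    for grade, count in headcount.items()
    for _ in range(count)
]


def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for row in rows if predicate(row)) / len(rows)


sel_grade = selectivity(employees, lambda r: r["grade"] > 5)
sel_salary = selectivity(employees, lambda r: r["salary"] > 10000)

# Optimizer-style estimate that treats the two conditions as independent.
independent_estimate = sel_grade * sel_salary

# Selectivity actually observed when both conditions are applied together.
actual = selectivity(employees, lambda r: r["grade"] > 5 and r["salary"] > 10000)

print("sel(GRADE):", sel_grade, "sel(SALARY):", sel_salary)
print("independent estimate:", independent_estimate)
print("actual combined selectivity:", actual)
```

On this data the independent estimate is about 0.04 while the observed value is 0.2, a fivefold underestimate of the number of matching rows.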
Statistics Hint: Example Contd…
The query with a statistics hint will look like:
SELECT /*+ SEL (GRADE and SALARY) AS 0.2 */ * FROM
EMPLOYEE WHERE GRADE>5 AND SALARY > 10000;
Data Hint
This kind of hint provides information about the relationship or
dependency among relations or columns to influence the plan, instead
of directly specifying the desired plan or a selectivity value.
The optimizer can use the dependency information to derive the
actual selectivity.
Data Hint: Example
Let's say there are queries such as:
SELECT * FROM TBL WHERE ID1 = 5 AND ID2 IS NULL;
SELECT * FROM TBL WHERE ID1 = 5 AND ID2 IS NOT NULL;
Now if we specify the dependency as
“If TBL.ID1 = 5 then TBL.ID2 is NULL”
then the optimizer will always consider this dependency pattern, and
the combined statistics for these two columns can be chosen
accordingly.
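One way to picture how such a dependency could change the estimate is sketched below. The function names, the selectivity of ID1 = 5 and the NULL fraction of ID2 are all hypothetical; no existing database is claimed to expose the dependency in this form.

```python
def independent_selectivity(sel_id1_eq_5, id2_null_fraction, want_null):
    """Estimate without the hint: the two conditions are treated as independent."""
    sel_id2 = id2_null_fraction if want_null else 1.0 - id2_null_fraction
    return sel_id1_eq_5 * sel_id2


def dependent_selectivity(sel_id1_eq_5, want_null):
    """Estimate using the dependency 'if ID1 = 5 then ID2 is NULL'."""
    # Every row with ID1 = 5 has ID2 NULL, so the IS NULL condition filters
    # nothing further and the IS NOT NULL condition matches no rows at all.
    return sel_id1_eq_5 if want_null else 0.0


# Hypothetical statistics: 10% of rows have ID1 = 5, 30% have ID2 NULL.
sel_id1 = 0.1
null_fraction = 0.3

without_hint_null = independent_selectivity(sel_id1, null_fraction, want_null=True)
without_hint_not_null = independent_selectivity(sel_id1, null_fraction, want_null=False)
with_hint_null = dependent_selectivity(sel_id1, want_null=True)
with_hint_not_null = dependent_selectivity(sel_id1, want_null=False)

print("IS NULL:    ", without_hint_null, "->", with_hint_null)
print("IS NOT NULL:", without_hint_not_null, "->", with_hint_not_null)
```

With the dependency known, the first query's estimate rises to the full selectivity of ID1 = 5 and the second query's estimate drops to zero.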
Drawback
1. Requires periodic maintenance to verify that the supplied hint is
still giving a positive result.
2. Interference with upgrades: today's helpful hints become
anti-performance after an upgrade.
3. Encourages bad DBA habits: slapping a hint on instead of figuring
out the real issue.
Reference
• http://docs.oracle.com/cd/B19306_01/server.102/b14211/hintsref.htm
• http://manuals.sybase.com/onlinebooks/group-as/asg1250e/ptallbk/@Generic__BookTextView/32117
Optimizer Hints
Optimizer Hints

Weitere ähnliche Inhalte

Was ist angesagt?

Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
PostgresOpen
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
PostgresOpen
 
twp-integrating-hadoop-data-with-or-130063
twp-integrating-hadoop-data-with-or-130063twp-integrating-hadoop-data-with-or-130063
twp-integrating-hadoop-data-with-or-130063
Madhusudan Anand
 
New features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizersNew features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizers
Sergey Petrunya
 

Was ist angesagt? (20)

[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
 
PostgreSQL query planner's internals
PostgreSQL query planner's internalsPostgreSQL query planner's internals
PostgreSQL query planner's internals
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZE
 
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
 
Performance improvements in PostgreSQL 9.5 and beyond
Performance improvements in PostgreSQL 9.5 and beyondPerformance improvements in PostgreSQL 9.5 and beyond
Performance improvements in PostgreSQL 9.5 and beyond
 
PostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / ShardingPostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / Sharding
 
twp-integrating-hadoop-data-with-or-130063
twp-integrating-hadoop-data-with-or-130063twp-integrating-hadoop-data-with-or-130063
twp-integrating-hadoop-data-with-or-130063
 
How the Postgres Query Optimizer Works
How the Postgres Query Optimizer WorksHow the Postgres Query Optimizer Works
How the Postgres Query Optimizer Works
 
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory optionStar Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive Plans
 
PgconfSV compression
PgconfSV compressionPgconfSV compression
PgconfSV compression
 
Postgresql Database Administration Basic - Day1
Postgresql  Database Administration Basic  - Day1Postgresql  Database Administration Basic  - Day1
Postgresql Database Administration Basic - Day1
 
New features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizersNew features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizers
 
Introduction to MySQL InnoDB Cluster
Introduction to MySQL InnoDB ClusterIntroduction to MySQL InnoDB Cluster
Introduction to MySQL InnoDB Cluster
 
Optimizer Cost Model MySQL 5.7
Optimizer Cost Model MySQL 5.7Optimizer Cost Model MySQL 5.7
Optimizer Cost Model MySQL 5.7
 
MySQL database replication
MySQL database replicationMySQL database replication
MySQL database replication
 
How to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'rollHow to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'roll
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]s
 
Mysql database basic user guide
Mysql database basic user guideMysql database basic user guide
Mysql database basic user guide
 

Andere mochten auch

Why Content Marketing Fails
Why Content Marketing FailsWhy Content Marketing Fails
Why Content Marketing Fails
Rand Fishkin
 
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
SlideShare
 

Andere mochten auch (20)

Attacking Web Proxies
Attacking Web ProxiesAttacking Web Proxies
Attacking Web Proxies
 
Introduction to cocoa sql mapper
Introduction to cocoa sql mapperIntroduction to cocoa sql mapper
Introduction to cocoa sql mapper
 
Building Machine Learning Pipelines
Building Machine Learning PipelinesBuilding Machine Learning Pipelines
Building Machine Learning Pipelines
 
Building Spark as Service in Cloud
Building Spark as Service in CloudBuilding Spark as Service in Cloud
Building Spark as Service in Cloud
 
Case Studies on PostgreSQL
Case Studies on PostgreSQLCase Studies on PostgreSQL
Case Studies on PostgreSQL
 
Cloud Computing (CCSME 2015 talk) - mypapit
Cloud Computing (CCSME 2015 talk) - mypapitCloud Computing (CCSME 2015 talk) - mypapit
Cloud Computing (CCSME 2015 talk) - mypapit
 
8 Ways a Digital Media Platform is More Powerful than “Marketing”
8 Ways a Digital Media Platform is More Powerful than “Marketing”8 Ways a Digital Media Platform is More Powerful than “Marketing”
8 Ways a Digital Media Platform is More Powerful than “Marketing”
 
How Often Should You Post to Facebook and Twitter
How Often Should You Post to Facebook and TwitterHow Often Should You Post to Facebook and Twitter
How Often Should You Post to Facebook and Twitter
 
Slides That Rock
Slides That RockSlides That Rock
Slides That Rock
 
Why Content Marketing Fails
Why Content Marketing FailsWhy Content Marketing Fails
Why Content Marketing Fails
 
What Makes Great Infographics
What Makes Great InfographicsWhat Makes Great Infographics
What Makes Great Infographics
 
Masters of SlideShare
Masters of SlideShareMasters of SlideShare
Masters of SlideShare
 
STOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
STOP! VIEW THIS! 10-Step Checklist When Uploading to SlideshareSTOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
STOP! VIEW THIS! 10-Step Checklist When Uploading to Slideshare
 
You Suck At PowerPoint!
You Suck At PowerPoint!You Suck At PowerPoint!
You Suck At PowerPoint!
 
10 Ways to Win at SlideShare SEO & Presentation Optimization
10 Ways to Win at SlideShare SEO & Presentation Optimization10 Ways to Win at SlideShare SEO & Presentation Optimization
10 Ways to Win at SlideShare SEO & Presentation Optimization
 
How To Get More From SlideShare - Super-Simple Tips For Content Marketing
How To Get More From SlideShare - Super-Simple Tips For Content MarketingHow To Get More From SlideShare - Super-Simple Tips For Content Marketing
How To Get More From SlideShare - Super-Simple Tips For Content Marketing
 
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
A Guide to SlideShare Analytics - Excerpts from Hubspot's Step by Step Guide ...
 
2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare
 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShare
 
How to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksHow to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & Tricks
 

Ähnlich wie Optimizer Hints

Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
paulguerin
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008
paulguerin
 
Sydney Oracle Meetup - indexes
Sydney Oracle Meetup - indexesSydney Oracle Meetup - indexes
Sydney Oracle Meetup - indexes
paulguerin
 
127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collections127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collections
Amit Sharma
 

Ähnlich wie Optimizer Hints (20)

Processes in Query Optimization in (ABMS) Advanced Database Management Systems
Processes in Query Optimization in (ABMS) Advanced Database Management Systems Processes in Query Optimization in (ABMS) Advanced Database Management Systems
Processes in Query Optimization in (ABMS) Advanced Database Management Systems
 
Teradata sql-tuning-top-10
Teradata sql-tuning-top-10Teradata sql-tuning-top-10
Teradata sql-tuning-top-10
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
Twp optimizer-with-oracledb-12c-1963236
Twp optimizer-with-oracledb-12c-1963236Twp optimizer-with-oracledb-12c-1963236
Twp optimizer-with-oracledb-12c-1963236
 
Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql Tuning
 
Oracle-Whitepaper-Optimizer-with-Oracle-Database-12c.pdf
Oracle-Whitepaper-Optimizer-with-Oracle-Database-12c.pdfOracle-Whitepaper-Optimizer-with-Oracle-Database-12c.pdf
Oracle-Whitepaper-Optimizer-with-Oracle-Database-12c.pdf
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query Optimizations
 
SQL Tunning
SQL TunningSQL Tunning
SQL Tunning
 
Part4 Influencing Execution Plans with Optimizer Hints
Part4 Influencing Execution Plans with Optimizer HintsPart4 Influencing Execution Plans with Optimizer Hints
Part4 Influencing Execution Plans with Optimizer Hints
 
Application sql issues_and_tuning
Application sql issues_and_tuningApplication sql issues_and_tuning
Application sql issues_and_tuning
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
 
Instant DBMS Homework Help
Instant DBMS Homework HelpInstant DBMS Homework Help
Instant DBMS Homework Help
 
Chapter16
Chapter16Chapter16
Chapter16
 
Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
 
Dbms important questions and answers
Dbms important questions and answersDbms important questions and answers
Dbms important questions and answers
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008
 
Sydney Oracle Meetup - indexes
Sydney Oracle Meetup - indexesSydney Oracle Meetup - indexes
Sydney Oracle Meetup - indexes
 
127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collections127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collections
 
Cost Based Optimizer - Part 1 of 2
Cost Based Optimizer - Part 1 of 2Cost Based Optimizer - Part 1 of 2
Cost Based Optimizer - Part 1 of 2
 

Mehr von InMobi Technology

Mehr von InMobi Technology (20)

Ensemble Methods for Algorithmic Trading
Ensemble Methods for Algorithmic TradingEnsemble Methods for Algorithmic Trading
Ensemble Methods for Algorithmic Trading
 
Backbone & Graphs
Backbone & GraphsBackbone & Graphs
Backbone & Graphs
 
24/7 Monitoring and Alerting of PostgreSQL
24/7 Monitoring and Alerting of PostgreSQL24/7 Monitoring and Alerting of PostgreSQL
24/7 Monitoring and Alerting of PostgreSQL
 
Reflective and Stored XSS- Cross Site Scripting
Reflective and Stored XSS- Cross Site ScriptingReflective and Stored XSS- Cross Site Scripting
Reflective and Stored XSS- Cross Site Scripting
 
Introduction to Threat Modeling
Introduction to Threat ModelingIntroduction to Threat Modeling
Introduction to Threat Modeling
 
HTTP Basics Demo
HTTP Basics DemoHTTP Basics Demo
HTTP Basics Demo
 
The Synapse IoT Stack: Technology Trends in IOT and Big Data
The Synapse IoT Stack: Technology Trends in IOT and Big DataThe Synapse IoT Stack: Technology Trends in IOT and Big Data
The Synapse IoT Stack: Technology Trends in IOT and Big Data
 
What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014
 
Security News Bytes Null Dec Meet Bangalore
Security News Bytes Null Dec Meet BangaloreSecurity News Bytes Null Dec Meet Bangalore
Security News Bytes Null Dec Meet Bangalore
 
Matriux blue
Matriux blueMatriux blue
Matriux blue
 
PCI DSS v3 - Protecting Cardholder data
PCI DSS v3 - Protecting Cardholder dataPCI DSS v3 - Protecting Cardholder data
PCI DSS v3 - Protecting Cardholder data
 
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformRunning Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale Platform
 
Shodan- That Device Search Engine
Shodan- That Device Search EngineShodan- That Device Search Engine
Shodan- That Device Search Engine
 
Big Data BI Simplified
Big Data BI SimplifiedBig Data BI Simplified
Big Data BI Simplified
 
Massively Parallel Processing with Procedural Python - Pivotal HAWQ
Massively Parallel Processing with Procedural Python - Pivotal HAWQMassively Parallel Processing with Procedural Python - Pivotal HAWQ
Massively Parallel Processing with Procedural Python - Pivotal HAWQ
 
Tez Data Processing over Yarn
Tez Data Processing over YarnTez Data Processing over Yarn
Tez Data Processing over Yarn
 
Building Audience Analytics Platform
Building Audience Analytics PlatformBuilding Audience Analytics Platform
Building Audience Analytics Platform
 
Big Data and User Segmentation in Mobile Context
Big Data and User Segmentation in Mobile ContextBig Data and User Segmentation in Mobile Context
Big Data and User Segmentation in Mobile Context
 
Freedom Hack Report 2014
Freedom Hack Report 2014Freedom Hack Report 2014
Freedom Hack Report 2014
 
Hadoop fundamentals
Hadoop fundamentalsHadoop fundamentals
Hadoop fundamentals
 

Kürzlich hochgeladen

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Kürzlich hochgeladen (20)

AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Optimizer Hints

  • 1. Indian PUG 2015 11th April Kumar Rajeev Rastogi Prasanna Optimizer Hint
  • 2. KUMAR RAJEEV RASTOGI  Senior Technical Leader at Huawei Technology for almost 7 years  Have worked to develop various features on PostgreSQL (for internal projects) as well as on In-House DB.  Active PostgreSQL community members, have contributed many patches.  Holds around 10 patents in my name in various DB technologies.  This is my second talk in Indian PUG, first one was last year.  Prior to this, worked at Aricent Technology for 3 years. Blog - rajeevrastogi.blogspot.in LinkedIn - http://in.linkedin.com/in/kumarrajeevrastogi Who am I?
  • 3. Introduction Why Optimizer Hint Query Hint Statistics Hint Data Hint Drawback Reference Agenda
  • 4. Introduction The overall query execution is described below:
  • 5. Introduction Contd… 1. Parser: Parses the query submitted by the client to do the syntactical validation. Output of this step is “parsed tree”. 2. Analyzer: It validates the parsed tree semantically. 3. Utility Commands: All of the DDL and other utility commands then gets executed by this sub-module. 4. Optimizer: This is almost like brain of complete SQL execution engine. It find the best possible plan for its execution. 5. Executor: The output from optimizer is “Plan Tree”, which then gets converted to Execution Tree, where each node of the tree denotes the kind of operation it has to do like IndexScan, SeqScan, Agg, Join etc. Then each node gets executed to yield the final result.
  • 6. Optimizer Zoomed In It takes the validated query tree from analyzer, looks for the all the possible plan and then selects the best plan for execution. There are multiple possible plan because there are several different methods of doing the same operation, some of them are: a. Two scan algorithms (index scan, sequential scan) b. Three core join algorithms (nested loops, hash join, merge join) c. Join Order
  • 7. Factors Used to Decide Best Plan: The planner chooses between plans based on their estimated cost. Some of the parameters used to decide cost are sequential I/O cost, random I/O cost and CPU cost. Estimated I/O is calculated from the number of pages to be scanned. CPU cost is based on the estimated number of records and the qualifications to be evaluated. Random I/O is (much) more expensive than sequential I/O on modern hardware.
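
The cost factors on this slide can be sketched as a toy model. This is a minimal sketch, not the planner's real code: the four constants are the documented PostgreSQL defaults for `seq_page_cost`, `random_page_cost`, `cpu_tuple_cost` and `cpu_operator_cost`, while the function names and the table sizes in the usage note are hypothetical and chosen only for illustration.

```python
# Toy cost model: I/O cost from pages touched, CPU cost from rows processed.
# Constants mirror PostgreSQL's default planner cost settings.
SEQ_PAGE_COST = 1.0        # cost of one sequentially read page
RANDOM_PAGE_COST = 4.0     # cost of one randomly read page
CPU_TUPLE_COST = 0.01      # cost of processing one row
CPU_OPERATOR_COST = 0.0025 # cost of evaluating one qualification on one row


def cpu_cost(rows, quals=1):
    """CPU cost grows with the row count and the number of qualifications."""
    return rows * (CPU_TUPLE_COST + quals * CPU_OPERATOR_COST)


def seq_scan_cost(pages, rows, quals=1):
    """Hypothetical helper: read every page sequentially, check every row."""
    return pages * SEQ_PAGE_COST + cpu_cost(rows, quals)


def random_access_cost(pages_fetched, rows_fetched, quals=1):
    """Hypothetical helper: fetch only some pages, but with random I/O."""
    return pages_fetched * RANDOM_PAGE_COST + cpu_cost(rows_fetched, quals)
```

For an assumed table of 1000 pages and 100000 rows, `seq_scan_cost(1000, 100000)` comes to roughly 2250, while `random_access_cost(10, 10)` for a highly selective lookup is roughly 40. Touching every page randomly, `random_access_cost(1000, 100000)`, costs roughly 5250, which is why random access only wins when few pages are needed.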
  • 8. Optimizer Accuracy: It is believed that the optimizer makes its best effort to produce the best plan for execution using the parameters mentioned, and in the majority of cases it does give the correct and optimal plan. So “Why Optimizer Hint”?
  • 9. Why Optimizer Hint: What is an Optimizer Hint: As the name suggests, it is a hint to the optimizer to change the resultant plan as per the hint. As it is just a hint, the resultant plan may or may not follow it. Why Optimizer Hint: As a DBA you might know information about your data that the optimizer does not know; in this case the DBA should be able to instruct the optimizer to choose a certain query execution plan based on some criteria.
  • 10. Why Optimizer Hint: Scenario-1: Consider a query for which the possible plans for joining two tables were merge join and nested loop join, and merge join was selected as the winning plan based on total cost estimation. Now consider a scenario where we need the first result row as early as possible; in that case the plan with the smallest start-up cost is more useful even though its total cost is higher.
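
Scenario-1 can be illustrated with a small sketch. All cost figures and row counts here are hypothetical, and the linear interpolation between start-up and total cost is a simplifying assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    startup_cost: float  # cost paid before the first row can be returned
    total_cost: float    # cost to return every row
    rows: int            # estimated number of result rows


def cost_to_fetch(plan, k):
    """Estimated cost to return the first k rows (linear interpolation)."""
    fraction = min(k, plan.rows) / plan.rows
    return plan.startup_cost + (plan.total_cost - plan.startup_cost) * fraction


# Hypothetical numbers: merge join must sort first, nested loop starts at once.
merge_join = Plan("merge join", startup_cost=500.0, total_cost=900.0, rows=10000)
nested_loop = Plan("nested loop", startup_cost=0.5, total_cost=2000.0, rows=10000)
plans = [merge_join, nested_loop]

best_for_all_rows = min(plans, key=lambda p: p.total_cost)
best_for_first_row = min(plans, key=lambda p: cost_to_fetch(p, 1))
```

With these assumed figures `best_for_all_rows` is the merge join and `best_for_first_row` is the nested loop, which is the mismatch a hint would let the DBA resolve.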
  • 11. Why Optimizer Hint: Scenario-2: Consider the following query: SELECT * FROM TBL1, TBL3, TBL2 WHERE …. Now let us say the selectivity of each table is as below: TBL1: 0.5, TBL2: 0.5, TBL3: 0.3. Since the planner currently does not maintain selectivity for joins of tables or columns, to estimate the selectivity of a join it simply multiplies the selectivities of the individual tables. So in our example the selectivities will be: TBL1, TBL2: 0.5*0.5 = 0.25; TBL1, TBL3: 0.5*0.3 = 0.15. But the DBA knows that the estimated selectivity of the join of TBL1 and TBL2 is not correct, as that join will result in at most one record, whereas the estimated selectivity of the join of TBL1 and TBL3 is correct.
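
The effect of Scenario-2 on the choice of which pair to join first can be sketched as follows. The selectivities are the ones from the slide; the function names and the override value of 1e-6 (standing in for "at most one record") are assumptions for illustration only.

```python
# Per-table selectivities taken from the slide.
table_sel = {"TBL1": 0.5, "TBL2": 0.5, "TBL3": 0.3}


def estimated_join_sel(a, b, overrides=None):
    """Multiply individual selectivities unless a DBA-supplied value exists."""
    key = frozenset((a, b))
    if overrides and key in overrides:
        return overrides[key]
    return table_sel[a] * table_sel[b]


def most_selective_pair(candidates, overrides=None):
    """Pick the pair whose join is estimated to produce the fewest rows."""
    return min(candidates, key=lambda p: estimated_join_sel(p[0], p[1], overrides))


candidates = [("TBL1", "TBL2"), ("TBL1", "TBL3")]

# Without extra knowledge, TBL1-TBL3 (0.15) looks better than TBL1-TBL2 (0.25).
default_pick = most_selective_pair(candidates)

# Hypothetical DBA knowledge: TBL1-TBL2 yields at most one record.
dba_overrides = {frozenset(("TBL1", "TBL2")): 1e-6}
hinted_pick = most_selective_pair(candidates, dba_overrides)
```

Here `default_pick` is the TBL1, TBL3 pair, while `hinted_pick` switches to TBL1, TBL2 once the DBA's knowledge is supplied.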
  • 12. Hint Types: 1. Query Hints 2. Statistics Hints 3. Data Hints
  • 13. Query Hint: This kind of hint forces the optimizer to choose the desired plan for a specific relation or join of relations. E.g. the DBA can specify: a sequential scan path for scanning a relation; an index scan path for scanning a relation; a merge join path for joining two relations; the order in which to evaluate the join of relations.
  • 14. Query Hint Contd…: The following are ways to provide a query hint for a particular query: hints embedded in the query in comment format, with syntax SELECT/UPDATE/DELETE/INSERT /*hint*/ ………………; some DBs provide it as a separate command instead of embedding it in the query; sometimes it is also given as a property along with the table name in the query.
  • 15. Query Hint By Other DBs: Many databases use query hints as optimization hints, some of them are: 1. Oracle 2. Sybase 3. MySQL 4. EnterpriseDB Postgres Plus 5. Recently we have also implemented query hints at Huawei for an internal database.
  • 16. Query Hint By Oracle: Example: Some of the hints used by Oracle in queries are: 1. /*+ FULL(tbl) */ Hint to the optimizer to choose a sequential scan for table ‘tbl’. 2. /*+ ORDERED */ Hint to the optimizer to join the tables in the same order as they are given in the FROM clause. 3. /*+ USE_NL(tbl1 tbl2) */ Hint to the optimizer to choose a nested loop join between tables ‘tbl1’ and ‘tbl2’.
  • 17. Query Hint By Sybase: Example: Some of the hints used by Sybase in queries are: 1. set forceplan [on|off] When on, hints the optimizer to join the tables in the same order as they are given in the FROM clause. 2. The index to use for a query can be specified with the (index index_name) clause in select, update, and delete statements.
  • 18. Query Hint By MySQL: Example: Some of the hints used by MySQL in queries are: 1. /*! STRAIGHT_JOIN */ This hint tells the optimizer to join the tables in the order in which they are specified in the FROM clause. (The MySQL hint is similar to Oracle's except that it uses ‘!’ instead of ‘+’.) 2. USE {INDEX|KEY} (index_list) Gives the optimizer information about how to choose indexes during query processing.
  • 19. Query Hint By EDB: Example: The hints used in EDB are similar to those used in Oracle: 1. /*+ FULL(tbl) */ Hint to the optimizer to choose a sequential scan for table ‘tbl’. 2. /*+ ORDERED */ Hint to the optimizer to join the tables in the same order as they are given in the FROM clause. 3. /*+ USE_NL(tbl1 tbl2) */ Hint to the optimizer to choose a nested loop join between tables ‘tbl1’ and ‘tbl2’.
  • 20. Does PostgreSQL have query hints? NO (or indirectly YES): PostgreSQL does not support hints directly, but there are many GUC configuration parameters with which we can disable a particular plan type (notice that here we configure to disable a particular plan, unlike other DBs, where a hint is provided to choose a particular plan). E.g. enable_indexscan set to off: forces the optimizer to skip index scans; enable_mergejoin set to off: forces the optimizer to skip merge joins. Also, this setting is session-wide, not per-query. Similar to a hint, this setting can also be ignored if it cannot be honored, e.g. even with enable_seqscan set to off the planner can still select a sequential scan for a table if there is no index on that table. These settings are very useful for a developer tuning the planner or a DBA tuning an application.
  • 21. Statistics Hint: A statistics hint is used to provide any kind of statistics related to the query that the optimizer can use to yield an even better plan than it would have produced otherwise. Most databases store statistics for a particular column or relation but don’t store statistics for joins of columns or relations. Instead, these databases just multiply the statistics of the individual columns/relations to get the statistics of the join, which may not always be correct. In such cases, statistics-based hints can be used to provide the statistics of the join of relations or columns directly.
  • 22. Statistics Hint: Example: Let's say there is a query: SELECT * FROM EMPLOYEE WHERE GRADE>5 AND SALARY > 10000; Suppose we calculate independent statistics for GRADE and SALARY, with sel(GRADE) = .2 and sel(SALARY) = .2; then sel(GRADE and SALARY) = sel(GRADE) * sel(SALARY) = .04. In almost all practical cases these two conditions will be highly dependent, i.e. if the first column satisfies its condition, the second column will too. In that case sel(GRADE and SALARY) should be .2, not .04. But the current optimizer will be incorrect in this case and may give the wrong plan.
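
The gap between the independence estimate and the real selectivity can be checked on a small synthetic table. This is a minimal sketch: the 100 employee rows and the rule that salary is 2000 times the grade are invented purely so that the two columns are fully dependent and each condition has selectivity 0.2.

```python
# Synthetic EMPLOYEE data: 16 rows for each grade 1..5, 4 rows for each grade 6..10.
# Salary is derived from grade, so the two columns are fully dependent.
employees = []
for grade in range(1, 11):
    count = 16 if grade <= 5 else 4
    employees += [{"grade": grade, "salary": 2000 * grade}] * count


def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if predicate(r)) / len(rows)


sel_grade = selectivity(employees, lambda r: r["grade"] > 5)
sel_salary = selectivity(employees, lambda r: r["salary"] > 10000)

# What an optimizer assuming independent columns would estimate.
independent_estimate = sel_grade * sel_salary

# What the data actually contains.
actual = selectivity(employees, lambda r: r["grade"] > 5 and r["salary"] > 10000)

estimated_rows = independent_estimate * len(employees)
actual_rows = actual * len(employees)
```

On this data both single-column selectivities are 0.2, the independence estimate is about 0.04 (about 4 rows), and the actual combined selectivity is 0.2 (20 rows), a fivefold underestimate.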
  • 23. Statistics Hint: Example Contd…: The query with a statistics hint will look like: SELECT /*+ SEL (GRADE and SALARY) AS 0.2 */ * FROM EMPLOYEE WHERE GRADE>5 AND SALARY > 10000;
  • 24. Data Hint: This kind of hint provides information about the relationship/dependency among relations or columns to influence the plan, instead of directly specifying the desired plan or a direct selectivity value. The optimizer can use the dependency information to derive the actual selectivity.
  • 25. Data Hint: Example: Let's say there are queries such as: SELECT * FROM TBL WHERE ID1 = 5 AND ID2 IS NULL; SELECT * FROM TBL WHERE ID1 = 5 AND ID2 IS NOT NULL; Now if we specify the dependency as “If TBL.ID1 = 5 then TBL.ID2 is NULL”, the optimizer will always consider this dependency pattern, and the combined statistics for these two columns can be chosen accordingly.
  • 26. Drawback: 1. Requires periodic maintenance to verify that the supplied hint is still giving a positive result. 2. Interference with upgrades: today's helpful hints become anti-performance after an upgrade. 3. Encourages bad DBA habits: slapping a hint on instead of figuring out the real issue.