DB2 SQL Tuning and BMC Catalog Manager

Overview
Understanding DB2 Optimizer
 SQL Coding Strategies & Guidelines
 Filter Factor
 Stage 1 & Stage 2 Predicates
 Explain table
 How to interpret the Explain Tables
Using Monitoring Tools to understand the performance of SQLs
 BMC Apptune
 BMC SQL Explorer

SQL Coding Strategies & Guidelines
[Diagram: SQL -> DB2 Optimizer (cost-based; uses query cost formulas and the DB2 Catalog) -> optimized access path]

Determines database navigation

Parses SQL statements for tables and columns which must be accessed

Queries statistics from DB2 Catalog (populated by RUNSTATS utility)

Determines least expensive access path

Checks Authorization
 The DB2 Optimizer is Cost Based and chooses the least expensive access path

 Avoid unnecessary execution of SQL
 Accomplish as much as possible with a single call, to minimize table access.
 Limit the data selected (rows and columns) in the SQL itself rather than filtering in the application program.
 Code predicates on indexable columns wherever possible.
 Use equivalent data types for comparison. This avoids the data type conversion overhead.
 JOIN tables on indexed columns.
 Avoid Cartesian products.
 The DISTINCT, ORDER BY, GROUP BY and UNION clauses can each involve a SORT operation. Use these clauses only if absolutely necessary.

Cursor Usage Tips

Use Singleton SELECT statements, if you need to retrieve one row only. This
gives a far better performance than cursors.
SELECT …
INTO :<host variables>

Cursors should be used when you have more than one row to be retrieved.
Cursors have the overhead of OPEN, FETCH & CLOSE.

To update rows using a Cursor, use the FOR UPDATE OF clause.

Use FOR FETCH ONLY clause when the cursor is used for data retrieval only.
FOR READ ONLY clause provides the same functionality and is ODBC
compliant.

Use the WITH HOLD clause if you don’t want DB2 to automatically close the
cursor when the application issues a COMMIT statement.
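
The cursor tips above can be sketched as follows. This is an illustration only: the table EMP, its columns and the host variables (:HVEMPNO, :HVDEPT and so on) are hypothetical names, and in a real program each statement would sit inside the host language's embedded SQL delimiters.

```sql
-- Hypothetical table: EMP(EMPNO, LASTNAME, WORKDEPT, SALARY)

-- Exactly one row expected: singleton SELECT, no OPEN/FETCH/CLOSE overhead
SELECT LASTNAME, SALARY
  INTO :HVLASTNAME, :HVSALARY
  FROM EMP
 WHERE EMPNO = :HVEMPNO;

-- Many rows, retrieval only: read-only cursor that stays open across COMMIT
DECLARE C1 CURSOR WITH HOLD FOR
  SELECT EMPNO, LASTNAME
    FROM EMP
   WHERE WORKDEPT = :HVDEPT
  FOR FETCH ONLY;

-- Many rows, some to be updated through the cursor
DECLARE C2 CURSOR FOR
  SELECT EMPNO, SALARY
    FROM EMP
   WHERE WORKDEPT = :HVDEPT
  FOR UPDATE OF SALARY;
```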

Static vs. Dynamic SQL

The access paths for dynamic SQL are determined at run time, which results in additional overhead. Also, users need to have direct access to the tables.

The access paths for static SQL are determined at bind time and reused at run time. Users need only EXECUTE authority on the plan.

UNION and UNION ALL

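A predicate that combines conditions on different columns with OR usually cannot be satisfied by a single matching index scan. Consider rewriting the query as the union of two SELECT statements, making index access possible for each one.

UNION ALL keeps duplicates, and hence does not need the SORT that UNION uses to remove them.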
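
A minimal sketch of the OR-to-UNION rewrite, assuming a hypothetical TABLEA with separate indexes on COL1 and COL2. The UNION form matches the OR form only when the selected columns identify a row uniquely, and the UNION ALL form additionally assumes COL1 is defined NOT NULL.

```sql
-- Original: one OR predicate spanning two columns
SELECT COL1, COL2, COL3
  FROM TABLEA
 WHERE COL1 = :HV1
    OR COL2 = :HV2;

-- Rewrite with UNION: each branch can use its own index;
-- rows that satisfy both branches are de-duplicated by a sort
SELECT COL1, COL2, COL3 FROM TABLEA WHERE COL1 = :HV1
UNION
SELECT COL1, COL2, COL3 FROM TABLEA WHERE COL2 = :HV2;

-- Rewrite with UNION ALL: no de-duplication sort, so the second
-- branch must itself exclude rows already returned by the first
-- (assumes COL1 is NOT NULL)
SELECT COL1, COL2, COL3 FROM TABLEA WHERE COL1 = :HV1
UNION ALL
SELECT COL1, COL2, COL3 FROM TABLEA WHERE COL2 = :HV2 AND COL1 <> :HV1;
```

Whether either rewrite is actually cheaper should be confirmed with EXPLAIN.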

The BETWEEN clause

BETWEEN is usually more efficient than using the <= and >= operators, except when comparing a host variable to two columns.

Stage 2: WHERE :hostvar BETWEEN col1 AND col2

Stage 1: WHERE col1 <= :hostvar AND col2 >= :hostvar

 Use IN Instead of LIKE

If you know that only a certain number of values exist and can be put in a list
 Use IN or BETWEEN
 IN ('Value1', 'Value2', 'Value3')
 BETWEEN :valuelow AND :valuehigh
 Rather than:
 LIKE 'Value_'

 Use LIKE With Care

Avoid the % or the _ at the beginning because it prevents DB2 from using
a matching index and may cause a scan

Use the % or the _ at the end to encourage index usage
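
As an illustration of the LIKE guidance, assuming a hypothetical TABLEA with an index whose leading column is COL1:

```sql
-- Leading wildcard: no starting point in the index, so no matching index scan
SELECT COL1 FROM TABLEA WHERE COL1 LIKE '%SMITH';

-- Trailing wildcard: the fixed prefix can be used for a matching index scan
SELECT COL1 FROM TABLEA WHERE COL1 LIKE 'SMITH%';

-- Small, known set of values: an IN list instead of LIKE 'VALUE_'
SELECT COL1 FROM TABLEA WHERE COL1 IN ('VALUE1', 'VALUE2', 'VALUE3');
```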

Use NOT operator with care

Predicates formed using NOT (except NOT EXISTS) are Stage 1, but are not
indexable.

For subqueries that use negation logic:
• Use NOT EXISTS instead of NOT IN
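
A sketch of the two negation forms, using hypothetical tables DEPT(DEPTNO, DEPTNAME) and EMP(EMPNO, WORKDEPT). Note that the two forms return the same rows only if EMP.WORKDEPT contains no NULLs; a single NULL makes the NOT IN form return nothing.

```sql
-- NOT IN form: departments with no employees
SELECT D.DEPTNO, D.DEPTNAME
  FROM DEPT D
 WHERE D.DEPTNO NOT IN (SELECT E.WORKDEPT FROM EMP E);

-- NOT EXISTS form: correlated on the join column
SELECT D.DEPTNO, D.DEPTNAME
  FROM DEPT D
 WHERE NOT EXISTS (SELECT 1
                     FROM EMP E
                    WHERE E.WORKDEPT = D.DEPTNO);
```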

Code the Most Restrictive Predicate First

After the indexable predicates, place first the predicate that will eliminate the greatest number of rows.

Avoid Arithmetic in Predicates

An index is not used for a column when the column appears in an arithmetic expression.

Used at Stage 1 but not indexable
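
One common way to follow this guideline is to move the arithmetic to the other side of the comparison so that the column stands alone. A minimal sketch, assuming a hypothetical EMP table with an index on SALARY and a host variable :HVLIMIT:

```sql
-- Arithmetic on the column: the index on SALARY cannot be matched
SELECT EMPNO
  FROM EMP
 WHERE SALARY + 1000 > :HVLIMIT;

-- Same condition with the arithmetic moved to the host-variable side
SELECT EMPNO
  FROM EMP
 WHERE SALARY > :HVLIMIT - 1000;
```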

Nested loop join is efficient when

The outer table is small. Predicates with a small filter factor reduce the number of qualifying rows in the outer table.

The number of data pages accessed in the inner table is also small.

A highly clustered index is available on the join columns of the inner table.

This join method is efficient when filtering for both tables (outer and inner) is high.

This is the most common join method.

Merge scan join is used when:

The numbers of qualifying rows in the inner and outer tables are large and the join predicates do not provide much filtering.

The tables are large and have no indexes with matching columns.

Hybrid join is used when:

A non-clustered index is available on the join column of the inner table and there are duplicate qualifying rows in the outer table.

Join Types & Join Predicate Considerations

Provide accurate JOIN predicates

Avoid JOIN without a predicate (Cartesian Join)

Join ON indexed columns

Use Joins over sub-queries

When the results of a join must be sorted -
 Limiting the ORDER BY to columns of a single table can sometimes avoid a sort
 Specifying columns from multiple tables always involves a sort
 Favor coding LEFT OUTER joins over RIGHT OUTER joins, as DB2 always converts RIGHT joins to LEFT joins before executing them.

Sub-Query Guidelines
– If there are efficient indexes available on the tables in the subquery, then a
correlated subquery is likely to be the most efficient kind of subquery.
– If there are no efficient indexes available on the tables in the subquery, then
a non-correlated subquery would likely perform better.
– If there are multiple subqueries in any parent query, make sure that the
subqueries are ordered in the most efficient manner.
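
The two subquery styles can be sketched as follows, using hypothetical tables EMP(EMPNO, WORKDEPT) and DEPT(DEPTNO, LOCATION). Which form performs better depends on the available indexes, as described above, and should be checked with EXPLAIN.

```sql
-- Non-correlated: the subquery is evaluated once, independent of the outer row
SELECT E.EMPNO
  FROM EMP E
 WHERE E.WORKDEPT IN (SELECT D.DEPTNO
                        FROM DEPT D
                       WHERE D.LOCATION = :HVLOC);

-- Correlated: the subquery refers to the outer row and benefits from
-- an efficient index on DEPT (for example on DEPTNO)
SELECT E.EMPNO
  FROM EMP E
 WHERE EXISTS (SELECT 1
                 FROM DEPT D
                WHERE D.DEPTNO = E.WORKDEPT
                  AND D.LOCATION = :HVLOC);
```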

 Techniques for Performance Improvement

Use OPTIMIZE FOR n ROWS
 DB2 assumes that only the stated number of rows will be retrieved by the query when choosing the access path. It is basically like giving a hint to the DB2 Optimizer.
 This does not stop the user from accessing the entire result set.
 This is not useful when DB2 has to gather the whole result set before returning the first n rows.
 With this clause, DB2 optimizes the query for quicker response.
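
A minimal sketch of the clause on a cursor. The table EMP, the host variable :HVDEPT and the assumed index on (WORKDEPT, LASTNAME) are illustrative; with such an index the ORDER BY might be satisfied without a sort, which is the case where the clause can help.

```sql
-- The application typically displays only the first screen of rows
DECLARE C1 CURSOR FOR
  SELECT EMPNO, LASTNAME, SALARY
    FROM EMP
   WHERE WORKDEPT = :HVDEPT
   ORDER BY LASTNAME
  OPTIMIZE FOR 10 ROWS;
```

The program can still FETCH beyond the tenth row; only the access path choice is influenced.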

Updating catalog tables
 If RUNSTATS is too costly or cannot be executed, the catalog statistics should be updated manually.
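
A manual statistics update might look like the sketch below. The schema MYSCHEMA, the table TABLEA, the column COL1 and all numeric values are made up for illustration; real values should reflect the actual data, and the statements need appropriate authority on the catalog tables.

```sql
-- Table-level statistics: row count and number of pages
UPDATE SYSIBM.SYSTABLES
   SET CARDF  = 1000000,
       NPAGES = 50000
 WHERE CREATOR = 'MYSCHEMA'
   AND NAME    = 'TABLEA';

-- Column-level statistics: number of distinct values
UPDATE SYSIBM.SYSCOLUMNS
   SET COLCARDF = 2500
 WHERE TBCREATOR = 'MYSCHEMA'
   AND TBNAME    = 'TABLEA'
   AND NAME      = 'COL1';
```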

Enhanced Techniques for Performance Improvement

Influencing access path – Add extra Predicate
 DB2 evaluates the access path based on the information available in the catalog tables
 Wrong or missing catalog information may result in the selection of a wrong access path
 A wrong access path may mean that the wrong index was selected, or that an index was selected where a tablespace scan would be more effective
 Code extra predicates or change a predicate to make DB2 use a different access path
 Adding an extra predicate may also influence the selection of the join method
 With an extra predicate, nested loop join may be selected because DB2 assumes that more rows will be filtered out. The proper type of predicate to add is WHERE T1.C1 = T1.C1
 Hybrid join is a costlier method, and outer joins do not use hybrid join. So if DB2 chooses hybrid join, convert the inner join to an outer join and add extra predicates to remove the unneeded rows.
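
The extra-predicate technique can be sketched as below. T1, T2 and their columns are placeholder names in the style of the text above; the added predicate is logically redundant because the join predicate already rejects NULL values of T1.C1.

```sql
-- Original join
SELECT T1.C1, T1.C2, T2.C3
  FROM T1, T2
 WHERE T1.C1 = T2.C1
   AND T1.C2 = :HV;

-- Same result, with a redundant local predicate on T1 that lowers
-- the estimated number of qualifying T1 rows and may favor nested loop join
SELECT T1.C1, T1.C2, T2.C3
  FROM T1, T2
 WHERE T1.C1 = T2.C1
   AND T1.C2 = :HV
   AND T1.C1 = T1.C1;
```

Compare the PLAN_TABLE rows before and after to confirm that the join method actually changed.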

General recommendations
Make sure that

The queries are as simple as possible

Unused rows are not fetched. Filtering to be done by DB2 not in the application
program.

Unused columns are not selected

There is no unnecessary ORDER BY or GROUP BY Clause

Use page level locking and try to minimize lock duration.

Mass updates should be avoided.

Try to use indexable predicates wherever possible

Do not code redundant predicates

Make sure that the declared length of the host variable is not greater than the length attribute of the data column.

If efficient indexes are available on the tables in the subquery, a correlated subquery will perform better. Otherwise a non-correlated subquery will perform better.

If there are multiple subqueries, make sure that they are ordered in an efficient manner.
Summary

Filter Factor (FF)

 The optimizer assigns a “Filter Factor” (FF) to each predicate or predicate combination
– A number between 0 and 1 that provides the estimated filtering percentage
 FF of 0.25 means 25% of the rows are estimated to qualify
– Calculated using available statistics from catalog tables
• Column cardinality (COLCARDF)
• HIGH2KEY/LOW2KEY
• Frequency statistics (FREQUENCYF in SYSCOLDIST)
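
As a rough illustration, for an equality predicate (COL = value) with no frequency statistics and an assumed uniform distribution, the filter factor is estimated as 1 / COLCARDF, so a column with 4 distinct values gives 0.25. The query below reads that estimate from the catalog; the schema MYSCHEMA and table TABLEA are hypothetical.

```sql
-- Estimated filter factor of "COL = value" for each column of one table.
-- COLCARDF is -1 when statistics have not been collected, hence the CASE.
SELECT NAME,
       COLCARDF,
       CASE WHEN COLCARDF > 0
            THEN 1E0 / COLCARDF
       END AS EST_FF_EQUAL
  FROM SYSIBM.SYSCOLUMNS
 WHERE TBCREATOR = 'MYSCHEMA'
   AND TBNAME    = 'TABLEA'
 ORDER BY NAME;
```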

RUNSTATS
 RUNSTATS is a DB2 utility that collects the catalog statistics used by the optimizer, along with statistics on the organization of an object (table space, table, index, column)
 Accurate Statistics are a critical factor for performance of the SQL.
 Updates the DB2 catalog and reports the statistics.
 Some catalog statistics updated by RUNSTATS for use by the optimizer can be
manually updated with appropriate authorization (SYSADM).

Stats Used for Access Path Determination
 SYSCOLDIST
– COLVALUE
– FREQUENCYF
– TYPE
– CARDF
– COLGROUPCOLNO
– NUMCOLUMNS
 SYSCOLUMNS
– COLCARDF
– HIGH2KEY
– LOW2KEY
 SYSINDEXES
– CLUSTERING
– CLUSTERRATIOF
– FIRSTKEYCARDF
– FULLKEYCARDF
– NLEAF
– NLEVELS

 SYSINDEXPART
– LIMITKEY
 SYSTABLES
– CARDF
– EDPROC
– NPAGES
– PCTROWCOMP

Stage 1 vs. Stage 2 Predicates
 Stage 1 predicates may use an available Index.
 Stage 2 predicates cannot use any Index.

Wherever possible, prefer Stage 1 (sargable) predicates in the WHERE clause. These are conditions that can be evaluated in the Data Manager of DB2, before the results are passed to the Relational Data System (RDS). The more conditions that can be evaluated early on, the more efficient data retrieval is.
Stage 1 -
Evaluated by the DM (Data Manager)
For index use, a suitable index must exist
Reduces I/O from disk and buffer pool activity
Stage 2 -
Evaluated by the RDS (Relational Data System)
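
A sketch of turning a Stage 2 predicate into a Stage 1, indexable one. It assumes a hypothetical EMP table with an index on HIREDATE, and it follows the DB2 V8-era rule that a scalar function applied to a column makes the predicate Stage 2; later DB2 versions may handle some of these cases differently.

```sql
-- Function applied to the column: evaluated by the RDS, index on HIREDATE not matched
SELECT EMPNO
  FROM EMP
 WHERE YEAR(HIREDATE) = 2005;

-- Equivalent range on the bare column: can be evaluated by the Data Manager
-- and can match the index
SELECT EMPNO
  FROM EMP
 WHERE HIREDATE BETWEEN '2005-01-01' AND '2005-12-31';
```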

How does the optimizer calculate Filter Factors?
 The lower the filter factor, the lower the cost and, in general, the more efficient the query will be.

Explain
 EXPLAIN is a tool that shows the access path used by a query.
 The results of EXPLAIN are stored in the table PLAN_TABLE.
 EXPLAIN can be run for a single query outside a program or for all queries in a program.
 For all queries in a program: use the EXPLAIN(YES) parameter during BIND.
[Figure: Sample Explain Table Output]

 EXPLAIN can be run at bind time using the parameter value EXPLAIN(YES)
 A PLAN_TABLE must already exist under the OWNER parameter value on BIND, or under the current SQLID for dynamic SQL
 EXPLAIN can also be run against dynamic SQL
DELETE FROM PLAN_TABLE WHERE QUERYNO = 999;
EXPLAIN PLAN SET QUERYNO = 999 FOR
<SELECT STATEMENT GOES HERE - USE ? IN PLACE OF HOST VARIABLES>;
SELECT * FROM PLAN_TABLE
WHERE QUERYNO = 999
ORDER BY QBLOCKNO, PLANNO;
 Don’t forget to Explain everything
 PLAN_TABLE is where all the tuning starts
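
A filled-in version of the sequence above, as a sketch. The statement being explained uses the hypothetical TABLEA from the later examples, and the final query picks out the PLAN_TABLE columns that are usually read first instead of SELECT *.

```sql
-- Clear any earlier rows for this query number
DELETE FROM PLAN_TABLE WHERE QUERYNO = 999;

-- Explain one statement; ? stands in for the host variable
EXPLAIN PLAN SET QUERYNO = 999 FOR
  SELECT COL1
    FROM TABLEA
   WHERE COL2 = ?;

-- Read back the access path, one row per plan step
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY, PREFETCH
  FROM PLAN_TABLE
 WHERE QUERYNO = 999
 ORDER BY QBLOCKNO, PLANNO;
```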

Interpreting the Plan Table/Analyzing Access Paths

 Non-Matching Index scan (ACCESSTYPE = I and MATCHCOLS = 0)
Scans all leaf pages of the index selected by the optimizer, selecting one or more qualifying rows.
The scan can be with or without data access.
The predicate does not match the leading columns of the index.
SELECT COUNT(*) FROM TABLEA
SELECT MAX(COL1) FROM TABLEA
SELECT COL1 FROM TABLEA WHERE COL2 = :HV

[Diagram: Non-Matching Index Scan. Index tree with a root page, non-leaf pages 1-2 and leaf pages 1-4]

 Matching Index scan (MATCHCOLS > 0)
Scans one or more leaf pages of the index selected by the optimizer, selecting one or more qualifying rows. The index match is based on one or more key columns of the selected index. The scan can be with or without data access.
The predicates match the leading columns of the index.
SELECT COL1 FROM TABLEA WHERE COL2 = :HV
SELECT COL2 FROM TABLEA WHERE COL1 = :HV (host variable length longer than COL1)

[Diagram: Matching Index Scan. Index tree with a root page, non-leaf pages 1-2 and leaf pages 1-4, pointing to data pages]

One Fetch Index Access (ACCESSTYPE = I1)
In certain circumstances this can be the most efficient access path in DB2.
It may need to access only one leaf page, but may still need to traverse the index tree.
It requires that only one row be retrieved (MIN or MAX column function).
SELECT MIN(COL1) FROM TABLEA
SELECT MIN(COL2) FROM TABLEA WHERE COL1 = :HV (will still be I1, but with MATCHCOLS = 1)

 IN List Index Scan (ACCESSTYPE = N)

Scans one or more leaf pages of the index selected by the optimizer, selecting one or more qualifying rows.

The index match is based on one or more key columns of the selected index.

At least one key column incorporates an IN list.
SELECT * FROM TABLEA WHERE COL1 = :HV
AND COL2 IN ('A','B','C')
SELECT COL3 FROM TABLEA WHERE COL1 IN ('12345','56789')
AND COL2 = :HV

Table-space scan (ACCESSTYPE = R)
A scan against a partitioned tablespace, or a simple tablespace with one table, reads all pages, including pages that are empty or contain only deleted rows.
A scan against a simple tablespace containing more than one table also scans tables in that tablespace that are not referenced in the query.
A scan against a segmented tablespace reads only pages containing data.
SELECT * FROM TABLEA
SELECT * FROM TABLEA WHERE COL6 = 0
SELECT * FROM TABLEA WHERE COL1 <> :HV

[Diagram: Tablespace Scan across data pages 1-4]

DB2 I/O Assisted Mechanisms
 Prefetch
Reads data ahead in anticipation of its use. Prefetch can read up to 32 4K pages for applications, and up to 64 4K pages for utilities.
 Sequential Prefetch
In DB2 UDB for OS/390, a mechanism that triggers consecutive asynchronous I/O operations. Pages are fetched before they are required, and several pages are read with a single I/O operation. This action is determined at bind time and can be detected by a value of “S” in the PREFETCH column of the plan table. If both index and data are required for the SQL, prefetch can occur for both object types.
 Dynamic Prefetch
Uses the same approach as sequential prefetch, but the mechanism is triggered at run time if DB2 detects that access to the index and/or data pages is sequential in nature even though the pages are distributed in a nonconsecutive manner.
 List Prefetch
An access method that takes advantage of prefetching even in queries that do not access data sequentially. This is done by scanning the index and collecting RIDs in advance of accessing any data pages. These RIDs are then sorted in page number order, and the data is prefetched using this list.

DB2 Explain Columns
 QUERY Number (QUERYNO) –
Identifies the SQL statement in the PLAN_TABLE (any number you assign - the example uses the numeric part of the userid)
 BLOCK (QBLOCKNO) –
Query block within the query number, where 1 is the top-level SELECT. Subselects, unions, materialized views, and nested table expressions will show multiple query blocks. Each QBLOCK has its own access path.
 PLAN (PLANNO) –
Indicates the order in which the tables will be accessed

 METHOD –
Shows which JOIN technique was used:
00- First table accessed, continuation of previous table accessed, or not used.
01- Nested Loop Join. For each row of the present composite table, matching rows of a
new table are found and joined
02- Merge Scan Join. The present composite table and the new table are scanned in the
order of the join columns, and matching rows are joined.
03- Sorts needed by ORDER BY, GROUP BY, SELECT DISTINCT, UNION, a quantified
predicate, or an IN predicate. This step does not access a new table.
04- Hybrid Join. The current composite table is scanned in the order of the join-column
rows of the new table. The new table accessed using list prefetch.

 TNAME –
name of the table whose access this row refers to. Either a table in the FROM clause, or
a materialized VIEW name.
 TYPE (ACCESS TYPE) –
indicates whether an index was chosen:
 I = INDEX
 R = TABLESPACE SCAN (reads every data page of the table once)
 I1 = ONE-FETCH INDEX SCAN
 N = INDEX USING IN LIST
 M = MULTIPLE INDEX SCAN
 MX = NAMES ONE OF INDEXES USED
 MI = INTERSECT MULT. INDEXES
 MU = UNION MULT. INDEXES

 MC (MATCHCOLS) - number of index key columns used in a matching index scan
 ANAME (ACCESSNAME) - name of the index
 IO (INDEXONLY) - Y = index alone satisfies the data request
 N = table must be accessed also
 8 sort flags in two groups (new table and composite table): each group has four indicators showing why the sort is necessary. Usually, a sort will cause the statement to run longer.
 UNIQ - DISTINCT option or UNION was part of the query, or IN list for subselect
 JOIN - sort for join
 ORDERBY - ORDER BY option was part of the query
 GROUPBY - GROUP BY option was part of the query

Sort flags for 'new' (inner) tables:
 SNU - SORTN_UNIQ - Y = remove duplicates, N = no sort
 SNJ - SORTN_JOIN - Y = sort table for join, N = no sort
 SNO - SORTN_ORDERBY - Y = sort for order by, N = no sort
 SNG - SORTN_GROUPBY - Y = sort for group by, N = no sort
Sort flags for 'composite' (outer) tables:
 SCU - SORTC_UNIQ - Y = remove duplicates, N = no sort
 SCJ - SORTC_JOIN - Y = sort table for join, N = no sort
 SCO - SORTC_ORDERBY - Y = sort for order by, N = no sort
 SCG - SORTC_GROUPBY - Y = sort for group by, N = no sort
 PF - PREFETCH - Indicates whether data pages were read in advance by prefetch.
 S = pure sequential PREFETCH
 L = PREFETCH through a RID list
 Blank = unknown, or not applicable

 MIXOPSEQ: The sequence number of a step in a multiple index operation.
 PAGE_RANGE: Whether the table qualifies for page range screening, so that plans scan only the partitions that are needed. Y = Yes; blank = No
 COLUMN_FN_EVAL: When an SQL aggregate function is evaluated. R = while the data is being read from the table or index; S = while performing a sort to satisfy a GROUP BY clause; blank = after data retrieval and after any sorts.
 QBLOCK_TYPE: For each query block, an indication of the type of SQL operation performed.
 JOIN_TYPE: The type of join:
F = FULL OUTER JOIN
L = LEFT OUTER JOIN
S = STAR JOIN
blank = INNER JOIN or no join
A RIGHT OUTER JOIN is converted to a LEFT OUTER JOIN when you use it, so JOIN_TYPE contains L.
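
Once EXPLAIN has populated PLAN_TABLE, the columns described above can be used to shortlist steps worth a closer look. The query below is one possible sketch: it flags tablespace scans, non-matching index scans and any step that needed a sort. The selection criteria are an assumption about what is "suspicious", not a fixed rule.

```sql
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, PREFETCH
  FROM PLAN_TABLE
 WHERE ACCESSTYPE = 'R'                          -- tablespace scan
    OR (ACCESSTYPE = 'I' AND MATCHCOLS = 0)      -- non-matching index scan
    OR 'Y' IN (SORTN_UNIQ, SORTN_JOIN, SORTN_ORDERBY, SORTN_GROUPBY,
               SORTC_UNIQ, SORTC_JOIN, SORTC_ORDERBY, SORTC_GROUPBY)
 ORDER BY QUERYNO, QBLOCKNO, PLANNO;
```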

Performance Tools Overview
 BMC APPTUNE
 BMC SQL EXPLORER

BMC APPTUNE
 Use Option 4 - Performance Products
 Use Option Q - Apptune and Index components
 Use Option 1 - SQL Workload
Setting Options in BMC APPTUNE
 Use Workload Analysis
 Choose 6. Data source and 5. Time interval

Viewing Reports in APPTUNE
 Use various options to generate reports
 Reports are generated for programs

Viewing SQLs in APPTUNE
 Use Option S - to show SQLs
 Use Option X - to EXPLAIN SQLs

Example of EXPLAIN Result in BMC APPTUNE
 Cost calculated by the optimizer
 Matching index scan performed
 Matching columns used by the index
 Table and index names used by the access path

BMC SQL EXPLORER
 Use Option S - SQL Explorer
 Use Option 1 - Explain

Setting Options in BMC SQL EXPLORER
 Plans, packages or DBRMs can be analyzed
 Package options
 Analysis run in batch mode

More references
 BMC SQL EXPLORER.doc
 steps to get to Apptune.doc

Run-through of an Actual SQL Tuning Exercise

Set up Development Environment

Example of the SQL Tuning Process - Development
Step 1.3: Import Statistics From Production to Development
 Use Option 7 - Migrate Access Path Statistics

Step 2: Identification of Problem SQL – Identify problem SQL
 SQL statement being analysed.
 The tool warns that cardinality is missing.
 A predicate mismatch is also detected.

Step 2: Identification of Problem SQL – Check SQL Best Practices
 No tool is available for checking best practices. This needs to be checked manually using the SQL Best Practices document already published.
 A snippet of the related best practice from the SQL Guidelines document.

Step 3: SQL Optimization – SQL Rewrite
 No tool is available to rewrite SQL statements automatically. The statement needs to be rewritten manually, and the subsequent steps for checking the new access path then need to be performed.

Step 3: SQL Optimization – Compare Access paths
 Access paths can be compared. Notice the change in the estimated indicative cost. A different index is being used now.

Bibliography
 Redbooks at www.redbooks.ibm.com
DB2 UDB for z/OS V8 Everything you ever wanted to know… SG24-6079
DB2 UDB for z/OS V8 Performance Topics SG24-6465
DB2 for z/OS Application Design for High Performance and Availability SG24-7134
10/05
 DB2 UDB for z/OS V8 Application Programming and SQL Guide
 SQL Tuning Best Practices & Guidelines Document
In the IM Project & Document Database Process Document section
1) Database 'IM Project and Document Database'
2) Select the ‘Process Document’ Section
3) Select ‘By Process Category’
4) Select ‘Best Practices’
5) View ‘Table of Contents’
6) Select document 'Database Access - SQL Tuning Best Practice & Guidelines’
