Optimizing queries is essential for improving database performance. Analyse queries using the query execution plan, create clustered and non-clustered indexes, and create indexed views.
Query Optimization in SQL Server
1. Query Optimization
• We develop and deploy web apps. An app is much faster in the development
environment and on the test server, but its performance often degrades
after deployment to production.
• When we investigated, we discovered that the production database
was performing extremely slowly when the application tried to
access or update data.
• Looking into the database, we found that the tables had grown
large; some contained hundreds of thousands of rows. A submission
process that used to take only 2-3 seconds on the test server before
the production launch was now taking 5 long minutes to complete.
• This is where query optimization comes in.
2. What is Indexing?
• A database index is a data structure that improves the speed of data
retrieval operations on a database table at the cost of additional
writes and storage space to maintain the index data structure.
• Indexes are used to quickly locate data without having to search
every row in a database table every time a database table is
accessed.
• Indexes can be created using one or more columns of a database
table, providing the basis for both rapid random lookups and efficient
access of ordered records.
4. Clustered & Non-Clustered Index
• A clustered index is created automatically when you add a Primary
Key column to a table, e.g., ProductID.
• Only one clustered index can be created per table.
• Non-clustered indexes are created on non-primary-key columns.
• It is advisable to have a maximum of about 5 non-clustered indexes per table.
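As a minimal sketch, assuming a hypothetical Products table, the two index types above can be created explicitly like this:

```sql
-- Hypothetical table; the PRIMARY KEY constraint creates the clustered index.
CREATE TABLE dbo.Products (
    ProductID   INT           NOT NULL,
    ProductName NVARCHAR(100) NOT NULL,
    CategoryID  INT           NOT NULL,
    CONSTRAINT PK_Products PRIMARY KEY CLUSTERED (ProductID)
);

-- A non-clustered index on a non-primary-key column used in searches.
CREATE NONCLUSTERED INDEX IX_Products_CategoryID
    ON dbo.Products (CategoryID);
```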
5. Which Columns Should Have a Non-Clustered Index?
• Columns frequently used in the search criteria
• Columns used to join other tables
• Columns used as foreign key fields
• Columns with high selectivity (a given value in the column returns a
low percentage, roughly 0-5%, of the table's total rows)
• Columns used in the ORDER BY clause
• Columns of type XML (primary and secondary XML indexes need to be
created; more on this in the coming articles)
6. Index Fragmentation
• Index fragmentation is a situation where index pages split due to
heavy insert, update, and delete operations on the tables in the
database. If indexes are highly fragmented, either
scanning/seeking the indexes takes much longer, or the indexes are not
used at all (resulting in a table scan) while executing queries. Thus, data
retrieval operations perform slowly.
7. Types of Index Fragmentation
• Internal Fragmentation: Occurs due to data deletion/update
operations in the index pages which end up in the distribution of data
as a sparse matrix in the index/data pages (creates lots of empty rows
in the pages). Also results in an increase of index/data pages that
increases query execution time.
• External Fragmentation: Occurs due to data insert/update operations
in the index/data pages which end up in page splitting and allocation
of new index/data pages that are not contiguous in the file system.
That reduces performance in determining the query result where
ranges are specified in the "where" clauses.
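Both kinds of fragmentation can be measured with the sys.dm_db_index_physical_stats dynamic management function; for example:

```sql
-- Check fragmentation for all indexes in the current database.
-- avg_fragmentation_in_percent ~ external (logical) fragmentation
-- avg_page_space_used_percent  ~ internal fragmentation (page fullness)
SELECT
    OBJECT_NAME(ips.object_id)        AS TableName,
    i.name                            AS IndexName,
    ips.avg_fragmentation_in_percent,
    ips.avg_page_space_used_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

Note that avg_page_space_used_percent is only populated in SAMPLED or DETAILED mode, not in the default LIMITED mode.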
8. Defragmenting Indexes
Reorganize indexes: execute the following command to do this:
ALTER INDEX ALL ON TableName REORGANIZE
Rebuild indexes: execute the following command to do this:
ALTER INDEX ALL ON TableName REBUILD WITH (FILLFACTOR = 90, ONLINE = ON)
When to reorganize and when to rebuild indexes?
• You should "reorganize" indexes when the external fragmentation
value for the corresponding index is between 10% and 15% and the internal
fragmentation value is between 60% and 75%. Otherwise, you should rebuild
the indexes.
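A minimal sketch of automating this decision for a hypothetical dbo.Orders table, using measured external fragmentation:

```sql
-- Reorganize or rebuild all indexes on a hypothetical dbo.Orders table
-- depending on the worst measured external fragmentation.
DECLARE @frag FLOAT;

SELECT @frag = MAX(avg_fragmentation_in_percent)
FROM sys.dm_db_index_physical_stats(
        DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'LIMITED');

IF @frag BETWEEN 10 AND 15
    ALTER INDEX ALL ON dbo.Orders REORGANIZE;   -- light-touch, always online
ELSE IF @frag > 15
    ALTER INDEX ALL ON dbo.Orders
        REBUILD WITH (FILLFACTOR = 90, ONLINE = ON);
```

Note that ONLINE = ON is only supported in certain SQL Server editions; drop that option on editions that lack it.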
9. Move T-SQL from App to Database
• Often we use an ORM that generates all the SQL for us on the fly.
• Moving SQL out of the application and implementing it using Stored
Procedures/Views/Functions/Triggers will enable you to eliminate
duplicate SQL in your application. This will also ensure
reusability of your T-SQL code.
• Implementing all T-SQL using database objects will enable you to
analyse the T-SQL more easily to find the inefficient code that
is responsible for the slow performance. Also, this will let you
manage your T-SQL code from a central point.
• Doing this will also enable you to refactor your T-SQL code to take
advantage of some advanced indexing techniques.
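As a sketch (the procedure, table, and parameter names are hypothetical), an inline application query can be moved into a stored procedure like this:

```sql
-- Hypothetical procedure replacing an inline query in the application.
CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
    @CustomerID INT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT OrderID, OrderDate, TotalAmount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
END;
```

The application then calls `EXEC dbo.usp_GetOrdersByCustomer @CustomerID = 42;` instead of embedding the SELECT, so the query lives in one central, analysable place.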
10. Identify Inefficient T-SQL, Refactor, and
Apply Best Practices
• Avoid unnecessary columns in the SELECT list and unnecessary tables in join
conditions.
• Do not use the COUNT() aggregate in a subquery to do an existence check;
use EXISTS instead.
• Avoid joins between columns of different data types.
• Write T-SQL using a "set-based approach" rather than a "procedural
approach" (the use of cursors or UDFs to process rows in a result set).
• Avoid dynamic SQL.
• Avoid the use of temporary tables.
• Implement a lazy-loading strategy for large objects.
• Avoid the use of triggers.
• Use views for reusing complex T-SQL blocks. Do not use views that retrieve
data from a single table only.
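For instance, the existence-check rule above can be illustrated like this (table and column names are hypothetical):

```sql
-- Inefficient: counts every matching row just to test existence.
IF (SELECT COUNT(*) FROM dbo.Orders WHERE CustomerID = 42) > 0
    PRINT 'Customer has orders';

-- Better: EXISTS can stop as soon as the first matching row is found.
IF EXISTS (SELECT 1 FROM dbo.Orders WHERE CustomerID = 42)
    PRINT 'Customer has orders';
```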
11. Query Execution Plan
• Whenever an SQL statement is issued, the SQL Server engine first
determines the best possible way to execute it.
• The Query Optimizer (the component that generates the optimal query
execution plan before executing the query) uses information
such as data distribution statistics, index structure, and other
metadata to analyse several possible execution plans and
finally selects the one that is likely to be the best execution plan most of
the time.
• We can use SQL Server Management Studio to preview and analyse
the estimated execution plan for the query that you are going to issue.
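In SSMS, Ctrl+L displays the estimated execution plan; the same information can be requested from T-SQL, for example (the table is hypothetical):

```sql
-- Return the estimated execution plan as XML instead of running the query.
SET SHOWPLAN_XML ON;
GO
SELECT OrderID, OrderDate
FROM dbo.Orders            -- hypothetical table
WHERE CustomerID = 42;
GO
SET SHOWPLAN_XML OFF;
GO
```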
13. Information Available in the Query Execution Plan
• Table Scan: Occurs when the corresponding table does not have a clustered index.
Most likely, creating a clustered index or defragmenting index will enable you to get
rid of it.
• Clustered Index Scan: Sometimes considered equivalent to Table Scan. Takes place
when a non-clustered index on an eligible column is not available. Most of the
time, creating a non-clustered index will enable you to get rid of it.
• Hash Join: The most expensive joining methodology. This takes place when the
joining columns between two tables are not indexed. Creating indexes on those
columns will enable you to get rid of it.
• Nested Loops: In most cases, this happens when a non-clustered index does not
include (cover) a column that is used in the SELECT list. In this case, for
each entry in the non-clustered index, the database server has to seek
into the clustered index to retrieve the other column values specified in the
SELECT list. Creating a covering index will enable you to get rid of it.
• RID Lookup: Takes place when you have a non-clustered index but the same table
does not have any clustered index. In this case, the database engine has to look up
the actual row using the row ID, which is an expensive operation. Creating a
clustered index on the corresponding table would enable you to get rid of it.
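A covering index that eliminates such lookups can be sketched like this (index, table, and column names are hypothetical):

```sql
-- The key column serves the WHERE clause; the INCLUDE columns cover the
-- SELECT list, so the engine never has to look back into the clustered index.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_Covering
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, TotalAmount);
```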
14. Steps in T-SQL Refactoring
• Analyse the indexes
• Analyse the query execution plan
• Implement some best practices
• Implement computed columns and create indexes on them if necessary
• Create views and indexed views if necessary
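The computed-column step can be sketched like this (the table and expression are hypothetical):

```sql
-- Persist a computed column and index it so filters on the derived
-- value can seek the index instead of recomputing the expression per row.
ALTER TABLE dbo.OrderItems
    ADD LineTotal AS (Quantity * UnitPrice) PERSISTED;

CREATE NONCLUSTERED INDEX IX_OrderItems_LineTotal
    ON dbo.OrderItems (LineTotal);
```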
15. Indexed Views
• Ordinary views don't give you any significant performance benefit.
• Views are nothing but compiled queries; a view can't
remember any result set.
• We can create an indexed view so that it can remember the result set of
the SELECT query it is composed of:
CREATE VIEW dbo.vOrderDetails
WITH SCHEMABINDING
AS
SELECT...
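A complete sketch of the idea (the table and columns are hypothetical; the unique clustered index is what materializes the view's result set):

```sql
-- SCHEMABINDING and two-part table names are required for indexed views.
CREATE VIEW dbo.vOrderDetails
WITH SCHEMABINDING
AS
SELECT o.OrderID,
       o.CustomerID,
       o.OrderDate
FROM dbo.Orders AS o;
GO

-- The unique clustered index persists the view's result set on disk.
CREATE UNIQUE CLUSTERED INDEX IX_vOrderDetails
    ON dbo.vOrderDetails (OrderID);
```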
16. De-normalization
• If you are designing a database for an OLAP system (an Online
Analytical Processing system, mainly a data warehouse
optimized for read-only queries), you should apply heavy de-normalization
and indexing in your database. That is, the same data will be
stored across different tables, but the reporting and data analytical
queries will run much faster.
• If you are designing a database for an OLTP system (an Online
Transaction Processing system, mainly a transactional system
where mostly data update operations take place [that is,
INSERT/UPDATE/DELETE]), implement at least the 1st, 2nd, and 3rd
normal forms so that you can minimize data redundancy, and thus
minimize data storage and increase manageability.
17. History Tables
• In an application, if some data retrieval operation (say,
reporting) runs periodically, and the process
involves large tables with a normalized structure, we
can consider moving data periodically from the transactional normalized
tables into a de-normalized, heavily indexed, single history table.
• We can also create a scheduled operation on the database server that
populates this history table at a specified time each day.
• If we do this, the periodic data retrieval operation then has to read
data only from a single table that is heavily indexed, and the
operation will perform much faster.
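The daily population step can be sketched as follows (all table and column names are hypothetical; scheduling itself would be done with SQL Server Agent):

```sql
-- Copy yesterday's orders into a flat, de-normalized history table.
INSERT INTO dbo.OrderHistory (OrderID, CustomerName, OrderDate, TotalAmount)
SELECT o.OrderID,
       c.CustomerName,      -- de-normalized: customer name stored inline
       o.OrderDate,
       o.TotalAmount
FROM dbo.Orders AS o
JOIN dbo.Customers AS c
    ON c.CustomerID = o.CustomerID
WHERE o.OrderDate >= DATEADD(DAY, -1, CAST(GETDATE() AS DATE))
  AND o.OrderDate <  CAST(GETDATE() AS DATE);
```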