2. Background
An RDBMS is not based on a conventional programming paradigm
It operates on the mathematical concept of sets
Sets are collections that support union, intersection, difference (–)
and cross-product (X) operations
Most misunderstood concept – are sets ordered?
No, sets do not guarantee order – very important to know when working
with an RDBMS
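A minimal sketch of the point above (the `users` table and its columns are illustrative): without an ORDER BY, the result is a set and the row order is an implementation detail.

```sql
-- Order here is whatever the engine finds convenient; it can change
-- between runs, plans, or after an index rebuild.
SELECT first_name, last_name
FROM users;

-- Only an explicit ORDER BY guarantees a specific row order.
SELECT first_name, last_name
FROM users
ORDER BY last_name, first_name;
```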
11/7/2013
2
3. Strategic Imperatives
The approach should provide a stable implementation and ease of
management
Provide the ability to adopt new design paradigms going forward
Ease maintenance of prior code through documentation and
code clarity
4. What’s new in SQL Server 2012
Editions & licensing
Three editions – Standard, Business Intelligence and Enterprise
xVelocity – is Microsoft SQL Server's family of memory-optimized and
in-memory technologies. These are next-generation technologies built
for extreme speed on modern hardware systems with large memories
and many cores
xVelocity In-Memory Analytics Engine (used in PowerPivot and
Analysis Services)
xVelocity Memory-Optimized Columnstore Index (used in the SQL
Server database).
Self-service BI – Power View
Data compression – high performance
Data Quality – maintain the quality of data and ensure that the data is
suited for business use
5. xVelocity – ColumnStore
Column store - In a column store, values from a single column (from multiple
rows) are stored contiguously, potentially in a compressed form
Relational database management systems traditionally store data in row-wise
fashion. The values comprising one row are stored contiguously on a page.
We sometimes refer to data stored in row-wise fashion as a row store
Columnstore index – In SQL Server, a columnstore index is data stored in
column-wise fashion that can be used to answer a query just like data in any
other type of index
A columnstore index appears as an index on a table when examining
catalog views or the Object Explorer in Management Studio
The query optimizer considers the columnstore index as a data source for
accessing data just like it considers other indexes when creating a query
plan
More information:
http://social.technet.microsoft.com/wiki/contents/articles/3540.sql-servercolumnstore-index-faq-en-us.aspx
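A minimal sketch of creating one (table and column names are illustrative). SQL Server 2012 supports the nonclustered columnstore index; note that in this release a table with such an index becomes read-only until the index is dropped or disabled.

```sql
-- Create a nonclustered columnstore index over the columns a
-- typical warehouse query touches.
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_CS
ON dbo.FactSales (OrderDateKey, ProductKey, SalesAmount);
```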
6. Power View
Power View is a feature of SQL Server 2012 Reporting Services that is an
interactive data exploration, visualization, and presentation experience
Provides intuitive ad-hoc reporting for business users such as data analysts,
business decision makers, and information workers
Users can easily create and interact with views of data from data models
based on PowerPivot workbooks published in a PowerPivot Gallery, or tabular
models deployed to SQL Server 2012 Analysis Services (SSAS) instances
Power View is a browser-based Silverlight application launched from
SharePoint Server 2010 that enables users to present and share insights with
others in their organization through interactive presentations
7. Data Compression
DBA can compress tables and indexes to conserve storage space at a slight
CPU cost. One of the main design goals of data compression was to shrink
data warehouse fact tables
SQL Server provides two methods, Page and Row compression, to reduce data
storage on disk and speed I/O performance by reducing the amount of I/O
required for transaction processing. Page and row compression work in
different, yet complementary, ways
Page compression uses an algorithm called “deduplication.” When
deduplicating, as the name implies, SQL Server looks for duplicate values that
appear again and again within the data page
Using page compression, SQL Server can remove such duplicate values within
a data page by replacing each duplicate value with a tiny pointer to a single
appearance of the full value
By comparison, row compression does not actually use a compression
algorithm per se. Instead, when row compression is enabled, SQL Server
simply removes any extra, unused bytes in a fixed data type column, such as a
CHAR(50) column
Page and row compression are not mutually exclusive: enabling page
compression automatically applies row compression as well
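A short sketch of enabling it (the table name is illustrative). SQL Server also ships a stored procedure for estimating the savings before committing to a rebuild.

```sql
-- Estimate the space savings first.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'FactSales',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- Rebuild the table with page compression
-- (row compression is included automatically).
ALTER TABLE dbo.FactSales
REBUILD WITH (DATA_COMPRESSION = PAGE);
```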
8. Data Quality Services
Aggregating data from different sources that use different data standards can
result in inconsistent data, as can applying an arbitrary rule or overwriting
historical data. Incorrect data affects the ability of a business to perform
Features include
Data Cleansing – modification of incorrect or incomplete data
Matching – identification of semantic duplicates
Reference Data Services – verification of quality of data using
reference data provider
Profiling – analysis of a data source to provide insight into the quality of
the data
Monitoring – tracking and determination of the state of data quality
activities
Knowledge Base – Data Quality Services is a knowledge-driven
solution that analyzes data based upon knowledge that you build with
DQS.
9. Layered Coding Approach
Standardized way for coding
Code level comments
Layered coding – database, business and user interface
Standardized approach at both the application and database level
The primary focus of this presentation is database design and coding
10. Resource Utilization
Non-scalable code
Only so many people can access at the same time
This is caused by inefficient use of resources
Resource locking
Unnecessary use of locks, or transactions that were never
committed or rolled back
11. Unstructured Data – Paradigm Shift
Non-scalable code
Not only reduces performance
Increases maintenance
Requires more space
Disk space
Disk fragmentation
Requires continuous log file
management, sizing and maintenance
Increased maintenance cost for the
database
12. Basic Design – that we forget
Defining a table
Key points
Identity columns
Define a primary key (if one is not easily identified, use an
identity column)
Define indices
Indexes and primary keys make a huge performance difference
They also help when joining other tables
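The points above can be sketched as a minimal table definition (table and column names are illustrative):

```sql
-- Identity column used as the primary key, plus a supporting
-- index on the column most often used in joins and lookups.
CREATE TABLE dbo.Orders
(
    order_id    INT IDENTITY(1,1) NOT NULL,
    customer_id INT NOT NULL,
    order_date  DATETIME NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (order_id)
);

CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (customer_id);
```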
13. Truncate or Delete
Truncate vs. Delete
Using truncate is better than
delete
Why?
What do you think is better for import
tables?
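A quick sketch of the difference (the staging table name is illustrative). TRUNCATE deallocates whole pages and is minimally logged, which is why it is usually the better choice for clearing import/staging tables; note it cannot be used on a table referenced by a foreign key.

```sql
-- DELETE removes and logs rows one at a time, fires delete
-- triggers, and keeps the current identity seed.
DELETE FROM dbo.ImportStaging;

-- TRUNCATE deallocates the data pages, is minimally logged,
-- and resets the identity seed back to its original value.
TRUNCATE TABLE dbo.ImportStaging;
```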
14. Looping in SQL
Cursors
Necessary evil
Resource hog – disk and network bandwidth (results transmitted
row by row)
Use a read-only cursor if there is no update, with the fast-forward
option and auto-fetch, to get some performance gain
http://technet.microsoft.com/en-us/library/aa172573(SQL.80).aspx
Avoid cursors as much as possible
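When a cursor truly cannot be avoided, the lightest-weight form is a FAST_FORWARD cursor (read-only, forward-only). A minimal sketch, with illustrative table and column names:

```sql
DECLARE @name VARCHAR(100);

-- FAST_FORWARD implies READ_ONLY and FORWARD_ONLY.
DECLARE cur CURSOR FAST_FORWARD FOR
    SELECT name FROM dbo.Orders_Archive;

OPEN cur;
FETCH NEXT FROM cur INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;                      -- per-row work goes here
    FETCH NEXT FROM cur INTO @name;
END

CLOSE cur;
DEALLOCATE cur;
```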
15. Transactions
Using Transactions
Helpful in recovering from a problem / unexpected situation
Transactions should start as late as possible in the procedure
and end as early as possible to reduce the locking time
Make sure transactions are always rolled back or committed
Handle error using @@error and roll back transactions in
such a case
Use the WITH MARK option to add a name to the transaction log
that can be used as a restore point if needed
http://msdn.microsoft.com/en-us/library/ms188929.aspx
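The points above combined in one sketch (table and transaction names are illustrative): a marked transaction, @@ERROR checked after each statement, and a guaranteed commit or rollback.

```sql
BEGIN TRANSACTION UpdateBalances WITH MARK 'Nightly balance update';

UPDATE dbo.Accounts SET balance = balance - 100 WHERE account_id = 1;
IF @@ERROR <> 0
BEGIN
    ROLLBACK TRANSACTION;   -- undo and bail out on error
    RETURN;
END

UPDATE dbo.Accounts SET balance = balance + 100 WHERE account_id = 2;
IF @@ERROR <> 0
BEGIN
    ROLLBACK TRANSACTION;
    RETURN;
END

COMMIT TRANSACTION;         -- both updates succeeded
```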
16. Locks & Deadlocks
Deadlocks and using NOLOCK
A deadlock occurs when two sessions each hold a lock on a
separate object and each is trying to lock the other's resource
These are automatically detected and resolved by SQL
Server, with one of the transactions rolled back with an
error code of 1205
Using NOLOCK may be helpful
NOLOCK reads a record without taking read or write
locks
Advantage?
Pitfall?
Read-committed snapshot isolation is usually better
than NOLOCK
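A short sketch of the two alternatives (table and database names are illustrative). NOLOCK trades consistency for concurrency; read-committed snapshot isolation avoids reader/writer blocking without permitting dirty reads.

```sql
-- NOLOCK: no shared locks taken, so the read can see dirty
-- (uncommitted) data or miss/duplicate rows during page splits.
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK);

-- Read-committed snapshot: readers see the last committed
-- version instead of blocking on writers (set once per database).
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;
```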
17. Temporary Tables
Temp tables
Using # tables vs. @tables
# tables – pros and cons?
@tables – pros and cons?
Which one should be used in a
stored procedure that is called
very frequently?
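A minimal sketch of both forms (column names are illustrative). As a rough rule: # tables live in tempdb with statistics and index support, so they optimize better for large row counts; @table variables have no statistics but avoid some recompilation overhead, which can favor them in very frequently called procedures.

```sql
-- Temporary table (#): supports indexes and statistics,
-- visible to nested stored procedures, must be dropped
-- (or falls out of scope with the session).
CREATE TABLE #Staging (id INT PRIMARY KEY, amount MONEY);
INSERT INTO #Staging (id, amount) VALUES (1, 10.00);
DROP TABLE #Staging;

-- Table variable (@): scoped to the batch, no statistics,
-- so the optimizer assumes very few rows.
DECLARE @Staging TABLE (id INT PRIMARY KEY, amount MONEY);
INSERT INTO @Staging (id, amount) VALUES (1, 10.00);
```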
18. User Defined Functions
Using User-Defined Functions (UDFs)
UDFs are useful for calculations or results based on some input
UDFs are slower than built-in functions
What will happen if we use a UDF in a
Select statement?
Join condition?
Where clause?
Any of the above in a while loop or cursor?
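One common answer, sketched below with hypothetical names (`fn_OrderTotal`, `total_amount`, `dbo.Orders`): a scalar UDF in a WHERE clause executes once per row and hides the predicate from the optimizer, so no index can be used.

```sql
-- Row-by-row evaluation; the predicate is non-sargable.
SELECT order_id
FROM dbo.Orders
WHERE dbo.fn_OrderTotal(order_id) > 100;

-- Filtering on the column directly lets the optimizer
-- seek an index instead.
SELECT order_id
FROM dbo.Orders
WHERE total_amount > 100;
```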
19. Capturing Error
Using @@error (also trap errors at the application level to close the SQL
connection and roll back transactions)
Checking the @@error variable after an insert and/or update helps us
determine whether we need to roll back or commit a transaction
Using Try-catch construct
BEGIN TRY
-- Generate divide-by-zero error.
SELECT 1/0;
END TRY
BEGIN CATCH
-- Execute error retrieval routine.
EXECUTE usp_GetErrorInfo;
END CATCH;
20. Joins vs. EXISTS / IN and NOT EXISTS / NOT IN
Using left joins with NULL conditions instead of NOT EXISTS
Example below
Select z.first_name + ' ' + z.last_name AS [name],
z.[user_id]
From users z
inner join usergroup zug ON z.[user_id] = zug.[user_id]
and (z.first_name + z.last_name is not null)
and z.[user_id] NOT IN (SELECT [user_id] FROM Superuser)
inner join groups zg ON zug.group_id = zg.group_id
Where zg.[name] = 'ADMINISTRATOR'
Order by [name]
The NOT IN above could be replaced with a left join plus an IS NULL check
Joins are often faster than IN and NOT IN, and can make use of indexes
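A sketch of that rewrite, using the names from the query above. One caveat worth knowing: LEFT JOIN / IS NULL and NOT IN behave differently when the subquery column contains NULLs, so the rewrite assumes Superuser.[user_id] is non-nullable.

```sql
-- Rows with no match in Superuser survive the left join with
-- NULLs, so the IS NULL filter keeps exactly the non-superusers.
Select z.first_name + ' ' + z.last_name AS [name], z.[user_id]
From users z
inner join usergroup zug ON z.[user_id] = zug.[user_id]
inner join groups zg ON zug.group_id = zg.group_id
left join Superuser su ON z.[user_id] = su.[user_id]
Where zg.[name] = 'ADMINISTRATOR'
and su.[user_id] is null
Order by [name]
```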
21. Low hanging fruits
Should SELECT * be used? (prefer an explicit column list)
SCOPE_IDENTITY() vs. SELECT MAX(id)?
Use Query Analyzer / Management Studio to view the query plan
Analyze for optimization and potential scalability problems
Analyze any potential bottlenecks / blocking; use SQL Trace
Avoid dynamic queries
Join on primary key and/or indexed columns where possible
Do not use sp_ at the start of a stored procedure name
The first reference is checked in the master database
COUNT(*) vs. COUNT(primary key column)
Needs an index on the ID column
22. Sub-Queries
Sub-Queries
Avoid sub-queries where possible; they can be a resource hog
Excerpt from ClaimBatch_listDetailXml:
(((@searchClaimType = 'ALL' AND @userId IN (SELECT UserId
FROM NSFClaimTrack nct WHERE nct.UserId = @userId
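One common alternative, sketched with illustrative table and column names (`dbo.ClaimBatch`, `claim_id`) around the names from the excerpt: EXISTS lets the engine stop at the first matching row instead of materializing the IN list.

```sql
SELECT claim_id
FROM dbo.ClaimBatch cb
WHERE @searchClaimType = 'ALL'
  AND EXISTS (SELECT 1
              FROM NSFClaimTrack nct
              WHERE nct.UserId = @userId);  -- stops at first match
```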
23. Conclusion
These guidelines will help in
Building a robust product
A standardized way of working for all programmers
Ease of understanding the code
Clear understanding of the logic
Easy maintenance