Database
Performance
Tuning
WITH FOCUS ON SQL SERVER
BY ARNO HUETTER
About the Author
Arno Huetter
Arno wrote his first lines of code on a Sinclair ZX80 in 1984.
Over the years, he has programmed in C/C++, Java and C#, and has also done a fair amount of database development.
Today he is Development Lead at Dynatrace (an APM vendor).
Background (Note: I am not a DBA. I only did some DB development)
• Introduction (1989):
  - Phoenix DB (Atari ST, storage: 3.5" floppy)
• Learning (1992 - 1996):
  - University (80% ER modelling, 20% SQL, 0% DB internals *sighs*), Contract work
  - Oracle 5 (DOS), MS Access, 4th Dimension
• Professional Phase 1 (1997 - 2001, still learning):
  - Internet Banking, Business Banking
  - Oracle 7 (DEC Alpha), Sybase
• Professional Phase 2 (2002 - today, still learning):
  - Hospital Information Systems, Finance/Accounting Software, APM
  - Oracle 8/9 (Linux), SQL Server 2000/2005/2008/2012, Postgres
• Most concepts presented here are vendor-independent, but with "SQL Server flavour"
History
• 1970: Edgar F. Codd (IBM) publishes the paper "A Relational Model of Data for Large Shared Data Banks".
• 1974: Raymond Boyce and Donald Chamberlin (IBM) write "SEQUEL: A Structured English Query Language".
• 1974 - 1977: IBM implements System R, UC Berkeley creates Ingres (later: Postgres), the first two RDBMS.
• 1977: Larry Ellison co-founds Software Development Laboratories, the company that later becomes Oracle. Its approach is based on Codd's IBM papers.
• 1977: Oracle 1 runs on PDP-11, using 128k memory (never officially released).
• 1978: IBM adds SQL to System R. System R eventually morphs into DB2.
• 1979: Oracle releases the first commercially available SQL database.
And Big Data?
Which database systems are in use at your company?
How many rows can you insert per second?
• Specification: SQL Server, row data on local client, 256 bytes per row; choose your own table design, provider and API. Now guess!
• On a highly tuned setup (SSIS, split load / parallelization, special hardware):
  - millions of rows / sec
• On your off-the-shelf notebook (bulk insert, heap table or suitable clustered index):
  - tens of thousands of rows / sec
• Worst case I ever encountered on a production system (thousands of round trips for thousands of rows within one transaction, poor clustered index choice and table design):
  - 15 rows / sec
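The gap between the worst case and the bulk case can be sketched in a few lines. This is only an illustration under assumptions: it uses Python's built-in SQLite module instead of SQL Server, the table names `slow_target` and `fast_target` are made up, and an in-memory database has no network round trips, so the measured difference will be far smaller than on a real client/server setup.

```python
import sqlite3
import time


def make_rows(n):
    # Hypothetical payload of roughly 256 bytes per row, as in the specification above.
    return [(i, "x" * 250) for i in range(n)]


def insert_row_by_row(conn, rows):
    """One statement and one commit per row: the slow pattern."""
    for row in rows:
        conn.execute("INSERT INTO slow_target (id, payload) VALUES (?, ?)", row)
        conn.commit()


def insert_batched(conn, rows):
    """All rows sent as one batch inside a single transaction."""
    with conn:  # commits once at the end, rolls back on error
        conn.executemany("INSERT INTO fast_target (id, payload) VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slow_target (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE fast_target (id INTEGER PRIMARY KEY, payload TEXT)")
rows = make_rows(5000)

start = time.perf_counter()
insert_row_by_row(conn, rows)
slow_seconds = time.perf_counter() - start

start = time.perf_counter()
insert_batched(conn, rows)
fast_seconds = time.perf_counter() - start

slow_count = conn.execute("SELECT COUNT(*) FROM slow_target").fetchone()[0]
fast_count = conn.execute("SELECT COUNT(*) FROM fast_target").fetchone()[0]
print(f"row by row: {slow_seconds:.3f}s, batched: {fast_seconds:.3f}s")
```

Both variants load the same data; only the number of statements and commits differs. On a networked database each of those statements is a round trip, which is where the 15 rows / sec come from.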
Another real-life example
• Problem: a query takes 18 min to execute. The table design is given (no major flaws).
• Original query:
  - Joined every table that appears in the where clause, which led to a cartesian product (lots of duplicates on to-N associations); applied "distinct" to remove the duplicates from the resultset again
  - Datatype conversions (e.g. datetime => varchar) prevented index use
  - Invoked a non-deterministic user-defined function on every row (results can't be cached)
  - Did not take advantage of existing indices (although possible)
• Refactored query:
  - Replaced join duplicates / distinct by subqueries, ensured index seeks, fixed non-deterministic UDFs
• The query now finishes in 200 ms, a 5,400-fold speedup
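The first refactoring step (subqueries instead of join duplicates plus "distinct") might look roughly like this. The schema (`customer`, `orders`, `address`) is invented for illustration and is not the schema of the real-life case; SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    CREATE TABLE address  (id INTEGER PRIMARY KEY, customer_id INTEGER, country TEXT);
    CREATE INDEX idx_orders_customer  ON orders (customer_id);
    CREATE INDEX idx_address_customer ON address (customer_id);

    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders   VALUES (1, 1, 'open'), (2, 1, 'open'), (3, 1, 'open'), (4, 2, 'closed');
    INSERT INTO address  VALUES (1, 1, 'AT'), (2, 1, 'AT'), (3, 2, 'AT');
""")

# Original style: join along two to-N associations. Ann has 3 open orders
# and 2 addresses, so she shows up 3 x 2 times.
joined_rows = conn.execute("""
    SELECT c.name
    FROM customer c
    JOIN orders  o ON o.customer_id = c.id
    JOIN address a ON a.customer_id = c.id
    WHERE o.status = 'open' AND a.country = 'AT'
""").fetchall()

# "distinct" hides the duplicates, but only after they have been produced.
distinct_rows = conn.execute("""
    SELECT DISTINCT c.name
    FROM customer c
    JOIN orders  o ON o.customer_id = c.id
    JOIN address a ON a.customer_id = c.id
    WHERE o.status = 'open' AND a.country = 'AT'
""").fetchall()

# Refactored style: the to-N tables are only used as filters.
exists_rows = conn.execute("""
    SELECT c.name
    FROM customer c
    WHERE EXISTS (SELECT 1 FROM orders o
                  WHERE o.customer_id = c.id AND o.status = 'open')
      AND EXISTS (SELECT 1 FROM address a
                  WHERE a.customer_id = c.id AND a.country = 'AT')
    ORDER BY c.id
""").fetchall()

print(len(joined_rows), distinct_rows, exists_rows)
```

The exists variant returns each customer at most once, so there is nothing left for "distinct" to clean up.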
Slow Queries and Indices
• Are indices the silver bullet? In many (trivial) cases: yes, but they can backfire on write operations.
• Indices speed up data retrieval (no need to scan every row) at the cost of additional writes and storage space. They also provide ordering, and can help to reduce locking.
• Implemented as B-trees (self-balancing, logarithmic access time); nodes usually match the database page size (e.g. 8 KB in SQL Server).
Indices
• Consider creating indices on columns that narrow where clauses or appear in group-by, order-by and join expressions, provided they contain selective data (e.g. there is little point in indexing a "gender" column with two possible values), and on columns used for referential integrity checks.
• Consider creating composite indices for columns queried together. The index column order determines what can be looked up, e.g. phonebook: idx(lastname, firstname) allows seeking by "lastname = ... AND firstname = ..." and by "lastname = ...", but not by "firstname = ...". Multiple single-column indices, in contrast, require multiple separate lookups and merging of the results.
• Make your index unique if that fits your data model. This helps the optimizer to further improve query execution.
• Indices should be kept small. Indexing a large varchar column is probably not a good idea.
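The phonebook rule can be checked against a query planner. The sketch below assumes SQLite rather than SQL Server (its `EXPLAIN QUERY PLAN` output says SEARCH where SQL Server would say seek), and the `phonebook` table, the `idx_name` index and the `plan` helper are hypothetical names.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the access-path descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phonebook (id INTEGER PRIMARY KEY, lastname TEXT, firstname TEXT, phone TEXT)"
)
conn.execute("CREATE INDEX idx_name ON phonebook (lastname, firstname)")

# Both index columns given: the index can be searched.
by_both = plan(
    conn,
    "SELECT * FROM phonebook WHERE lastname = ? AND firstname = ?",
    ("Codd", "Edgar"),
)
# Leading column only: still searchable.
by_last = plan(conn, "SELECT * FROM phonebook WHERE lastname = ?", ("Codd",))
# Trailing column only: the index order does not help, the table is scanned.
by_first = plan(conn, "SELECT * FROM phonebook WHERE firstname = ?", ("Edgar",))

for label, text in (("both", by_both), ("lastname", by_last), ("firstname", by_first)):
    print(f"{label:10s} {text}")
```

The exact wording of the plan text differs between SQLite versions, so the useful signal is only whether a line starts with SEARCH or SCAN and whether it names the index.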
Indices
• Indices have fill factors (used for leaving free space in nodes to avoid frequent node splits), typically between 70% (high insert rate) and 90% (low insert rate). Fill factors are applied on index rebuilds. Index rebuilds must be scheduled by the DBA.
• Each table has zero or one clustered index (by default: the primary key). The clustered index is a B-tree that contains the actual row data in its leaves. A table without a clustered index is called a heap table, where rows are simply appended at the end.
Clustered Index
Indices
• If the query optimizer would have to seek on an index over and over during a query, it may decide to do one index scan instead of many index seeks.
• Index seeks generally cannot be applied on
  - type <> 3 -- negative search
  - lastname like '%...' -- '%' prepended
  - lastname + ' ' + firstname = '...' -- concatenation (an index on a computed column helps)
  - CAST(FLOOR(CAST(date AS FLOAT)) AS DATETIME) > ... -- function / cast
• A non-clustered index contains the clustered index key columns for quick lookup of the actual data in the clustered index. So this is one indirection, except that...
• ... if an index contains all columns the query needs, the clustered index is not required for retrieval.
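Two of these points, a function wrapped around an indexed column and an index that already contains every requested column, can be observed in a query plan. This is a sketch on SQLite, not SQL Server; the `visit` table, the `idx_visited` index and the `plan` helper are assumptions made for the example.

```python
import sqlite3


def plan(conn, sql):
    """Return the access-path descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visit (id INTEGER PRIMARY KEY, lastname TEXT, visited_at TEXT, note TEXT)"
)
conn.execute("CREATE INDEX idx_visited ON visit (visited_at, lastname)")

# Function applied to the indexed column: no search on the index is possible.
wrapped = plan(conn, "SELECT note FROM visit WHERE date(visited_at) = '2024-05-01'")

# Same filter written as a half-open range on the bare column.
ranged = plan(
    conn,
    "SELECT note FROM visit "
    "WHERE visited_at >= '2024-05-01' AND visited_at < '2024-05-02'",
)

# Only indexed columns requested: the index alone answers the query,
# without the extra lookup into the table.
covered = plan(
    conn,
    "SELECT lastname FROM visit "
    "WHERE visited_at >= '2024-05-01' AND visited_at < '2024-05-02'",
)

print(wrapped)
print(ranged)
print(covered)
```

Rewriting the predicate so that the column stands alone on one side of the comparison is usually all it takes to make the index usable again.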
Indices - The Drawback
• Over-indexing is a problem. Indices must be written on inserts, updates and deletes, which can cost dearly.
• The choice of the clustered index is an essential factor for performance, as too many node splits should be prevented, esp. on huge bulk inserts and updates.
• Autoinc values or a growing date are good choices for clustered indices as they only fill up the final leaf. GUIDs are bad as they spread all over the index.
• SQL Server introduced newsequentialid() for creating sequential GUIDs and preventing excessive node splitting.
• Each single-row insert leads to one clustered index insert and N non-clustered index inserts. Only create indices that are absolutely necessary for query performance. Prefer one composite index to multiple single-column indices where applicable.
• Superfast insert approach: insert into a temporary heap table first (no indices, not even a clustered one => rows are always appended at the end), then issue an "insert-into-select" from the heap table into the target table, ordered by the target table's clustered index.
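The staging approach can be sketched as follows. This is an analogy rather than the SQL Server mechanism: in SQLite a table with an INTEGER PRIMARY KEY is stored ordered by that key, which stands in for the clustered index here, and the table names `measurement` and `staging` are made up.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Target table, stored ordered by id (stand-in for a clustered index).
    CREATE TABLE measurement (id INTEGER PRIMARY KEY, value REAL);
    -- Staging table without key or index: incoming rows are simply appended.
    CREATE TEMP TABLE staging (id INTEGER, value REAL);
""")

# Incoming data arrives in random key order.
random.seed(42)
ids = list(range(1, 10001))
random.shuffle(ids)
incoming = [(i, i * 0.5) for i in ids]

with conn:  # one transaction for the whole load
    # Step 1: cheap appends into the staging table.
    conn.executemany("INSERT INTO staging (id, value) VALUES (?, ?)", incoming)
    # Step 2: one set-based statement that feeds the target in key order,
    # so the target's B-tree grows at its end instead of splitting nodes.
    conn.execute(
        "INSERT INTO measurement (id, value) "
        "SELECT id, value FROM staging ORDER BY id"
    )
    conn.execute("DELETE FROM staging")

loaded = conn.execute("SELECT COUNT(*), MIN(id), MAX(id) FROM measurement").fetchone()
print(loaded)
```

The data never leaves the database between the two steps, which is the second half of the trick.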
Query Tuning
• Avoid join duplicates / cartesian products on to-N associations where not required for the resultset. Often joins can be replaced by subqueries, e.g.:
  - where exists (select 1 from ...)
• Prevent the N+1 query problem on to-N associations. It is typically caused by applying O/R mappers the wrong way, but is sometimes even implemented explicitly. Never run a query within a loop.
• Keep queries simple. If a query is overly complicated, chances are its execution is complicated too. Sometimes it is advisable not to pack everything into one single query, but to issue two or three consecutive queries. One way to pass data between queries is by using temp tables.
• Have a look at the execution plan and verify that it looks as expected, e.g. how indices are applied. Hint: an "index scan" is not the same as an "index seek".
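The N+1 problem is easiest to see by counting statements. A minimal sketch, assuming an invented `author` / `book` schema on SQLite and a hypothetical `run` helper that counts every statement sent to the database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book   (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    CREATE INDEX idx_book_author ON book (author_id);
    INSERT INTO author VALUES (1, 'Codd'), (2, 'Chamberlin'), (3, 'Boyce');
    INSERT INTO book   VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it as one round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def books_n_plus_one():
    """1 query for the authors plus 1 query per author: the pattern to avoid."""
    result = {}
    for author_id, name in run("SELECT id, name FROM author ORDER BY id"):
        titles = run(
            "SELECT title FROM book WHERE author_id = ? ORDER BY id", (author_id,)
        )
        result[name] = [title for (title,) in titles]
    return result


def books_single_query():
    """Same result with one joined query, grouped on the client."""
    result = {}
    rows = run("""
        SELECT a.name, b.title
        FROM author a
        LEFT JOIN book b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """)
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors without books still get an empty list
            result[name].append(title)
    return result


query_count = 0
slow = books_n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = books_single_query()
single_queries = query_count

print(n_plus_one_queries, single_queries, slow == fast)
```

With three authors the loop already needs four statements; with thousands of parent rows the difference dominates the response time.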
Execution Plan (Demo)
Query Tuning
• Execution plans are cached per statement. But: for an expression like the following (where the selectivity of a parameter varies heavily), reusing the same plan can kill performance:
  - where (lastname = @lastname or @lastname is null)
• The query optimizer uses table statistics to choose an execution plan. Table statistics contain metadata on column value distribution, etc. Not every column has statistics by default, but indices do. Statistics updates usually happen during index rebuilds, or can be scheduled by the DBA. Make sure table statistics are up to date.
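One common way around the catch-all predicate, offered here as a possible workaround rather than as the approach used in the talk, is to build the statement from the filters that are actually set. Each combination then gets its own statement text and thereby its own cached plan. The `person` table, the `city` filter and the `find_people` function are hypothetical; SQLite stands in for SQL Server.

```python
import sqlite3


def find_people(conn, lastname=None, city=None):
    """Build the WHERE clause only from the filters that are set.

    Values are always passed as bound parameters; only fixed SQL fragments
    are concatenated, so this does not open the door to SQL injection.
    """
    conditions, params = [], []
    if lastname is not None:
        conditions.append("lastname = ?")
        params.append(lastname)
    if city is not None:
        conditions.append("city = ?")
        params.append(city)

    sql = "SELECT id, lastname, city FROM person"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    sql += " ORDER BY id"
    return sql, conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, lastname TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_person_lastname ON person (lastname)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Huber", "Linz"), (2, "Huber", "Wien"), (3, "Maier", "Linz")],
)

filtered_sql, filtered_rows = find_people(conn, lastname="Huber")
all_sql, all_rows = find_people(conn)

print(filtered_sql)
print(all_sql)
```

The price is a few more distinct statements in the plan cache, which is usually acceptable as long as the number of optional filters stays small.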
Transactions, ACID and Locking
• A transaction represents a unit of work performed against a database, treated in a coherent and reliable way independent of other transactions.
• There is always a transaction running. Statements without an explicit transaction are executed within a "single-statement" transaction.
• ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee that database transactions are processed reliably.
• Locks are a means to implement ACID. Different operations require different kinds of locks (simplified: shared (read), update (potential write), exclusive (write)). They are acquired and released depending on the isolation level (serializable, repeatable read, read committed, read uncommitted), and only granted if the current lock state allows for it. Otherwise the execution blocks until the lock can be obtained. Locks are applied on a row, page or table level, and on indices.
Transactions and Lock Tuning
• Keep transactions as short as possible, as this reduces lock contention. Always commit or roll back transactions immediately. Never wait for external input (worst case: waiting for user interaction).
• Ensure that indices are being used. An index seek is less likely to cause blocking (row locks can be bypassed, and index locks see much less contention).
• Statements can provide specific lock hints (e.g. "with nolock", at the price of dirty reads) where the default locking behaviour can be relaxed.
• As far as possible, put queries at the beginning and inserts/updates/deletes at the end. Start with the least congested tables, and end with the most congested ones.
• Deadlock prevention: try to access resources in the same order. DBs can detect deadlocks, and will choose one deadlock victim transaction for rollback.
• The DB keeps a transaction log for rollbacks, handling ungraceful shutdowns and incremental backups. The transaction log should be on a dedicated physical disk (separate from data files), with an optimized setup.
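A short transaction that either commits or rolls back right away, and that touches rows in a fixed order, could look like this. It is a sketch on SQLite, which locks the whole database rather than single rows, so the ordering only illustrates the habit; the `account` table and the `transfer` function are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move an amount between two accounts in one short transaction."""
    # Always update rows in ascending id order, so that two concurrent
    # transfers acquire their locks in the same sequence.
    changes = sorted([(source, -amount), (target, amount)])
    try:
        with conn:  # commit on success, rollback on any exception
            for account_id, delta in changes:
                conn.execute(
                    "UPDATE account SET balance = balance + ? WHERE id = ?",
                    (delta, account_id),
                )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the first update of
        # this transfer has been rolled back together with the second.
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 2, 1, 500)  # would overdraw account 2
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

No user input, logging to remote systems or other slow work happens between the first update and the commit, which keeps the time locks are held to a minimum.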
Transactions and Locking (Demo)
Indexed Views
• Design your schema for normalization, then de-normalize for speed, e.g. for complex join constructs on huge tables and/or a lot of aggregated data.
• Radical? But what if the DB guaranteed data consistency on such de-normalized tables?
• That functionality actually exists: Indexed Views (Materialized Views) to the rescue!
• By creating a unique clustered index on a view, the view gets "materialized", i.e. its flat data is stored redundantly in the DB. One can then add more indices to the view.
• Modifications made to base tables trigger modifications in the indexed view. This leads to a similar drawback as with indices: indexed views are fast for queries, but come at a performance penalty for write operations, and require additional storage space. Hint: in the indexed view, index the base tables' primary key columns for quick lookup on updates and deletes.
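The trade-off can be imitated on a database that has no indexed views. The sketch below is explicitly not the SQL Server feature: it uses SQLite triggers to keep a redundant summary table in sync, which only mimics what an indexed view does automatically. The tables `order_line` and `product_totals` and the trigger names are invented, and updates of existing rows are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (
        id         INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL,
        quantity   INTEGER NOT NULL
    );

    -- Redundant, pre-aggregated copy of the data (the "materialized" part).
    CREATE TABLE product_totals (
        product_id     INTEGER PRIMARY KEY,
        total_quantity INTEGER NOT NULL
    );

    -- Every write to the base table also writes to the summary table.
    CREATE TRIGGER order_line_insert AFTER INSERT ON order_line
    BEGIN
        INSERT OR IGNORE INTO product_totals (product_id, total_quantity)
        VALUES (NEW.product_id, 0);
        UPDATE product_totals
        SET total_quantity = total_quantity + NEW.quantity
        WHERE product_id = NEW.product_id;
    END;

    CREATE TRIGGER order_line_delete AFTER DELETE ON order_line
    BEGIN
        UPDATE product_totals
        SET total_quantity = total_quantity - OLD.quantity
        WHERE product_id = OLD.product_id;
    END;
""")

with conn:
    conn.executemany(
        "INSERT INTO order_line (id, product_id, quantity) VALUES (?, ?, ?)",
        [(1, 10, 5), (2, 10, 3), (3, 20, 7)],
    )
    conn.execute("DELETE FROM order_line WHERE id = 2")

# Cheap read from the redundant table ...
materialized = conn.execute(
    "SELECT product_id, total_quantity FROM product_totals ORDER BY product_id"
).fetchall()
# ... versus aggregating the base table on every query.
computed = conn.execute(
    "SELECT product_id, SUM(quantity) FROM order_line "
    "GROUP BY product_id ORDER BY product_id"
).fetchall()
print(materialized, computed)
```

Reads get a primary-key lookup instead of an aggregation, while every insert and delete pays for two extra statements, which is the same bargain an indexed view offers.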
Table Partitioning
• Data is divided into units that can be spread across multiple nodes / filegroups / disks. This allows more parallel processing and improves I/O performance.
• The partitioned table is treated as a single logical entity when queries or updates are performed.
• A common approach is to use an autoinc primary key or a growing date column as the partitioning criterion. This often means that read and write operations occur on different data ranges, hence on different partitions.
• Maintenance operations like index rebuilds or purging old data are also faster when run on a per-partition basis.
• Partitioning only makes sense for really large tables with steady data growth, and where the queries are of a kind that benefits from partitioning.
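How a growing date column maps rows to partitions can be sketched without a database at all. The boundary dates and both function names below are hypothetical, the numbering is zero-based, and a boundary value is assigned to the partition on its right.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical yearly boundaries: three boundaries give four partitions.
BOUNDARIES = [date(2023, 1, 1), date(2024, 1, 1), date(2025, 1, 1)]


def partition_index(value, boundaries=BOUNDARIES):
    """Return the zero-based partition a value falls into."""
    # bisect_right puts a value equal to a boundary to the right of it.
    return bisect_right(boundaries, value)


def partitions_for_range(start, end, boundaries=BOUNDARIES):
    """Partitions a query on the closed range [start, end] has to touch."""
    return list(range(partition_index(start, boundaries),
                      partition_index(end, boundaries) + 1))


# New rows carry recent dates and therefore land in the last partition ...
write_partition = partition_index(date(2025, 3, 1))
# ... while a report on early 2024 only needs to read one older partition.
read_partitions = partitions_for_range(date(2024, 2, 1), date(2024, 3, 1))

print(write_partition, read_partitions)
```

Reads and writes end up on different partitions, and purging a whole year becomes an operation on one partition instead of a delete over the full table.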
Table Partitioning
More Tuning
• Use bulk / batch SQL statements in order to avoid unnecessary server round trips.
• Prefer to move data within the database (e.g. temp tables, insert-into-select) instead of back and forth from the client.
• Implement and invoke stored procedures (sometimes questionable from a design perspective).
• Use Activity Monitor, Profiler, Tuning Advisor, dynamic management views / dynamic performance views and other monitoring tools.
• Put data files, tempdb files and transaction logs on separate physical disks; if necessary, do the same even for single heavily used tables.
• Historically, most RDBMSs provided clustering mainly for failover via mirroring / data replication. Several cluster solutions have since been extended to improve scalability as well, e.g. Oracle RAC. On these scaling cluster systems, nodes still share the same storage (node sync requires a fast cluster interconnect).
O/R-Mappers: Hibernate Tuning
• Avoid join duplicates (AKA cartesian products) caused by joins along two or more parallel to-many associations; use exists-subqueries, multiple queries or fetch="subselect" instead, whichever is most appropriate in the specific situation. Join duplicates are already pretty bad in plain SQL, but things get even worse when they occur within Hibernate, because of the unnecessary mapping workload and child collections containing duplicates.
• Define lazy loading as the default association loading strategy, and consider applying fetch="subselect" rather than "select" or "batch-size". Configure eager loading only for special associations, and join-fetch selectively on a per-query basis.
• In case of read-only services with huge query resultsets, use projections and fetch into flat DTOs (e.g. via AliasToBean-ResultTransformer), instead of loading thousands of mapped objects into the Session.
O/R-Mappers: Hibernate Tuning
• Set ReadOnly to "true" on Queries and Criteria when objects will never be modified.
• Consider clearing the whole Session after flushing, or evict on a per-object basis, once objects are no longer needed.
• Define a suitable value for jdbc.batch_size (or adonet.batch_size).
• Use the Hibernate Query Cache and Second Level Caching where appropriate (but make sure you are aware of the consequences).
• Set hibernate.show_sql to "false" and ensure that Hibernate logging is running at the lowest possible loglevel (also check the log4j/log4net root logger configuration).
Tools: SQL Server Activity Monitor (Demo)
Tools: SQL Server Profiler (Demo)
Tools: SQL Server Tuning Advisor
Hardware
• Rules of thumb for server hardware are difficult; it depends heavily on how much "hot data" is moved around, and on the query load. Do your math and plan, measure KPIs (e.g. via SQL Server Perfcounters) and adjust accordingly.
• RAM: it's cheap, get as much as you can. I/O is often a bottleneck, e.g. misconfigured SANs can kill performance. Use HW RAID. CPU: Enterprise editions can use as many cores as the OS supports.
• Let's have a look at a real-life example - stackoverflow.com:
  - SQL Server failover cluster, 2 nodes (plus one identical setup at another data center for even more redundancy)
  - Dell R730xd server
  - 768GB RAM (the complete data can be held in memory)
  - 6TB PCIe SSD
  - 16 cores
Thank you!
Twitter: https://twitter.com/ArnoHu
Blog: http://arnosoftwaredev.blogspot.com

W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfVishalKumarJha10
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 

Kürzlich hochgeladen (20)

8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 

Database Performance Tuning

  • 1. Database Performance Tuning WITH FOCUS ON SQL SERVER BY ARNO HUETTER
  • 2. About the Author: Arno Huetter wrote his first lines of code on a Sinclair ZX80 in 1984. Over the years, he has programmed in C/C++, Java and C#, and has also done a fair amount of database development. Today he is Development Lead at Dynatrace (APM vendor).
  • 4. Background (Note: I am not a DBA. I only did some DB development)
      • Introduction (1989):
          • Phoenix DB (Atari ST, storage: 3.5” floppy)
      • Learning (1992 - 1996):
          • University (80% ER modelling, 20% SQL, 0% DB internals *sighs*), contract work
          • Oracle 5 (DOS), MS Access, 4th Dimension
      • Professional Phase 1 (1997 - 2001, still learning):
          • Internet Banking, Business Banking
          • Oracle 7 (DEC Alpha), Sybase
      • Professional Phase 2 (2002 - today, still learning):
          • Hospital Information Systems, Finance/Accounting Software, APM
          • Oracle 8/9 (Linux), SQL Server 2000/2005/2008/2012, Postgres
      • Most concepts presented here are vendor-independent, but with "SQL Server flavour"
  • 5. History
      • 1970: Edgar F. Codd (IBM) publishes the paper "A Relational Model of Data for Large Shared Data Banks".
      • 1974: Raymond Boyce and Donald Chamberlin (IBM) write "SEQUEL: A Structured English Query Language".
      • 1974 - 1977: IBM implements System/R, UC Berkeley creates Ingres (later: Postgres), the first two RDBMS.
      • 1976: Larry Ellison founds Oracle. Oracle's approach is based on Codd's IBM papers.
      • 1977: Oracle 1 runs on PDP-11, using 128k memory (never officially released).
      • 1978: IBM adds SQL to System/R. System/R eventually morphs into DB2.
      • 1979: Oracle releases the first commercially available SQL database.
  • 6. And Big Data? Which database systems are in use at your company?
  • 7. How many rows can you insert per sec?
      • Specification: SQL Server, row data on local client, 256 bytes per row, choose your table design, provider, API. Now guess!
      • On a highly-tuned setup (SSIS, split load / parallelization, special hw):
          • 1,000,000s of rows / sec
      • On your off-the-shelf notebook (bulk insert, heap table or suitable clustered index):
          • 10,000s of rows / sec
      • Worst case I ever encountered on a production system (thousands of roundtrips for thousands of rows within one transaction, poor clustered index choice and table design):
          • 15 rows / sec
  • 8. Another real-life example
      • Problem: Query takes 18 min to execute. Table design given (no major flaws)
      • Original query:
          • Joined every table that appears in the where clause, which led to a cartesian product (lots of duplicates on to-N associations); applied "distinct" to get rid of the duplicates again in the resultset
          • Datatype conversion (e.g. datetime => varchar) prevented index usage
          • Invoked a non-deterministic user defined function on every row (results can't be cached)
          • Did not take advantage of existing indices (although possible)
      • Refactored query:
          • Replaced join duplicates / distinct by subqueries, ensured index seeks, fixed non-deterministic UDFs
      • Query now finishes in 200 ms, speedup 5,400-fold
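The "join plus distinct" versus "subquery" refactoring can be sketched on a toy schema. This is a minimal illustration only: it uses SQLite through Python's `sqlite3` module as a stand-in for SQL Server, and the `customer` / `orders` tables are hypothetical, not the tables from the real-life case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customer(id),
                         status TEXT);
    CREATE INDEX idx_orders_customer ON orders (customer_id, status);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 'open'), (2, 1, 'open'),
                              (3, 2, 'closed'), (4, 1, 'open');
""")

# Join along the to-N association: one result row per matching order,
# so the same customer shows up several times.
joined = conn.execute("""
    SELECT c.name FROM customer c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.status = 'open'
""").fetchall()

# EXISTS subquery: at most one row per customer, no DISTINCT required.
exists = conn.execute("""
    SELECT c.name FROM customer c
    WHERE EXISTS (SELECT 1 FROM orders o
                  WHERE o.customer_id = c.id AND o.status = 'open')
""").fetchall()

print(len(joined), "joined rows vs.", len(exists), "row with EXISTS")
```

The join variant only returns the right set after a "distinct" pass over the duplicates; the EXISTS variant never produces them in the first place.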
  • 9. Slow Queries and Indices
      • Are indices the silver bullet? In many (trivial) cases: yes, but they can backfire on write operations.
      • Indices speed up data retrieval (no need to scan every row) at the cost of additional writes and storage space. They also provide ordering, and can help to prevent locking.
      • Implemented as B-trees (self-balancing, logarithmic access time); nodes usually match the database I/O page size (e.g. 8 KB in SQL Server)
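To put "logarithmic access time" into numbers, the following back-of-the-envelope sketch counts how many B-tree levels are needed for a given row count. The fanout of 400 keys per node is an assumption chosen for illustration; the real value depends on key size and page size.

```python
def btree_levels(rows: int, fanout: int) -> int:
    """Smallest number of levels whose capacity (fanout ** levels) covers all rows."""
    levels, capacity = 1, fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels


# Hypothetical fanout of 400 keys per node.
for rows in (10_000, 1_000_000, 1_000_000_000):
    print(f"{rows:>13,} rows -> {btree_levels(rows, fanout=400)} levels")
```

Under this assumption even a billion rows are reachable within a handful of page reads, whereas a scan would have to touch every page.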
  • 10. Indices
      • Consider creating indices on columns that are used for narrowing where clauses or applied in group-by, order-by and join expressions, that contain selective data (e.g. there is no point in indexing a "gender" column with two possible values), or that are used for referential integrity checks.
      • Consider creating composite indices for columns queried together. The index column order determines what can be looked up, e.g. phonebook: idx(lastname, firstname) will allow seeking by "lastname = ... AND firstname = ...", by "lastname = ...", but not by "firstname = ...". Multiple single-column indices in contrast require multiple separate lookups and merging the results.
      • Make your index unique if that fits your data model. This helps to further optimize query execution.
      • Indices should be kept small. Indexing a larger varchar column is probably not a good idea.
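The phonebook rule for composite indices can be checked against a query planner. The sketch below uses SQLite via Python's `sqlite3` as a stand-in (an assumption; SQL Server's plans look different, but the column-order rule is the same). The `person` table and index name are hypothetical. In SQLite's plan output, "SEARCH" corresponds to a seek and "SCAN" to a full scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, lastname TEXT,
                         firstname TEXT, phone TEXT);
    CREATE INDEX idx_person_name ON person (lastname, firstname);
""")


def plan(sql: str) -> str:
    """Return the plan detail text SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


both = plan("SELECT phone FROM person WHERE lastname = 'Doe' AND firstname = 'Jo'")
last_only = plan("SELECT phone FROM person WHERE lastname = 'Doe'")
first_only = plan("SELECT phone FROM person WHERE firstname = 'Jo'")

print("lastname + firstname:", both)        # seek on the composite index
print("lastname only:       ", last_only)   # seek on the leading column
print("firstname only:      ", first_only)  # no seek possible, full scan
```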
  • 11. Indices
      • Indices have fill factors (used for leaving space in nodes to avoid frequent node splits), typically between 70% (high insert rate) and 90% (low insert rate). Fill factors are applied on index rebuilds. Index rebuilds must be scheduled by the DBA.
      • Each table has zero or one clustered index (by default: the primary key). The clustered index is a B-tree that contains the actual row data in its leaves. If there is no clustered index, the table is a heap table where rows are simply appended at the end.
  • 13. Indices
      • If the query optimizer would have to seek on an index over and over during a query, it may decide to do one index scan instead of many index seeks.
      • Index seeks cannot be applied on
          • type <> 3 -- negative search
          • lastname like '%...' -- '%' prepended
          • lastname + ' ' + firstname = '...' -- concatenation (an index on a computed column helps)
          • CAST(FLOOR(CAST(date AS FLOAT)) AS DATETIME) > ... -- function / cast
      • A non-clustered index contains the clustered index key columns for quick lookup of the actual data in the clustered index. So this is one indirection, except that...
      • ... if an index contains all columns the query needs (a covering index), the clustered index is not required for retrieval.
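Two of the points above, an expression on the indexed columns defeating the seek and a covering index avoiding the extra lookup, can be observed in a query plan. This is a sketch using SQLite via Python's `sqlite3` as a stand-in for SQL Server (an assumption), with a hypothetical `person` table; SQLite concatenates with `||` instead of `+`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, lastname TEXT,
                         firstname TEXT, city TEXT);
    CREATE INDEX idx_person_lastname ON person (lastname, firstname);
""")


def plan(sql: str) -> str:
    """Return the plan detail text SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Plain comparison on the leading index column: index seek plus row lookup.
seek = plan("SELECT city FROM person WHERE lastname = 'Doe'")

# Concatenation wraps the indexed columns in an expression: full scan.
concat = plan("SELECT city FROM person WHERE lastname || ' ' || firstname = 'Doe Jo'")

# All requested columns are part of the index: no lookup in the table needed.
covering = plan("SELECT firstname FROM person WHERE lastname = 'Doe'")

for label, text in (("seek", seek), ("concat", concat), ("covering", covering)):
    print(f"{label:9}", text)
```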
  • 14. Indices - The Drawback
      • Over-indexing is a problem. Indices must be written on inserts, updates and deletes, and this can cost dearly.
      • The choice of the clustered index is an essential factor for performance, as too many node splits should be prevented, esp. on huge bulk inserts and updates.
      • Autoinc values or a growing date are good choices for clustered indices as they only fill up the final leaf. Guids are bad as they spread all over the index.
      • SQL Server introduced newsequentialid() for creating sequential Guids and preventing excessive node splitting.
      • Each single row insert leads to a clustered index insert and N non-clustered index inserts. Only create indices that are absolutely necessary for query performance. Prefer one composite index to multiple single-column indices where applicable.
      • Superfast insert approach: Insert into a temporary heap table first (no indices, not even clustered => always appended at the end), then issue an "insert-into-select" from the heap table into the target table, ordering by the target table's clustered index.
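The staging-table approach can be sketched as follows. This is only an outline of the data flow, using SQLite via Python's `sqlite3` as a stand-in (an assumption): a `WITHOUT ROWID` table, which SQLite stores in primary-key order, serves as a rough analogue of a clustered index, and the table names are hypothetical. It does not reproduce SQL Server's page-split behaviour or timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Target table: stored in primary-key order (analogue of a clustered index).
    CREATE TABLE measurement (ts INTEGER PRIMARY KEY, value REAL) WITHOUT ROWID;
    -- Staging table: no indices at all, rows are simply appended.
    CREATE TEMP TABLE staging (ts INTEGER, value REAL);
""")

incoming = [(ts, ts * 0.5) for ts in (5, 1, 4, 2, 3)]  # arrives unordered

with conn:  # one transaction for the whole load
    conn.executemany("INSERT INTO staging VALUES (?, ?)", incoming)
    # Feed the target in key order so new rows land at the end of the tree.
    conn.execute("""
        INSERT INTO measurement (ts, value)
        SELECT ts, value FROM staging ORDER BY ts
    """)
    conn.execute("DELETE FROM staging")

loaded = conn.execute("SELECT ts FROM measurement ORDER BY ts").fetchall()
staged = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
print(loaded, staged)
```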
  • 15. Query Tuning
      • Avoid join duplicates / cartesian products on to-N associations where not required for the resultset. Often joins can be replaced by subqueries, e.g.:
          • where exists (select 1 from ...)
      • Prevent the N+1 query problem on to-N associations. Typically caused by applying OR-mappers the wrong way, but sometimes even implemented explicitly. Never run a query within a loop.
      • Keep queries simple. If a query is overly complicated, chances are its execution is complicated too. Sometimes it's advisable not to pack everything into one single query, but to issue two or three consecutive queries. One way to pass data between queries is by using temp tables.
      • Have a look at the execution plan and verify it looks as expected, e.g. how indices are applied. Hint: an "index scan" is not the same as an "index seek".
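The N+1 query problem is easy to make visible by counting statements. The sketch below uses SQLite via Python's `sqlite3` as a stand-in for SQL Server (an assumption) and a hypothetical `author` / `book` schema; `set_trace_callback` records every statement the connection executes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    CREATE INDEX idx_book_author ON book (author_id);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cat');
    INSERT INTO book VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")

statements = []
conn.set_trace_callback(statements.append)  # records every executed statement

# N+1: one query for the parents, then one more query per parent in a loop.
n_plus_one = {}
for author_id, name in conn.execute("SELECT id, name FROM author").fetchall():
    rows = conn.execute(
        "SELECT title FROM book WHERE author_id = ? ORDER BY id", (author_id,))
    n_plus_one[name] = [title for (title,) in rows]
n_plus_one_count = len(statements)

# Alternative: a single outer join, grouped on the client side.
statements.clear()
single = {}
for name, title in conn.execute("""
        SELECT a.name, b.title FROM author a
        LEFT JOIN book b ON b.author_id = a.id
        ORDER BY a.id, b.id"""):
    single.setdefault(name, [])
    if title is not None:
        single[name].append(title)
single_count = len(statements)

print(n_plus_one_count, "statements in the loop variant,", single_count, "with the join")
```

Both variants build the same result; against a client/server database each extra statement in the loop variant is an additional network roundtrip.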
  • 17. Query Tuning
      • Execution plans are cached per statement. But: On an expression like this (where the selectivity of a parameter varies heavily), reusing the same plan can kill performance:
          • where (lastname = @lastname or @lastname is null)
      • The query optimizer uses table statistics to choose an execution plan. Table statistics contain metadata on column value distribution, etc. Not every column has statistics by default, but indices do. Statistics updates usually happen during index rebuilds, or can be scheduled by the DBA. Make sure table statistics are up to date.
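One common workaround for the "optional parameter" predicate, which is not taken from the slides, is to build the statement text on the client so that it only contains predicates for parameters that were actually supplied. Each distinct statement text can then get its own plan. A minimal sketch, with a hypothetical `person` table and `?` placeholders standing in for the driver's parameter syntax:

```python
def build_person_query(lastname=None, firstname=None):
    """Build a query containing only the predicates that are actually supplied."""
    clauses, params = [], []
    if lastname is not None:
        clauses.append("lastname = ?")
        params.append(lastname)
    if firstname is not None:
        clauses.append("firstname = ?")
        params.append(firstname)
    sql = "SELECT id, lastname, firstname FROM person"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# Values stay bound parameters; only the statement's shape varies.
print(build_person_query())
print(build_person_query(lastname="Doe"))
print(build_person_query(lastname="Doe", firstname="Jo"))
```

The number of statement shapes grows with the number of optional parameters, so this fits best when there are only a few of them.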
  • 18. Transactions, ACID and Locking
      • A transaction represents a unit of work performed against a database, treated in a coherent and reliable way independent of other transactions.
      • There is always a transaction running. Statements without an explicit transaction are executed within a "single-statement" transaction.
      • ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee that database transactions are processed reliably.
      • Locks are a means to implement ACID. Different operations require different kinds of locks (simplified: shared (read), update (potential write), exclusive (write)). They are acquired and released depending on the isolation level (serializable, repeatable read, read committed, read uncommitted), and only granted if the current lock state allows for it. Otherwise the execution blocks until the lock can be obtained. Locks are applied on a row-, page- or table-level, and on indices.
  • 19. Transactions and Lock Tuning
      • Keep transactions as short as possible, as this reduces lock contention. Always commit or roll back transactions immediately. Never wait for external input (worst case: waiting for user interaction).
      • Ensure that indices are being used. An index seek is more likely to prevent locking (row locks can be bypassed, and index locks have much less contention).
      • Statements can provide specific lock hints (e.g. "with (nolock)") where the default locking behaviour can safely be relaxed.
      • As far as possible, put queries at the beginning and inserts/updates/deletes at the end. Start with the least congested tables, and end with the most congested ones.
      • Deadlock prevention: Try to access resources in the same order. DBs can detect deadlocks, and will choose one deadlock victim transaction for rollback.
      • The DB keeps a transaction log for rollbacks, handling ungraceful shutdowns and incremental backups. The transaction log should be on a dedicated physical disk (separate from data files), with an optimized setup.
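"Keep transactions short and always commit or roll back immediately" can be sketched with a scoped transaction block. The example uses SQLite via Python's `sqlite3` as a stand-in for SQL Server (an assumption) and a hypothetical `account` table; all inputs are known before the transaction starts, and the block either commits or rolls back as soon as the two updates are done.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one short transaction; no waiting for input inside it."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 60)      # succeeds and is committed
failed = transfer(conn, 1, 2, 60)  # would overdraw account 1, so it is rolled back
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

The failed transfer leaves no partial update behind: the credit to account 2 is undone together with the rejected debit.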
  • 21. Indexed Views
      • Design your schema for normalization, then de-normalize for speed, e.g. for complex join constructs on huge tables and/or a lot of aggregated data.
      • Radical? But what if the DB could guarantee data consistency on such de-normalized tables?
      • Actually that functionality exists: Indexed Views (Materialized Views) to the rescue!
      • By creating a unique clustered index on a view, the view gets "materialized", having its flat data redundantly stored in the DB. One can then add more indices to the view.
      • Modifications made to base tables trigger modifications in the indexed view. This leads to a similar drawback as with indices: Indexed views are fast for queries, but come at a performance penalty for write operations, and require additional storage space. Hint: In the indexed view, put an index on the base tables' primary key columns for quick lookup on updates and deletes.
  • 22. Table Partitioning
      • Data is divided into units that can be spread across multiple nodes / filegroups / disks. This allows more parallel processing and improves I/O performance.
      • The partitioned table is treated as a single logical entity when queries or updates are performed.
      • A common approach is to use an autoinc primary key or a growing date column as the partition criterion. This often helps to have read and write operations occur on different data ranges, hence different partitions.
      • Maintenance operations like index rebuilds or purging old data are also faster when running on a per-partition basis.
      • Only makes sense for really large tables with certain data growth, and where queries are of a kind to benefit from partitioning.
  • 24. More Tuning
      • Use bulk / batch SQL statements in order to avoid unnecessary server roundtrips.
      • Prefer to move data within the database (e.g. temp tables, insert-into-select) instead of back and forth from the client.
      • Implement and invoke stored procedures (sometimes questionable from a design perspective).
      • Use Activity Monitor, Profiler, Tuning Advisor, dynamic management views / dynamic performance views and other monitoring tools.
      • Put data files, tempdb files and transaction logs on separate physical disks, if necessary even single heavily-used tables.
      • Historically most RDBMSs provided clustering mainly for failover via mirroring / data replication. Several cluster solutions have since been extended to improve scalability as well, e.g. Oracle RAC. On these scaling cluster systems, nodes still share the same storage (node sync requires a fast cluster interconnect).
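The first two points, batching statements and moving data inside the database, can be sketched as follows. The example uses SQLite via Python's `sqlite3` as a stand-in (an assumption) with hypothetical `event` tables. SQLite runs in-process, so there are no real roundtrips here; against a client/server database the batched form avoids paying one roundtrip per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE event_archive (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(i, f"payload-{i}") for i in range(10_000)]

# Batch: one prepared statement, many parameter sets, one transaction.
with conn:
    conn.executemany("INSERT INTO event (id, payload) VALUES (?, ?)", rows)

# Move data inside the database instead of fetching it to the client
# and sending it back row by row.
with conn:
    conn.execute("""
        INSERT INTO event_archive (id, payload)
        SELECT id, payload FROM event WHERE id < 5000
    """)

total = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM event_archive").fetchone()[0]
print(total, archived)
```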
  • 25. O/R-Mappers: Hibernate Tuning
      • Avoid join duplicates (AKA cartesian products) due to joins along two or more parallel to-many associations; use Exists-subqueries, multiple queries or fetch="subselect" instead, whatever is most appropriate in the specific situation. Join duplicates are already pretty bad in plain SQL, but things get even worse when they occur within Hibernate, because of unnecessary mapping workload and child collections containing duplicates.
      • Define lazy loading as the default association loading strategy, and consider applying fetch="subselect" rather than "select" or "batch-size". Configure eager loading only for special associations, but join-fetch selectively on a per-query basis.
      • In case of read-only services with huge query resultsets, use projections and fetch into flat DTOs (e.g. via AliasToBean-ResultTransformer), instead of loading thousands of mapped objects into the Session.
  • 26. O/R-Mappers: Hibernate Tuning
      • Set ReadOnly to "true" on Queries and Criteria when objects will never be modified.
      • Consider clearing the whole Session after flushing, or evicting on a per-object basis, once objects are no longer needed.
      • Define a suitable value for jdbc.batch_size (or adonet.batch_size, respectively).
      • Use Hibernate Query-Cache and Second Level Caching where appropriate (but make sure you are aware of the consequences).
      • Set hibernate.show_sql to "false" and ensure that Hibernate logging is running at the lowest possible loglevel (also check log4j/log4net root logger configuration).
  • 27. Tools: SQL Server Activity Monitor (Demo)
  • 28. Tools: SQL Server Profiler (Demo)
  • 29. Tools: SQL Server Tuning Advisor
  • 30. Hardware
      • Rules of thumb for server hardware are difficult; it depends heavily on how much "hot data" is moved around, and on query load. Do your math and plan, measure KPIs (e.g. via SQL Server Perfcounters) and adjust accordingly.
      • RAM: it's cheap, get as much as you can. I/O is often a bottleneck, e.g. misconfigured SANs can kill performance. Use HW RAID. CPU: Enterprise editions can take advantage of as many cores as the OS supports.
      • Let's have a look at a real-life example - stackoverflow.com:
          • SQL Server failover cluster, 2 nodes (plus one identical setup at another data center for even more redundancy)
          • Dell R730xd server
          • 768GB RAM (the complete data can be held in memory)
          • 6TB PCIe SSD
          • 16 cores
  • 31. Thank you! Twitter: https://twitter.com/ArnoHu Blog: http://arnosoftwaredev.blogspot.com