http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
Takeaway #1: Pandemic Scale
It affects you!
(Symbolic image; not real data)
http://upload.wikimedia.org/wikipedia/commons/c...
Takeaway #2: Caused by Success
Copyright © 2013 Telerik, Inc. All rights reserved
Takeaway #3: It’s Not Your Fault
http://simpsonswiki.com/wiki/File:I_Didn%27t_Do_It!_Volume_III.png
© 2017 by Markus Winand
The Problem
Index/Query Mismatch
The Problem: Index/Query Mismatch
“A very common cause of performance
problems is lack of proper indexes or the
use of que...
The Problem: Index/Query Mismatch
“A very common cause of performance
problems is lack of proper indexes or the
use of que...
Quantifying the Problem
Quantifying the Problem
Percona White Paper:
Reasons of performance problems

that caused production downtime:
38% bad SQL...
Quantifying the Problem
Percona White Paper:
Reasons of performance problems

that caused production downtime:
38% bad SQL...
Quantifying the Problem
Percona White Paper:
Reasons of performance problems

that caused production downtime:
38% bad SQL...
Quantifying the Problem
Survey by sqlskills.com:
Root causes of the last few SQL

Server performance problems:
27% T-SQL
1...
Quantifying the Problem
Survey by sqlskills.com:
Root causes of the last few SQL

Server performance problems:
27% T-SQL
1...
Quantifying the Problem
Survey by sqlskills.com:
Root causes of the last few SQL

Server performance problems:
27% T-SQL
1...
Quantifying the Problem
Craig S. Mullins (strategist and researcher):
„As much as 75% of poor relational performance

is c...
Quantifying the Problem
Craig S. Mullins (strategist and researcher):
„As much as 75% of poor relational performance

is c...
Quantifying the Problem
My observation:
Quantifying the Problem
My observation:
~50% of SQL performance problems

are caused by improper index use
© 2017 by Markus Winand
The Root Cause
© 2017 by Markus Winand
The Root Cause
Indexing is a afterthought…
© 2017 by Markus Winand
The Root Cause
Indexing is a afterthought…
...often done by the wrong people
The Root Cause: DBAs are Indexing
How did databases

work before SQL?
The Root Cause: DBAs are Indexing
Index use was intrinsically

tied to the queries.
The Root Cause: DBAs are Indexing
Example: dBase
The Root Cause: DBAs are Indexing
Example: dBase
Developers had to...
...use indexes explicitly when searching:

set	index...
The Root Cause: DBAs are Indexing
Example: dBase
Developers had to...
...use indexes explicitly when searching:

set	index...
The Root Cause: DBAs are Indexing
SQL is an abstraction that only
defines the logical view.
The actual SQL implementation
t...
The Root Cause: DBAs are Indexing
Transaction
Constraint
Views
Tables
Data
manipulation
Queries
SQL (language)
has:
SQL Da...
The Root Cause: DBAs are Indexing
Backup
& recovery
Storage
management
Bugs &
patches
Tuning
parameters
Transaction
Constr...
The Root Cause: DBAs are Indexing
Indexes
Backup
& recovery
Storage
management
Bugs &
patches
Tuning
parameters
Transactio...
The Root Cause: DBAs are Indexing
Indexes
Backup
& recovery
Storage
management
Bugs &
patches
Tuning
parameters
Transactio...
The Root Cause: DBAs are Indexing
Today, indexing is considered a
tuning task that belongs to the
administrators responsib...
The Root Cause: DBAs are Indexing
A misconception that causes new problems:
The Root Cause: DBAs are Indexing
A misconception that causes new problems:
DBAs don’t know

the queries
Have to “investig...
The Root Cause: DBAs are Indexing
A misconception that causes new problems:
DBAs don’t know

the queries
Have to “investig...
The Root Cause: DBAs are Indexing
A misconception that causes new problems:
DBAs don’t know

the queries
Have to “investig...
© 2017 by Markus Winand
The Solution
© 2017 by Markus Winand
The Solution
Indexing is a

Development Task
The Solution: It’s a Dev Task
Indexes
Backup
& recovery
Storage
management
Tuning
parameters
Transaction
Constraint
Views
...
The Solution: It’s a Dev Task
Indexes
Backup
& recovery
Storage
management
Tuning
parameters
Transaction
Constraint
Views
...
Another Problem: It’s not Taught
Indexes are not part of the pure SQL (language) literature
because indexes are not part o...
Another Problem: It’s not Taught
Proper index usage is sometimes covered in database
tuning books but is always buried bet...
Another Problem: It’s not Taught
Consequence:

Developers don’t know how to use
indexes properly.
Results of the 3-minute ...
Q1: Good or Bad? (Function use)
CREATE	INDEX	tbl_idx	ON	tbl	(date_column);

SELECT	text,	date_column

		FROM	tbl

	WHERE	T...
Q1: Good or Bad? (Function use)
CREATE	INDEX	tbl_idx	ON	tbl	(date_column);

SELECT	text,	date_column

		FROM	tbl

	WHERE	T...
3-Minute Quiz: Indexing Skills
...WHERE	TO_CHAR(date_column,	'YYYY')	=	'2017';	
Seq	Scan	on	tbl	(rows=365)	
			Rows	Remove...
3-Minute Quiz: Indexing Skills
...WHERE	TO_CHAR(date_column,	'YYYY')	=	'2017';	
Seq	Scan	on	tbl	(rows=365)	
			Rows	Remove...
3-Minute Quiz: Q1 — Results
3-Minute Quiz: Q1 — Results
67.5% 100%
n=24943
0%
3-Minute Quiz: Q1 — Results
50/50 chance
67.5% 100%
n=24943
0%
3-Minute Quiz: Q1 — Results
MySQL
Oracle Database
PostgreSQL
SQL Server
-12%
+12%
+16%
-0.5%
n=9144
n=4466
n=3283
n=8050
4...
3-Minute Quiz: Indexing Skills
Q2: Good or Bad? (Indexed Top-N, no IOS)
CREATE	INDEX	tbl_idx	ON	tbl	(a,	date_col);

SELECT...
3-Minute Quiz: Indexing Skills
Q2: Good or Bad? (Indexed Top-N, no IOS)
CREATE	INDEX	tbl_idx	ON	tbl	(a,	date_col);

SELECT...
3-Minute Quiz: Indexing Skills
It is already the most optimal solution

(not considering index-only scan).
	Limit	(rows=1)...
3-Minute Quiz: Q2 — Results
50/50 chance
53.7% 100%
n=24943
0%
Understandable
controversy!
3-Minute Quiz: Q2 — Results
50/50 chance
53.7% 100%
n=24943
0%
Understandable
controversy!
MySQL
Oracle Database
PostgreSQ...
3-Minute Quiz: Indexing Skills
Q3: Good or Bad? (Column order)
CREATE INDEX tbl_idx ON tbl (a, b);
SELECT id, a, b FROM tb...
3-Minute Quiz: Indexing Skills
Q3: Good or Bad? (Column order)
CREATE INDEX tbl_idx ON tbl (a, b);
SELECT id, a, b FROM tb...
3-Minute Quiz: Indexing Skills
As-is only one query can use the index (a,b):
...WHERE a = ? AND b = ?;
		Bitmap	Heap	Scan	...
3-Minute Quiz: Indexing Skills
Change the index to (b, a) so both can use it:
...WHERE a = ? AND b = ?;
		Bitmap	Heap	Scan...
3-Minute Quiz: Indexing Skills
Change the index to (b, a) so both can use it:
...WHERE a = ? AND b = ?;
		Bitmap	Heap	Scan...
50/50 chance
62.7% 100%
n=24943
0%
3-Minute Quiz: Q3 — Results
50/50 chance
62.7% 100%
n=24943
0%
3-Minute Quiz: Q3 — Results
MySQL
Oracle Database
PostgreSQL
SQL Server
1.1%
-2.1%
+4.6...
3-Minute Quiz: Indexing Skills
Q4: Good or Bad? (Indexing LIKE)
CREATE INDEX tbl_idx
ON tbl (text varchar_pattern_ops);
SE...
3-Minute Quiz: Indexing Skills
Q4: Good or Bad? (Indexing LIKE)
CREATE INDEX tbl_idx
ON tbl (text varchar_pattern_ops);
SE...
3-Minute Quiz: Indexing Skills
B-Tree indexes don’t support prefix wildcards.
	Seq	Scan	on	tbl	(rows=0)	
			Rows	Removed	by...
3-Minute Quiz: Indexing Skills
B-Tree indexes don’t support prefix wildcards.
	Seq	Scan	on	tbl	(rows=0)	
			Rows	Removed	by...
50/50 chance
81.9% 100%
n=24943
0%
3-Minute Quiz: Q4 — Results
50/50 chance
81.9% 100%
n=24943
0%
3-Minute Quiz: Q4 — Results
MySQL
Oracle Database
PostgreSQL
SQL Server
-0.4%
+0.6%
-2....
3-Minute Quiz: Indexing Skills
Q5: Good or Bad? (equality vs. ranges)
CREATE INDEX tbl_idx
ON tbl (date_col, state);
SELEC...
3-Minute Quiz: Indexing Skills
Q5: Good or Bad? (equality vs. ranges)
CREATE INDEX tbl_idx
ON tbl (date_col, state);
SELEC...
3-Minute Quiz: Indexing Skills
For the curious, the data distribution:
		SELECT	count(*)	FROM	tbl...	
	...	WHERE	date_colu...
3-Minute Quiz: Indexing Skills
CREATE INDEX tbl_idx
ON tbl (date_col, state);
	Index	Scan	using	tbl_idx	on	tbl	(rows=365)	...
3-Minute Quiz: Indexing Skills
CREATE INDEX tbl_idx
ON tbl (date_col, state);
	Index	Scan	using	tbl_idx	on	tbl	(rows=365)	...
42.3% 100%
n=3283
0%
3-Minute Quiz: Q5 — Results
42.3% 100%
n=3283
0%
3-Minute Quiz: Q5 — Results
No segregated
data!
(Other DBs ask about index-only scan, but PostgreSQL
...
Indexes: The Neglected All-Rounder
Everybody knows indexing is
important for performance,
yet nobody takes the time to
lea...
Indexes: The Neglected All-Rounder
Index details are hardly known.
➡ “Details” like column-order or equality vs. range

co...
Indexes: The Neglected All-Rounder
Index details are hardly known.
➡ “Details” like column-order or equality vs. range

co...
Indexes: The Neglected All-Rounder
Index details are hardly known.
➡ “Details” like column-order or equality vs. range

co...
Indexes: The Neglected All-Rounder
Are you just adding indexes
or
are you designing indexes?
About Markus Winand
Tuning developers for
high SQL performance
Training & co (one-man show):
winand.at
Author of:
SQL Perf...
About Markus Winand
use-the-index-luke.com
Nächste SlideShare
Wird geladen in …5
×

Indexes: The neglected performance all rounder

21.266 Aufrufe

Veröffentlicht am

The latest incarnation of this talk as presented on Jan, 31st at FOSDEM PgDay in Brussles

Veröffentlicht in: Technologie, Business

Indexes: The neglected performance all rounder

  1. 1. http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
  2. 2. http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
  3. 3. http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
  4. 4. http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
  5. 5. http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
  6. 6. http://www.commitstrip.com/en/2014/06/03/the-problem-is-not-the-tool-itself/
  7. 7. Takeaway #1: Pandemic Scale It affects you! (Symbolic image; not real data) http://upload.wikimedia.org/wikipedia/commons/c/c7/2009_world_subdivisions_flu_pandemic.png
  8. 8. Takeaway #2: Caused by Success Copyright © 2013 Telerik, Inc. All rights reserved
  9. 9. Takeaway #3: It’s Not Your Fault http://simpsonswiki.com/wiki/File:I_Didn%27t_Do_It!_Volume_III.png
  10. 10. © 2017 by Markus Winand The Problem Index/Query Mismatch
  11. 11. The Problem: Index/Query Mismatch “A very common cause of performance problems is lack of proper indexes or the use of queries that are not using existing indexes.” —Buda Consultinghttp://www.budaconsulting.com/Portals/52677/docs/top_5_tech_brief.pdf
  12. 12. The Problem: Index/Query Mismatch “A very common cause of performance problems is lack of proper indexes or the use of queries that are not using existing indexes.” —Buda Consultinghttp://www.budaconsulting.com/Portals/52677/docs/top_5_tech_brief.pdf
  13. 13. Quantifying the Problem
  14. 14. Quantifying the Problem Percona White Paper: Reasons of performance problems
 that caused production downtime: 38% bad SQL 15% schema and indexing http://www.percona.com/files/white-papers/causes-of-downtime-in-mysql.pdf
  15. 15. Quantifying the Problem Percona White Paper: Reasons of performance problems
 that caused production downtime: 38% bad SQL 15% schema and indexing http://www.percona.com/files/white-papers/causes-of-downtime-in-mysql.pdf
  16. 16. Quantifying the Problem Percona White Paper: Reasons of performance problems
 that caused production downtime: 38% bad SQL 15% schema and indexing http://www.percona.com/files/white-papers/causes-of-downtime-in-mysql.pdf
  17. 17. Quantifying the Problem Survey by sqlskills.com: Root causes of the last few SQL
 Server performance problems: 27% T-SQL 19% Poor indexing http://www.sqlskills.com/blogs/paul/survey-what-are-the-most-common-causes-of-performance-problems/
  18. 18. Quantifying the Problem Survey by sqlskills.com: Root causes of the last few SQL
 Server performance problems: 27% T-SQL 19% Poor indexing http://www.sqlskills.com/blogs/paul/survey-what-are-the-most-common-causes-of-performance-problems/
  19. 19. Quantifying the Problem Survey by sqlskills.com: Root causes of the last few SQL
 Server performance problems: 27% T-SQL 19% Poor indexing http://www.sqlskills.com/blogs/paul/survey-what-are-the-most-common-causes-of-performance-problems/
  20. 20. Quantifying the Problem Craig S. Mullins (strategist and researcher): „As much as 75% of poor relational performance
 is caused by "bad" SQL and application code.” Noel Yuhanna (Forrester Research): „The key difficulties surrounding performance
 continue to be poorly written SQL statements,
 improper DBMS configuration and a lack of clear understanding of how to tune databases to solve performance issues.”
  21. 21. Quantifying the Problem Craig S. Mullins (strategist and researcher): „As much as 75% of poor relational performance
 is caused by "bad" SQL and application code.” Noel Yuhanna (Forrester Research): „The key difficulties surrounding performance
 continue to be poorly written SQL statements,
 improper DBMS configuration and a lack of clear understanding of how to tune databases to solve performance issues.”
  22. 22. Quantifying the Problem My observation:
  23. 23. Quantifying the Problem My observation: ~50% of SQL performance problems
 are caused by improper index use
  24. 24. © 2017 by Markus Winand The Root Cause
  25. 25. © 2017 by Markus Winand The Root Cause Indexing is a afterthought…
  26. 26. © 2017 by Markus Winand The Root Cause Indexing is a afterthought… ...often done by the wrong people
  27. 27. The Root Cause: DBAs are Indexing How did databases
 work before SQL?
  28. 28. The Root Cause: DBAs are Indexing Index use was intrinsically
 tied to the queries.
  29. 29. The Root Cause: DBAs are Indexing Example: dBase
  30. 30. The Root Cause: DBAs are Indexing Example: dBase Developers had to... ...use indexes explicitly when searching:
 set index to last_name
 find Winand
  31. 31. The Root Cause: DBAs are Indexing Example: dBase Developers had to... ...use indexes explicitly when searching:
 set index to last_name
 find Winand ...take care of index maintenance:
 set index to last_name, idx2
 append
  32. 32. The Root Cause: DBAs are Indexing SQL is an abstraction that only defines the logical view. The actual SQL implementation takes care of everything else.
  33. 33. The Root Cause: DBAs are Indexing Transaction Constraint Views Tables Data manipulation Queries SQL (language) has: SQL Databases (software) have:
  34. 34. The Root Cause: DBAs are Indexing Backup & recovery Storage management Bugs & patches Tuning parameters Transaction Constraint Views Tables Data manipulation Queries SQL (language) has: SQL Databases (software) have: High Availability
  35. 35. The Root Cause: DBAs are Indexing Indexes Backup & recovery Storage management Bugs & patches Tuning parameters Transaction Constraint Views Tables Data manipulation Queries SQL (language) has: SQL Databases (software) have: High Availability
  36. 36. The Root Cause: DBAs are Indexing Indexes Backup & recovery Storage management Bugs & patches Tuning parameters Transaction Constraint Views Tables Data manipulation Queries Developers Administrators High Availability
  37. 37. The Root Cause: DBAs are Indexing Today, indexing is considered a tuning task that belongs to the administrators responsibilities.
  38. 38. The Root Cause: DBAs are Indexing A misconception that causes new problems:
  39. 39. The Root Cause: DBAs are Indexing A misconception that causes new problems: DBAs don’t know
 the queries Have to “investigate” to find the queries. by G-10gian82 deviantart.com
  40. 40. The Root Cause: DBAs are Indexing A misconception that causes new problems: DBAs don’t know
 the queries Have to “investigate” to find the queries. It is time consuming and almost always incomplete. by G-10gian82 deviantart.com
  41. 41. The Root Cause: DBAs are Indexing A misconception that causes new problems: DBAs don’t know
 the queries Have to “investigate” to find the queries. It is time consuming and almost always incomplete. DBAs can’t change
 the queries Can make the index
 match the query. Can’t make the query
 match the index!
  42. 42. © 2017 by Markus Winand The Solution
  43. 43. © 2017 by Markus Winand The Solution Indexing is a
 Development Task
  44. 44. The Solution: It’s a Dev Task Indexes Backup & recovery Storage management Tuning parameters Transaction Constraint Views Tables Data manipulation Queries Developers Administrators High Availability Bugs & patches
  45. 45. The Solution: It’s a Dev Task Indexes Backup & recovery Storage management Tuning parameters Transaction Constraint Views Tables Data manipulation Queries Developers Administrators Must match! High Availability Bugs & patches
  46. 46. Another Problem: It’s not Taught Indexes are not part of the pure SQL (language) literature because indexes are not part of the SQL standard. 11 SQL books analyzed: only 1.0% of the pages are about indexes (70 out of 7330 pages). Examples:
 Oracle SQL by Example: 2.0% (19/960)
 Beginning DBs with PostgreSQL: 0.8% (5/664)
 Learning SQL: 3.3% (11/336 — highest rate in class)
  47. 47. Another Problem: It’s not Taught Proper index usage is sometimes covered in database tuning books but is always buried between hundreds of pages of HW, OS and DB parameterization topics. 14 database administration books analyzed: 6% of the pages are about indexes (395 out of 6568 pages). Examples:
 Oracle Performance Survival Guide: 5.2% (38/730)
 High Performance MySQL: 8% (55/684)
 PostgreSQL 9 High Performance: 5.8% (27/468)
 SQL Server 2012 Query Perf. Tuning: 19.6% (98/499)
  48. 48. Another Problem: It’s not Taught Consequence:
 Developers don’t know how to use indexes properly. Results of the 3-minute online quiz:
 http://use-the-index-luke.com/3-minute-test
 5 questions: each about a specific index
 usage pattern.
 Non-representative!
  49. 49. Q1: Good or Bad? (Function use) CREATE INDEX tbl_idx ON tbl (date_column);
 SELECT text, date_column
 FROM tbl
 WHERE TO_CHAR(date_column, 'YYYY') = '2017'; 3-Minute Quiz: Indexing Skills
  50. 50. Q1: Good or Bad? (Function use) CREATE INDEX tbl_idx ON tbl (date_column);
 SELECT text, date_column
 FROM tbl
 WHERE TO_CHAR(date_column, 'YYYY') = '2017'; 3-Minute Quiz: Indexing Skills http://use-the-index-luke.com/sql/where-clause/obfuscation/dates
  51. 51. 3-Minute Quiz: Indexing Skills ...WHERE TO_CHAR(date_column, 'YYYY') = '2017'; Seq Scan on tbl (rows=365) Rows Removed by Filter: 49635 Total runtime: 118.796 ms ...WHERE date_column BETWEEN '2017-01-01'
 AND '2017-12-31' Index Scan using tbl_idx on tbl (rows=365) Total runtime: 0.430 ms (Above: simplified PostgreSQL execution plans when selecting 365 rows out of 50000)
  52. 52. 3-Minute Quiz: Indexing Skills ...WHERE TO_CHAR(date_column, 'YYYY') = '2017'; Seq Scan on tbl (rows=365) Rows Removed by Filter: 49635 Total runtime: 118.796 ms ...WHERE date_column BETWEEN '2017-01-01'
 AND '2017-12-31' Index Scan using tbl_idx on tbl (rows=365) Total runtime: 0.430 ms (Above: simplified PostgreSQL execution plans when selecting 365 rows out of 50000) 270x faster
  53. 53. 3-Minute Quiz: Q1 — Results
  54. 54. 3-Minute Quiz: Q1 — Results 67.5% 100% n=24943 0%
  55. 55. 3-Minute Quiz: Q1 — Results 50/50 chance 67.5% 100% n=24943 0%
  56. 56. 3-Minute Quiz: Q1 — Results MySQL Oracle Database PostgreSQL SQL Server -12% +12% +16% -0.5% n=9144 n=4466 n=3283 n=8050 47.5% 87.5%67.5% 50/50 chance 67.5% 100% n=24943 0%
  57. 57. 3-Minute Quiz: Indexing Skills Q2: Good or Bad? (Indexed Top-N, no IOS) CREATE INDEX tbl_idx ON tbl (a, date_col);
 SELECT id, a, date_col
 FROM tbl
 WHERE a = ?
 ORDER BY date_col DESC
 LIMIT 1; http://use-the-index-luke.com/sql/partial-results/top-n-queries
  58. 58. 3-Minute Quiz: Indexing Skills Q2: Good or Bad? (Indexed Top-N, no IOS) CREATE INDEX tbl_idx ON tbl (a, date_col);
 SELECT id, a, date_col
 FROM tbl
 WHERE a = ?
 ORDER BY date_col DESC
 LIMIT 1; http://use-the-index-luke.com/sql/partial-results/top-n-queries
  59. 59. 3-Minute Quiz: Indexing Skills It is already the most optimal solution
 (not considering index-only scan). Limit (rows=1) -> Index Scan Backward using tbl_idx on tbl (rows=1) Index Cond: (a = 123::numeric) Total runtime: 0.053 ms As fast as a primary key lookup because it can never return more than one row.
  60. 60. 3-Minute Quiz: Q2 — Results 50/50 chance 53.7% 100% n=24943 0% Understandable controversy!
  61. 61. 3-Minute Quiz: Q2 — Results 50/50 chance 53.7% 100% n=24943 0% Understandable controversy! MySQL Oracle Database PostgreSQL SQL Server -3% -6% +8.9% 3% n=9144 n=4466 n=3283 n=8050 33.7% 73.7%53.7%
  62. 62. 3-Minute Quiz: Indexing Skills Q3: Good or Bad? (Column order) CREATE INDEX tbl_idx ON tbl (a, b); SELECT id, a, b FROM tbl WHERE a = ? AND b = ?; SELECT id, a, b FROM tbl WHERE b = ?;
  63. 63. 3-Minute Quiz: Indexing Skills Q3: Good or Bad? (Column order) CREATE INDEX tbl_idx ON tbl (a, b); SELECT id, a, b FROM tbl WHERE a = ? AND b = ?; SELECT id, a, b FROM tbl WHERE b = ?; http://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys
  64. 64. 3-Minute Quiz: Indexing Skills As-is only one query can use the index (a,b): ...WHERE a = ? AND b = ?; Bitmap Heap Scan on tbl (rows=6) -> Bitmap Index Scan on tbl_idx (rows=6) Index Cond: ((a = 123) AND (b = 1)) Total runtime: 0.055 ms ...WHERE b = ?; Seq Scan on tbl (rows=5142) Rows Removed by Filter: 44858 Total runtime: 29.849 ms
  65. 65. 3-Minute Quiz: Indexing Skills Change the index to (b, a) so both can use it: ...WHERE a = ? AND b = ?; Bitmap Heap Scan on tbl (rows=6) -> Bitmap Index Scan on tbl_idx (rows=6) Index Cond: ((a = 123) AND (b = 1)) Total runtime: 0.056 ms ...WHERE b = ?; Bitmap Heap Scan on tbl (rows=5142) -> Bitmap Index Scan on tbl_idx (rows=5142) Index Cond: (b = 1::numeric) Total runtime: 6.932 ms
  66. 66. 3-Minute Quiz: Indexing Skills Change the index to (b, a) so both can use it: ...WHERE a = ? AND b = ?; Bitmap Heap Scan on tbl (rows=6) -> Bitmap Index Scan on tbl_idx (rows=6) Index Cond: ((a = 123) AND (b = 1)) Total runtime: 0.056 ms ...WHERE b = ?; Bitmap Heap Scan on tbl (rows=5142) -> Bitmap Index Scan on tbl_idx (rows=5142) Index Cond: (b = 1::numeric) Total runtime: 6.932 ms 4x faster
  67. 67. 50/50 chance 62.7% 100% n=24943 0% 3-Minute Quiz: Q3 — Results
  68. 68. 50/50 chance 62.7% 100% n=24943 0% 3-Minute Quiz: Q3 — Results MySQL Oracle Database PostgreSQL SQL Server 1.1% -2.1% +4.6% -1.9% n=9144 n=4466 n=3283 n=8050 42.7% 82.7%62.7%
  69. 69. 3-Minute Quiz: Indexing Skills Q4: Good or Bad? (Indexing LIKE) CREATE INDEX tbl_idx ON tbl (text varchar_pattern_ops); SELECT id, text FROM tbl WHERE text LIKE '%TERM%';
  70. 70. 3-Minute Quiz: Indexing Skills Q4: Good or Bad? (Indexing LIKE) CREATE INDEX tbl_idx ON tbl (text varchar_pattern_ops); SELECT id, text FROM tbl WHERE text LIKE '%TERM%'; http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning
  71. 71. 3-Minute Quiz: Indexing Skills B-Tree indexes don’t support prefix wildcards. Seq Scan on tbl (rows=0) Rows Removed by Filter: 50000 Total runtime: 23.494 ms Using a special purpose index (e.g. in PostgreSQL): CREATE INDEX tbl_tgrm ON tbl USING gin (text gin_trgm_ops); Bitmap Heap Scan on tbl (rows=0) Rows Removed by Index Recheck: 2 -> Bitmap Index Scan on tbl_tgrm (rows=2) Index Cond: ((text)::text ~~ '%TERM%'::text) Total runtime: 0.114 ms
  72. 72. 3-Minute Quiz: Indexing Skills B-Tree indexes don’t support prefix wildcards. Seq Scan on tbl (rows=0) Rows Removed by Filter: 50000 Total runtime: 23.494 ms Using a special purpose index (e.g. in PostgreSQL): CREATE INDEX tbl_tgrm ON tbl USING gin (text gin_trgm_ops); Bitmap Heap Scan on tbl (rows=0) Rows Removed by Index Recheck: 2 -> Bitmap Index Scan on tbl_tgrm (rows=2) Index Cond: ((text)::text ~~ '%TERM%'::text) Total runtime: 0.114 ms 200x faster
  73. 73. 50/50 chance 81.9% 100% n=24943 0% 3-Minute Quiz: Q4 — Results
  74. 74. 50/50 chance 81.9% 100% n=24943 0% 3-Minute Quiz: Q4 — Results MySQL Oracle Database PostgreSQL SQL Server -0.4% +0.6% -2.6% +1.6% n=9144 n=4466 n=3283 n=8050 61.9% 101.9%81.9%
  75. 75. 3-Minute Quiz: Indexing Skills Q5: Good or Bad? (equality vs. ranges) CREATE INDEX tbl_idx ON tbl (date_col, state); SELECT id, date_col, state FROM tbl WHERE date_col >= TO_DATE(‘2008-11-07’) AND state = 'X';
  76. 76. 3-Minute Quiz: Indexing Skills Q5: Good or Bad? (equality vs. ranges) CREATE INDEX tbl_idx ON tbl (date_col, state); SELECT id, date_col, state FROM tbl WHERE date_col >= TO_DATE(‘2008-11-07’) AND state = 'X'; http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/greater-less-between-tuning-sql-access-filter-predicates
  77. 77. 3-Minute Quiz: Indexing Skills For the curious, the data distribution: SELECT count(*) FROM tbl... ... WHERE date_column >= TO_DATE(‘2012-11-07’); --> 1826 ... WHERE state = 'X'; --> 10000 ... WHERE date_column >= TO_DATE(‘2012-11-07’) AND state = 'X'; --> 365
  78. 78. 3-Minute Quiz: Indexing Skills CREATE INDEX tbl_idx ON tbl (date_col, state); Index Scan using tbl_idx on tbl (rows=365) Total runtime: 0.893 ms CREATE INDEX tbl_idx ON tbl (state, date_col); Index Scan using tbl_idx on tbl (rows=365) Total runtime: 0.530 ms
  79. 79. 3-Minute Quiz: Indexing Skills CREATE INDEX tbl_idx ON tbl (date_col, state); Index Scan using tbl_idx on tbl (rows=365) Total runtime: 0.893 ms CREATE INDEX tbl_idx ON tbl (state, date_col); Index Scan using tbl_idx on tbl (rows=365) Total runtime: 0.530 ms ~70% faster
  80. 80. 42.3% 100% n=3283 0% 3-Minute Quiz: Q5 — Results
  81. 81. 42.3% 100% n=3283 0% 3-Minute Quiz: Q5 — Results No segregated data! (Other DBs ask about index-only scan, but PostgreSQL
 didn't support them at the time the quiz was designed)
  82. 82. Indexes: The Neglected All-Rounder Everybody knows indexing is important for performance, yet nobody takes the time to learn and apply is properly.
  83. 83. Indexes: The Neglected All-Rounder Index details are hardly known. ➡ “Details” like column-order or equality vs. range
 conditions must be learned and understood.
  84. 84. Indexes: The Neglected All-Rounder Index details are hardly known. ➡ “Details” like column-order or equality vs. range
 conditions must be learned and understood. Only one index capability is used: finding data quickly ➡ Indexes have three capabilities (powers): 
 finding data, clustering data, and sorting data.
  85. 85. Indexes: The Neglected All-Rounder Index details are hardly known. ➡ “Details” like column-order or equality vs. range
 conditions must be learned and understood. Only one index capability is used: finding data quickly ➡ Indexes have three capabilities (powers): 
 finding data, clustering data, and sorting data. Indexing is done from single query perspective. ➡ Should be done from application perspective (considering all queries). It’s a design task!
  86. 86. Indexes: The Neglected All-Rounder Are you just adding indexes or are you designing indexes?
  87. 87. About Markus Winand Tuning developers for high SQL performance Training & co (one-man show): winand.at Author of: SQL Performance Explained Geeky blog: use-the-index-luke.com
  88. 88. About Markus Winand use-the-index-luke.com

×