Optimizer Histograms
When They Help and When They Do Not?
February 1, 2019
Sveta Smirnova
• MySQL Support engineer
• Author of
  • MySQL Troubleshooting
  • JSON UDF functions
  • FILTER clause for MySQL
• Speaker
  • Percona Live, OOW, Fosdem, DevConf, HighLoad...
Sveta Smirnova
2
• Why do I Care?
• The Use Case
• Even Worse Use Case
• Why the Difference?
• How Histograms Work?
Table of Contents
3
The column_statistics data dictionary table stores histogram statistics about
column values, for use by the optimizer in constructing query execution plans.
MySQL User Reference Manual
Optimizer Statistics aka Histograms
4
Why do I Care?
• Data distribution varies
  • Big difference between number of values
  • Constantly changing
• Cardinality is not correct
  • Was not updated in time
  • Updates too often
  • Calculated wrongly
• Index maintenance costs a lot
  • Hardware resources
  • Slow updates
  • Window to run CREATE INDEX
• Optimizer does not work as we wish
  • Examples in my talk @Percona Live
Latest Support Tickets
6
• Topic based on real Support cases
  • A couple of them are still in progress
• All examples are 100% fake
  • They are created such that
    • No customer can be identified
    • Everything is generated: table names, column names, data
    • The use case itself is fictional
• All examples are simplified
  • Only the columns required to show the issue
  • Everything extra is removed
  • Real tables usually store much more data
• All disasters happened with version 5.7
Disclaimer
7
The Use Case
• categories
  • Less than 20 rows
• goods
  • More than 1M rows
  • 20 unique cat_id values
  • Many other fields
    • Price
    • Date: added, last updated, etc.
    • Characteristics
    • Store
    • ...
Two tables
9
select *
from goods
join categories on (categories.id = goods.cat_id)
where date_added between '2018-07-01' and '2018-08-01'
  and cat_id in (16,11)
  and price >= 1000 and price <= 10000 [ and ... ]
[ GROUP BY ... [ORDER BY ... [ LIMIT ...]]]
;
JOIN
10
• Select from the small table
• For each cat_id select from the large table
• Filter result on date_added[ and price[...]]
• Slow with many items in the category
Option 1: Select from the Small Table First
11
• Filter rows by date_added[ and price[...]]
• Get cat_id values
• Retrieve rows from the small table
• Slow if the number of rows filtered by date_added is larger than the number of goods in the selected categories
Option 2: Select from the Large Table First
12
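The trade-off between the two options can be sketched with a toy model. This is only an illustration and not how the MySQL optimizer computes cost: the data is synthetic, and the function names rows_read_small_first and rows_read_large_first are hypothetical. It counts how many rows of the large table each strategy has to touch.

```python
from datetime import date, timedelta

# Synthetic goods table: (cat_id, date_added). All values are made up.
START = date(2018, 1, 1)
goods = [(i % 20 + 1, START + timedelta(days=i % 365)) for i in range(100_000)]


def rows_read_small_first(goods, cat_ids):
    """Option 1: for each selected cat_id, read every goods row in that
    category (as an index on cat_id would), then filter on date_added."""
    return sum(1 for cat_id, _ in goods if cat_id in cat_ids)


def rows_read_large_first(goods, lo, hi):
    """Option 2: read every goods row in the date range (as an index on
    date_added would), then check its cat_id."""
    return sum(1 for _, added in goods if lo <= added <= hi)


cats = {16, 11}
narrow = (date(2018, 7, 1), date(2018, 7, 2))    # two days
wide = (date(2018, 1, 1), date(2018, 12, 31))    # the whole year

small_first = rows_read_small_first(goods, cats)
large_first_narrow = rows_read_large_first(goods, *narrow)
large_first_wide = rows_read_large_first(goods, *wide)
print(small_first, large_first_narrow, large_first_wide)
```

With a narrow date range the large-table-first strategy touches far fewer rows than the category lookup; with a wide range it touches far more. Which order is better depends on the data distribution, which is exactly what the optimizer has to estimate.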
• CREATE INDEX index_everything ON goods (cat_id, date_added[, price[, ...]])
• It resolves the issue
• But not in all cases
What if We Use Combined Indexes?
13
• Maintenance cost
  • Slower INSERT/UPDATE/DELETE
  • Disk space
• Index not useful for selecting rows
JOIN categories ON (categories.id=goods.cat_id)
JOIN shops ON (shops.id=goods.shop_id)
[ JOIN ... ]
WHERE
date_added between '2018-07-01' and '2018-08-01'
AND
cat_id in (16,11) AND price >= 1000 AND price <=10000 [ AND ... ]
GROUP BY product_type
ORDER BY date_updated DESC
LIMIT 50,100
• Tables may have wrong cardinality
The Problem
14
• EXPLAIN without histograms
mysql> explain select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01' -- Large range
-> order by goods.cat_id
-> limit 10\G -- We ask for 10 rows only!
Example
15
• EXPLAIN without histograms
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: categories -- Small table first
partitions: NULL
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 20
filtered: 70.00
Extra: Using where; Using index; Using temporary; Using filesort
Example
15
• EXPLAIN without histograms
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: goods -- Large table
partitions: NULL
type: ref
possible_keys: cat_id_2
key: cat_id_2
key_len: 5
ref: orig.categories.id
rows: 51827
filtered: 11.11 -- Default value
Extra: Using where
2 rows in set, 1 warning (0.01 sec)
Example
15
• Execution time without histograms
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10;
ab9f9bb7bc4f357712ec34f067eda364 -
10 rows in set (56.47 sec)
Example
15
• Engine statistics without histograms
mysql> show status like 'Handler%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
...
| Handler_read_next | 964718 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 10 |
| Handler_read_rnd_next | 951671 |
...
| Handler_write | 951670 |
+----------------------------+--------+
18 rows in set (0.01 sec)
Example
15
• Now let's add the histogram
mysql> analyze table goods update histogram on date_added;
+------------+-----------+----------+--------------------------------------------------------+
| Table      | Op        | Msg_type | Msg_text                                               |
+------------+-----------+----------+--------------------------------------------------------+
| orig.goods | histogram | status   | Histogram statistics created for column 'date_added'. |
+------------+-----------+----------+--------------------------------------------------------+
1 row in set (2.01 sec)
Example
15
• EXPLAIN with the histogram
mysql> explain select goods.* from goods
-> join categories
-> on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10\G
Example
15
• EXPLAIN with the histogram
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: goods -- Large table first
partitions: NULL
type: index
possible_keys: cat_id_2
key: cat_id_2
key_len: 5
ref: NULL
rows: 10 -- Same as we asked
filtered: 98.70 -- True numbers
Extra: Using where
Example
15
• EXPLAIN with the histogram
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: categories -- Small table
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: orig.goods.cat_id
rows: 1
filtered: 100.00
Extra: Using index
2 rows in set, 1 warning (0.01 sec)
Example
15
• Execution time with the histogram
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10;
eeb005fae0dd3441c5c380e1d87fee84 -
10 rows in set (0.00 sec) -- 56/0 times faster!
Example
15
• Engine statistics with the histogram
mysql> show status like 'Handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 4     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 1     |
| Handler_read_key           | 3     |
| Handler_read_last          | 0     |
| Handler_read_next          | 9     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.00 sec)
Example
15
Even Worse Use Case
• goods_characteristics
CREATE TABLE `goods_characteristics` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`good_id` varchar(30) DEFAULT NULL,
`size` int(11) DEFAULT NULL,
`manufacturer` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `good_id` (`good_id`,`size`,`manufacturer`),
KEY `size` (`size`,`manufacturer`)
) ENGINE=InnoDB AUTO_INCREMENT=196606 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Two Similar Tables
17
• goods_shops
CREATE TABLE `goods_shops` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`good_id` varchar(30) DEFAULT NULL,
`location` varchar(30) DEFAULT NULL,
`delivery_options` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `good_id` (`good_id`,`location`,`delivery_options`),
KEY `location` (`location`,`delivery_options`)
) ENGINE=InnoDB AUTO_INCREMENT=131071 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Two Similar Tables
17
• Size
mysql> select count(*) from goods_characteristics;
+----------+
| count(*) |
+----------+
| 131072 |
+----------+
1 row in set (0.08 sec)
mysql> select count(*) from goods_shops;
+----------+
| count(*) |
+----------+
| 65536 |
+----------+
1 row in set (0.04 sec)
Two Similar Tables
17
• Data Distribution: goods_characteristics
mysql> select count(*) num_rows, good_id, size
-> from goods_characteristics group by good_id, size;
+----------+---------+------+
| num_rows | good_id | size |
+----------+---------+------+
| 65536 | laptop | 7 |
| 8187 | laptop | 8 |
| 8190 | laptop | 9 |
| 8188 | laptop | 10 |
| 8192 | laptop | 11 |
| 8189 | laptop | 12 |
| 8189 | laptop | 13 |
| 8191 | laptop | 14 |
| 8190 | laptop | 15 |
| 10 | laptop | 16 |
| 10 | laptop | 17 |
+----------+---------+------+
Two Similar Tables
17
• Data Distribution: goods_characteristics
mysql> select count(*) num_rows, good_id, manufacturer
-> from goods_characteristics group by good_id, manufacturer order by num_rows desc;
+----------+---------+--------------+
| num_rows | good_id | manufacturer |
+----------+---------+--------------+
| 65536 | laptop | Noname |
| 8191 | laptop | Samsung |
| 8191 | laptop | Acer |
| 8189 | laptop | Dell |
| 8189 | laptop | HP |
| 8189 | laptop | Lenovo |
| 8189 | laptop | Toshiba |
| 8189 | laptop | Apple |
| 8189 | laptop | Asus |
| 10 | laptop | Sony |
| 10 | laptop | Casper |
+----------+---------+--------------+
Two Similar Tables
17
• Data Distribution: goods_shops
mysql> select count(*) num_rows, good_id, location
-> from goods_shops group by good_id, location order by num_rows desc;
+----------+---------+---------------+
| num_rows | good_id | location |
+----------+---------+---------------+
| 8191 | laptop | New York |
| 8191 | laptop | San Francisco |
| 8189 | laptop | Paris |
| 8189 | laptop | Berlin |
| 8189 | laptop | Brussels |
| 8189 | laptop | Tokio |
| 8189 | laptop | Istanbul |
| 8189 | laptop | London |
| 10 | laptop | Moscow |
| 10 | laptop | Kiev |
+----------+---------+---------------+
Two Similar Tables
17
• Data Distribution: goods_shops
mysql> select count(*) num_rows, good_id, delivery_options
-> from goods_shops group by good_id, delivery_options order by num_rows desc;
+----------+---------+------------------+
| num_rows | good_id | delivery_options |
+----------+---------+------------------+
| 8192 | laptop | DHL |
| 8191 | laptop | PTT |
| 8190 | laptop | Normal Post |
| 8190 | laptop | Tracked |
| 8189 | laptop | Fedex |
| 8189 | laptop | Gruzovichkof |
| 8188 | laptop | Courier |
| 8187 | laptop | No delivery |
| 10 | laptop | Premium |
| 10 | laptop | Urgent |
+----------+---------+------------------+
Two Similar Tables
17
Histogram statistics are useful primarily for nonindexed columns. Adding an
index to a column for which histogram statistics are applicable might also help
the optimizer make row estimates. The tradeoffs are:
• An index must be updated when table data is modified.
• A histogram is created or updated only on demand, so it adds no overhead
when table data is modified. On the other hand, the statistics become progressively
more out of date when table modifications occur, until the next time they
are updated.
MySQL User Reference Manual
Optimizer Statistics aka Histograms
18
mysql> alter table goods_characteristics stats_sample_pages=5000;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table goods_shops stats_sample_pages=5000;
Query OK, 0 rows affected (0.05 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> analyze table goods_characteristics, goods_shops;
+----------------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+----------------------------+---------+----------+----------+
| test.goods_characteristics | analyze | status | OK |
| test.goods_shops | analyze | status | OK |
+----------------------------+---------+----------+----------+
2 rows in set (0.35 sec)
Index Statistics is More than Good
19
• The query
mysql> select count(*) from goods_shops join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
^C^C -- query aborted
ERROR 1317 (70100): Query execution was interrupted
Performance?
20
• Handlers
mysql> show status like 'Handler%';
+----------------------------+-------------+
| Variable_name | Value |
+----------------------------+-------------+
| Handler_commit | 0 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 4 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 13043 |
| Handler_read_last | 0 |
| Handler_read_next | 854,767,916 |
...
Performance?
20
• Table order
mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| id | table | type | key | rows | filtered | Extra |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| 1 | goods_characteristics | index | good_id | 131072 | 25.00 | Using where; Using index |
| 1 | goods_shops | ref | good_id | 65536 | 36.00 | Using where; Using index |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
Performance?
20
• Table order matters
mysql> explain select count(*) from goods_shops straight_join goods_characteristics
-> using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| id | table | type | key | rows | filtered | Extra |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| 1 | goods_shops | index | good_id | 65536 | 36.00 | Using where; Using index |
| 1 | goods_characteristics | ref | good_id | 131072 | 25.00 | Using where; Using index |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
Performance?
20
• Table order matters
mysql> select count(*) from goods_shops straight_join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----------+
| count(*) |
+----------+
| 816640 |
+----------+
1 row in set (2.11 sec)
mysql> show status like 'Handler_read_next';
+-------------------+-----------+
| Variable_name | Value |
+-------------------+-----------+
| Handler_read_next | 5,308,416 |
+-------------------+-----------+
1 row in set (0.00 sec)
Performance?
20
mysql> analyze table goods_shops update histogram on location, delivery_options;
+-------------+-----------+----------+-----------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-----------+----------+-----------------------------------------------------+
| goods_shops | histogram | status | Histogram statistics created... 'delivery_options'. |
| goods_shops | histogram | status | Histogram statistics created for column 'location'. |
+-------------+-----------+----------+-----------------------------------------------------+
2 rows in set (0.18 sec)
mysql> analyze table goods_characteristics update histogram on size, manufacturer ;
+-----------------------+-----------+----------+-------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-----------------------+-----------+----------+-------------------------------------------------+
| goods_characteristics | histogram | status | Histogram statistics created... 'manufacturer'. |
| goods_characteristics | histogram | status | Histogram statistics created for column 'size'. |
+-----------------------+-----------+----------+-------------------------------------------------+
2 rows in set (0.23 sec)
Histograms to Rescue
21
• The query
mysql> select count(*) from goods_shops join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----------+
| count(*) |
+----------+
| 816640 |
+----------+
1 row in set (2.16 sec)
mysql> show status like 'Handler_read_next';
+-------------------+-----------+
| Variable_name | Value |
+-------------------+-----------+
| Handler_read_next | 5,308,418 |
+-------------------+-----------+
1 row in set (0.00 sec)
Histograms to Rescue
21
• Filtering effect
mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id) where s
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| id | table | type | key | rows | filtered | Extra |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| 1 | goods_shops | index | good_id | 65536 | 0.06 | Using where; Using index |
| 1 | goods_characteristics | ref | good_id | 131072 | 15.63 | Using where; Using index |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
Histograms to Rescue
21
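The effect of the changed filtered value can be made concrete with a little arithmetic. The sketch below assumes the usual reading of EXPLAIN, where rows multiplied by filtered estimates how many rows a table passes to the next join step; the helper name rows_passed_on is hypothetical, and the numbers are copied from the EXPLAIN outputs in this section.

```python
def rows_passed_on(rows, filtered_pct):
    """Estimated rows a table passes to the next join step:
    EXPLAIN `rows` multiplied by `filtered` (a percentage)."""
    return rows * filtered_pct / 100.0


# goods_shops as the first table, before and after the histograms were created.
without_histogram = rows_passed_on(65536, 36.00)
with_histogram = rows_passed_on(65536, 0.06)

# Each passed-on row triggers a ref lookup estimated at 131072 rows
# in goods_characteristics.
REF_ROWS = 131072
print(round(without_histogram), round(without_histogram * REF_ROWS))
print(round(with_histogram), round(with_histogram * REF_ROWS))
```

With the histogram, the estimate for goods_shops drops from tens of thousands of rows to a few dozen, so starting from goods_shops becomes the cheaper plan. The resulting estimate of about five million index entries read is of the same order as the Handler_read_next value observed above.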
Why the Difference?
[Chart: x-axis 1 to 10, y-axis 0 to 800]
Indexes: Number of Items with Same Value
23
[Chart: x-axis 1 to 10, y-axis 0 to 800]
Indexes: Cardinality
24
[Chart: x-axis 1 to 10, y-axis 0 to 800]
Histograms: Number of Values in Each Bucket
25
[Chart: x-axis 1 to 10, y-axis 0 to 1]
Histograms: Data in the Histogram
26
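The difference between the two kinds of statistics can be sketched with the manufacturer counts from the goods_characteristics table shown earlier. This is a simplification: it treats cardinality as a plain count of distinct values and ignores how the storage engine actually samples index statistics.

```python
# Row counts per manufacturer, copied from the goods_characteristics
# data distribution shown earlier.
counts = {
    "Noname": 65536, "Samsung": 8191, "Acer": 8191, "Dell": 8189,
    "HP": 8189, "Lenovo": 8189, "Toshiba": 8189, "Apple": 8189,
    "Asus": 8189, "Sony": 10, "Casper": 10,
}

total_rows = sum(counts.values())
cardinality = len(counts)  # number of distinct values

# Cardinality-only estimate: every value is assumed to be equally common.
uniform_estimate = total_rows / cardinality

# Per-value frequencies, the kind of information a histogram keeps.
frequency = {name: n / total_rows for name, n in counts.items()}

for name in ("Noname", "Dell", "Sony"):
    print(name, counts[name], round(uniform_estimate), round(frequency[name], 5))
```

The uniform estimate is the same for every manufacturer, so it is far too low for Noname and far too high for Sony. Per-value frequencies capture the skew.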
How Histograms Work?
↓ sql/sql_planner.cc
↓ calculate_condition_filter
↓ Item_func_*::get_filtering_effect
• get_histogram_selectivity
• Seen as a percent of filtered rows in EXPLAIN
Low Level
28
• Example data
mysql> create table example(f1 int) engine=innodb;
mysql> insert into example values(1),(1),(1),(2),(3);
mysql> select f1, count(f1) from example group by f1;
+------+-----------+
| f1 | count(f1) |
+------+-----------+
| 1 | 3 |
| 2 | 1 |
| 3 | 1 |
+------+-----------+
3 rows in set (0.00 sec)
Filtered Rows
29
• Without a histogram
mysql> explain select * from example where f1 > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
• Without a histogram
mysql> explain select * from example where f1 > 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
• Without a histogram
mysql> explain select * from example where f1 > 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
• Without a histogram
mysql> explain select * from example where f1 > 3\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
• With the histogram
mysql> analyze table example update histogram on f1 with 3 buckets;
+-----------------+-----------+----------+-----------------------------------------------+
| Table           | Op        | Msg_type | Msg_text                                      |
+-----------------+-----------+----------+-----------------------------------------------+
| hist_ex.example | histogram | status   | Histogram statistics created for column 'f1'. |
+-----------------+-----------+----------+-----------------------------------------------+
1 row in set (0.03 sec)
Filtered Rows
29
• With the histogram
mysql> select * from information_schema.column_statistics
-> where table_name='example'\G
*************************** 1. row ***************************
SCHEMA_NAME: hist_ex
TABLE_NAME: example
COLUMN_NAME: f1
HISTOGRAM: {"buckets": [[1, 0.6], [2, 0.8], [3, 1.0]],
"data-type": "int", "null-values": 0.0, "collation-id": 8,
"last-updated": "2018-11-07 09:07:19.791470",
"sampling-rate": 1.0, "histogram-type": "singleton",
"number-of-buckets-specified": 3}
1 row in set (0.00 sec)
Filtered Rows
29
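The buckets in this output are pairs of a value and its cumulative frequency. A minimal sketch, assuming a singleton histogram over non-NULL values, reproduces them from the example data; the function name singleton_histogram is hypothetical and is not part of MySQL.

```python
from collections import Counter


def singleton_histogram(values):
    """Return [[value, cumulative_frequency], ...] with one bucket per
    distinct value, ordered by value."""
    counts = Counter(values)
    total = len(values)
    buckets, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        buckets.append([value, running / total])
    return buckets


# Same rows as in the example table: f1 = 1, 1, 1, 2, 3.
buckets = singleton_histogram([1, 1, 1, 2, 3])
print(buckets)
```

Three of the five rows hold the value 1, so the first bucket ends at 0.6; the last bucket always ends at 1.0 when there are no NULL values.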
• With the histogram
mysql> explain select * from example where f1 > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 100.00 -- all rows
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
• With the histogram
mysql> explain select * from example where f1 > 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 40.00 -- 2 rows
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
• With the histogram
mysql> explain select * from example where f1 > 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 20.00 -- one row
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
• With the histogram
mysql> explain select * from example where f1 > 3\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 20.00 - one row
Extra: Using where
1 row in set, 1 warning (0.00 sec)
Filtered Rows
29
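The filtered values for f1 > 0, f1 > 1 and f1 > 2 can be read off the histogram buckets directly. The sketch below is an illustration of that arithmetic, not the server's implementation: the function name selectivity_greater_than is hypothetical, NULL values are ignored, and any adjustment the optimizer applies when a constant lies beyond the last bucket is not modeled.

```python
import bisect

# Buckets copied from the column_statistics output: [value, cumulative frequency].
BUCKETS = [[1, 0.6], [2, 0.8], [3, 1.0]]


def selectivity_greater_than(buckets, constant):
    """Fraction of rows with value > constant, read off a singleton histogram."""
    values = [value for value, _ in buckets]
    # Number of buckets whose value is less than or equal to the constant.
    idx = bisect.bisect_right(values, constant)
    cumulative_up_to_constant = buckets[idx - 1][1] if idx > 0 else 0.0
    return 1.0 - cumulative_up_to_constant


for constant in (0, 1, 2):
    pct = round(selectivity_greater_than(BUCKETS, constant) * 100, 2)
    print(f"f1 > {constant}: filtered = {pct:.2f}")
```

The selectivity is one minus the cumulative frequency of the last bucket that does not exceed the constant, which gives 100, 40 and 20 percent for these three queries.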
• CREATE INDEX
  • Metadata lock
  • Can be blocked by any query
• UPDATE HISTOGRAM
  • Backup lock
  • Can be blocked only by a backup
  • Can be created any time without fear
Locking
30
• Helps if query plan can be changed
• Not a replacement for the index:
  • GROUP BY
  • ORDER BY
  • Query on a single table ∗
Outcome
31
• Data distribution is uniform
• Range optimization can be used
• Full table scan is fast
When Are Histograms Not Helpful?
32
• Index statistics are collected by the engine
• The optimizer calculates cardinality each time it accesses the statistics
• Indexes do not always improve performance
• Histograms can help
  • Still a new feature
• Histograms do not replace other optimizations!
Conclusion
33
MySQL User Reference Manual
Blog by Erik Froseth
Blog by Frederic Descamps
Talk by Oystein Grovlen @Fosdem
Talk by Sergei Petrunia @PerconaLive
WL #8707
More information
34
www.slideshare.net/SvetaSmirnova
twitter.com/svetsmirnova
github.com/svetasmirnova
Thank you!
35
DATABASE PERFORMANCE
MATTERS

Weitere ähnliche Inhalte

Was ist angesagt?

Preparse Query Rewrite Plugins
Preparse Query Rewrite PluginsPreparse Query Rewrite Plugins
Preparse Query Rewrite PluginsSveta Smirnova
 
0888 learning-mysql
0888 learning-mysql0888 learning-mysql
0888 learning-mysqlsabir18
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sSveta Smirnova
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in ActionSveta Smirnova
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in ActionSveta Smirnova
 
Introduction into MySQL Query Tuning
Introduction into MySQL Query TuningIntroduction into MySQL Query Tuning
Introduction into MySQL Query TuningSveta Smirnova
 
Performance Schema in Action: demo
Performance Schema in Action: demoPerformance Schema in Action: demo
Performance Schema in Action: demoSveta Smirnova
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101Sveta Smirnova
 
Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Executionwebhostingguy
 
Troubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-onsTroubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-onsSveta Smirnova
 
Why Use EXPLAIN FORMAT=JSON?
 Why Use EXPLAIN FORMAT=JSON?  Why Use EXPLAIN FORMAT=JSON?
Why Use EXPLAIN FORMAT=JSON? Sveta Smirnova
 
Highload Perf Tuning
Highload Perf TuningHighload Perf Tuning
Highload Perf TuningHighLoad2009
 
Optimizer overviewoow2014
Optimizer overviewoow2014Optimizer overviewoow2014
Optimizer overviewoow2014Mysql User Camp
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQLGeorgi Sotirov
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZENorvald Ryeng
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersJonathan Levin
 
Explaining the MySQL Explain
Explaining the MySQL ExplainExplaining the MySQL Explain
Explaining the MySQL ExplainMYXPLAIN
 
Oracle Database Advanced Querying
Oracle Database Advanced QueryingOracle Database Advanced Querying
Oracle Database Advanced QueryingZohar Elkayam
 
New features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionNew features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionSveta Smirnova
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 MinutesSveta Smirnova
 

Was ist angesagt? (20)

Preparse Query Rewrite Plugins
Preparse Query Rewrite PluginsPreparse Query Rewrite Plugins
Preparse Query Rewrite Plugins
 
0888 learning-mysql
0888 learning-mysql0888 learning-mysql
0888 learning-mysql
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]s
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
Introduction into MySQL Query Tuning
Introduction into MySQL Query TuningIntroduction into MySQL Query Tuning
Introduction into MySQL Query Tuning
 
Performance Schema in Action: demo
Performance Schema in Action: demoPerformance Schema in Action: demo
Performance Schema in Action: demo
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101
 
Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Execution
 
Troubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-onsTroubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-ons
 
Why Use EXPLAIN FORMAT=JSON?
 Why Use EXPLAIN FORMAT=JSON?  Why Use EXPLAIN FORMAT=JSON?
Why Use EXPLAIN FORMAT=JSON?
 
Highload Perf Tuning
Highload Perf TuningHighload Perf Tuning
Highload Perf Tuning
 
Optimizer overviewoow2014
Optimizer overviewoow2014Optimizer overviewoow2014
Optimizer overviewoow2014
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQL
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZE
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
 
Explaining the MySQL Explain
Explaining the MySQL ExplainExplaining the MySQL Explain
Explaining the MySQL Explain
 
Oracle Database Advanced Querying
Oracle Database Advanced QueryingOracle Database Advanced Querying
Oracle Database Advanced Querying
 
New features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionNew features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in action
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 Minutes
 

Optimizer Histograms: When they Help and When Do Not?

  • 1. Optimizer Histograms When they Help and When Do Not? February, 01, 2019 Sveta Smirnova
  • 2. Sveta Smirnova
      • MySQL Support engineer
      • Author of
          • MySQL Troubleshooting
          • JSON UDF functions
          • FILTER clause for MySQL
      • Speaker
          • Percona Live, OOW, Fosdem, DevConf, HighLoad...
  • 3. Table of Contents
      • Why do I Care?
      • The Use Case
      • Even Worse Use Case
      • Why the Difference?
      • How Histograms Work?
  • 4. Optimizer Statistics aka Histograms
      "The column_statistics data dictionary table stores histogram statistics about column values, for use by the optimizer in constructing query execution plans" (MySQL User Reference Manual)
  • 5. Why do I Care?
  • 6. Latest Support Tickets
      • Data distribution varies
          • Big difference between number of values
          • Constantly changing
  • 7. Latest Support Tickets
      • Data distribution varies
      • Cardinality is not correct
          • Was not updated in time
          • Updates too often
          • Calculated wrongly
  • 8. Latest Support Tickets
      • Data distribution varies
      • Cardinality is not correct
      • Index maintenance costs a lot
          • Hardware resources
          • Slow updates
          • Window to run CREATE INDEX
  • 9. Latest Support Tickets
      • Data distribution varies
      • Cardinality is not correct
      • Index maintenance costs a lot
      • Optimizer does not work as we wish it to
      Examples in my talk @Percona Live
  • 10. Disclaimer
      • Topic based on real Support cases
          • A couple of them are still in progress
  • 11. Disclaimer
      • Topic based on real Support cases
      • All examples are 100% fake
          • They are created such that
              • No customer can be identified
              • Everything is generated: table names, column names, data
          • Use case itself is fictional
  • 12. Disclaimer
      • Topic based on real Support cases
      • All examples are 100% fake
      • All examples are simplified
          • Only columns required to show the issue
          • Everything extra removed
          • Real tables usually store much more data
  • 13. Disclaimer
      • Topic based on real Support cases
      • All examples are 100% fake
      • All examples are simplified
      • All disasters happened with version 5.7
  • 15. Two tables
      • categories
          • Less than 20 rows
  • 16. Two tables
      • categories
          • Less than 20 rows
      • goods
          • More than 1M rows
          • 20 unique cat_id values
          • Many other fields
              • Price
              • Date: added, last updated, etc.
              • Characteristics
              • Store
              • ...
  • 17. JOIN
      select * from goods
      join categories on (categories.id=goods.cat_id)
      where date_added between '2018-07-01' and '2018-08-01'
      and cat_id in (16,11)
      and price >= 1000 and price <= 10000
      [ and ... ]
      [ GROUP BY ... [ORDER BY ... [ LIMIT ...]]]
      ;
  • 18. Option 1: Select from the Small Table First
      • Select from the small table
  • 19. Option 1: Select from the Small Table First
      • Select from the small table
      • For each cat_id select from the large table
  • 20. Option 1: Select from the Small Table First
      • Select from the small table
      • For each cat_id select from the large table
      • Filter result on date_added[ and price[...]]
  • 21. Option 1: Select from the Small Table First
      • Select from the small table
      • For each cat_id select from the large table
      • Filter result on date_added[ and price[...]]
      • Slow with many items in the category
  • 22. Option 2: Select from the Large Table First
      • Filter rows by date_added[ and price[...]]
  • 23. Option 2: Select from the Large Table First
      • Filter rows by date_added[ and price[...]]
      • Get cat_id values
  • 24. Option 2: Select from the Large Table First
      • Filter rows by date_added[ and price[...]]
      • Get cat_id values
      • Retrieve rows from the small table
  • 25. Option 2: Select from the Large Table First
      • Filter rows by date_added[ and price[...]]
      • Get cat_id values
      • Retrieve rows from the small table
      • Slow if the number of rows filtered by date_added is larger than the number of goods in the selected categories
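The trade-off between the two options can be sketched with a toy row-count model in Python. This is only an illustration under stated assumptions: the function names and every number below are hypothetical, and the model counts rows read, which is far simpler than MySQL's real cost model.

```python
# Toy model: rows the storage engine must read under each join order.
# All names and numbers are hypothetical; this is not MySQL's cost model.

def rows_read_small_table_first(goods_per_category, selected_categories):
    """Option 1: for each selected category, read all of its goods through
    the cat_id index, then filter them by date_added (and price)."""
    return sum(goods_per_category[c] for c in selected_categories)


def rows_read_large_table_first(rows_in_date_range):
    """Option 2: read the goods rows in the date range first, then check
    cat_id and look up the matching category for each one."""
    return rows_in_date_range


goods_per_category = {11: 400_000, 16: 300_000, 3: 50}  # hypothetical sizes
narrow_range = 2_000    # hypothetical: goods added within one month
wide_range = 900_000    # hypothetical: goods added over many years

option1 = rows_read_small_table_first(goods_per_category, [11, 16])
for label, in_range in (("narrow", narrow_range), ("wide", wide_range)):
    option2 = rows_read_large_table_first(in_range)
    better = "large table first" if option2 < option1 else "small table first"
    print(f"{label} date range: option 1 reads {option1}, "
          f"option 2 reads {option2} -> {better}")
```

Neither order wins in general: the better choice depends on how many rows each condition really selects, which is exactly what the optimizer has to estimate.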
  • 26. What if We Use Combined Indexes?
      • CREATE INDEX index_everything (cat_id, date_added[, price[, ...]])
      • It resolves the issue
  • 27. What if We Use Combined Indexes?
      • CREATE INDEX index_everything (cat_id, date_added[, price[, ...]])
      • It resolves the issue
      • But not in all cases
  • 28. The Problem
      • Maintenance cost
          • Slower INSERT/UPDATE/DELETE
          • Disk space
  • 29. The Problem
      • Maintenance cost
          • Slower INSERT/UPDATE/DELETE
          • Disk space
      • Index not useful for selecting rows
      JOIN categories ON (categories.id=goods.cat_id)
      JOIN shops ON (shops.id=goods.shop_id)
      [ JOIN ... ]
      WHERE date_added between '2018-07-01' and '2018-08-01'
      AND cat_id in (16,11)
      AND price >= 1000 AND price <= 10000
      [ AND ... ]
      GROUP BY product_type
      ORDER BY date_updated DESC
      LIMIT 50,100
  • 30. The Problem
      • Maintenance cost
          • Slower INSERT/UPDATE/DELETE
          • Disk space
      • Index not useful for selecting rows
      • Tables may have wrong cardinality
  • 31. Example: EXPLAIN without histograms
      mysql> explain select goods.* from goods
          -> join categories on (categories.id=goods.cat_id)
          -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
          -> and
          -> date_added between '2000-01-01' and '2001-01-01' -- Large range
          -> order by goods.cat_id
          -> limit 10\G -- We ask for 10 rows only!
  • 32. Example: EXPLAIN without histograms
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: categories -- Small table first
      partitions: NULL
      type: index
      possible_keys: PRIMARY
      key: PRIMARY
      key_len: 4
      ref: NULL
      rows: 20
      filtered: 70.00
      Extra: Using where; Using index; Using temporary; Using filesort
  • 33. Example: EXPLAIN without histograms
      *************************** 2. row ***************************
      id: 1
      select_type: SIMPLE
      table: goods -- Large table
      partitions: NULL
      type: ref
      possible_keys: cat_id_2
      key: cat_id_2
      key_len: 5
      ref: orig.categories.id
      rows: 51827
      filtered: 11.11 -- Default value
      Extra: Using where
      2 rows in set, 1 warning (0.01 sec)
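To see what this plan implies, the rows and filtered columns can be multiplied through, since for each table rows × filtered estimates how many rows are passed on to the next join step. The short Python sketch below only redoes that arithmetic with the values from the two EXPLAIN rows; the variable names are made up for illustration.

```python
# Values copied from the two EXPLAIN rows (plan without histograms).
categories_rows, categories_filtered = 20, 70.00
goods_rows, goods_filtered = 51827, 11.11   # 11.11 is the default guess

# Rows from categories expected to survive its WHERE condition.
categories_out = categories_rows * categories_filtered / 100

# For each of those rows the optimizer expects to read goods_rows entries
# from the cat_id_2 index.
goods_examined = categories_out * goods_rows

# Share of the examined goods rows expected to pass the date_added filter.
goods_out = goods_examined * goods_filtered / 100

print(f"categories rows kept: {categories_out:.0f}")
print(f"goods rows examined:  {goods_examined:.0f}")
print(f"goods rows expected:  {goods_out:.0f}")
```

The optimizer therefore plans to examine several hundred thousand rows from goods, while assuming that only about one in nine of them matches the date range.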
  • 34. Example: Execution time without histograms
      mysql> flush status;
      Query OK, 0 rows affected (0.00 sec)
      mysql> select goods.* from goods
          -> join categories on (categories.id=goods.cat_id)
          -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
          -> and
          -> date_added between '2000-01-01' and '2001-01-01'
          -> order by goods.cat_id
          -> limit 10;
      ab9f9bb7bc4f357712ec34f067eda364 -
      10 rows in set (56.47 sec)
  • 35. Example: Engine statistics without histograms
      mysql> show status like 'Handler%';
      +----------------------------+--------+
      | Variable_name              | Value  |
      +----------------------------+--------+
      ...
      | Handler_read_next          | 964718 |
      | Handler_read_prev          | 0      |
      | Handler_read_rnd           | 10     |
      | Handler_read_rnd_next      | 951671 |
      ...
      | Handler_write              | 951670 |
      +----------------------------+--------+
      18 rows in set (0.01 sec)
  • 36. Example: Now let's add the histogram
      mysql> analyze table goods update histogram on date_added;
      +------------+-----------+----------+-------------------------------------------------------+
      | Table      | Op        | Msg_type | Msg_text                                              |
      +------------+-----------+----------+-------------------------------------------------------+
      | orig.goods | histogram | status   | Histogram statistics created for column 'date_added'. |
      +------------+-----------+----------+-------------------------------------------------------+
      1 row in set (2.01 sec)
  • 37. • EXPLAIN with the histogram mysql> explain select goods.* from goods -> join categories -> on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between ’2000-01-01’ and ’2001-01-01’ -> order by goods.cat_id -> limit 10G Example 15
  • 38. • EXPLAIN with the histogram *************************** 1. row *************************** id: 1 select_type: SIMPLE table: goods -- Large table first partitions: NULL type: index possible_keys: cat_id_2 key: cat_id_2 key_len: 5 ref: NULL rows: 10 -- Same as we asked filtered: 98.70 -- True numbers Extra: Using where Example 15
  • 39. • EXPLAIN with the histogram *************************** 2. row *************************** id: 1 select_type: SIMPLE table: categories -- Small table partitions: NULL type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: orig.goods.cat_id rows: 1 filtered: 100.00 Extra: Using index 2 rows in set, 1 warning (0.01 sec) Example 15
  • 40. • Execution time with the histogram mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select goods.* from goods -> join categories on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between '2000-01-01' and '2001-01-01' -> order by goods.cat_id -> limit 10; eeb005fae0dd3441c5c380e1d87fee84 - 10 rows in set (0.00 sec) -- 56/0 times faster! Example 15
  • 41. • Engine statistics with the histogram mysql> show status like 'Handler%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Handler_commit | 1 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_external_lock | 4 | | Handler_mrr_init | 0 | | Handler_prepare | 0 | | Handler_read_first | 1 | | Handler_read_key | 3 | | Handler_read_last | 0 | | Handler_read_next | 9 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | | Handler_rollback | 0 | | Handler_savepoint | 0 | | Handler_savepoint_rollback | 0 | | Handler_update | 0 | | Handler_write | 0 | +----------------------------+-------+ 18 rows in set (0.00 sec) Example 15
  • 43. • goods_characteristics CREATE TABLE `goods_characteristics` ( `id` int(11) NOT NULL AUTO_INCREMENT, `good_id` varchar(30) DEFAULT NULL, `size` int(11) DEFAULT NULL, `manufacturer` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`), KEY `good_id` (`good_id`,`size`,`manufacturer`), KEY `size` (`size`,`manufacturer`) ) ENGINE=InnoDB AUTO_INCREMENT=196606 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci Two Similar Tables 17
  • 44. • goods_shops CREATE TABLE `goods_shops` ( `id` int(11) NOT NULL AUTO_INCREMENT, `good_id` varchar(30) DEFAULT NULL, `location` varchar(30) DEFAULT NULL, `delivery_options` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`), KEY `good_id` (`good_id`,`location`,`delivery_options`), KEY `location` (`location`,`delivery_options`) ) ENGINE=InnoDB AUTO_INCREMENT=131071 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci Two Similar Tables 17
  • 45. • Size mysql> select count(*) from goods_characteristics; +----------+ | count(*) | +----------+ | 131072 | +----------+ 1 row in set (0.08 sec) mysql> select count(*) from goods_shops; +----------+ | count(*) | +----------+ | 65536 | +----------+ 1 row in set (0.04 sec) Two Similar Tables 17
  • 46. • Data Distribution: goods_characteristics mysql> select count(*) num_rows, good_id, size -> from goods_characteristics group by good_id, size; +----------+---------+------+ | num_rows | good_id | size | +----------+---------+------+ | 65536 | laptop | 7 | | 8187 | laptop | 8 | | 8190 | laptop | 9 | | 8188 | laptop | 10 | | 8192 | laptop | 11 | | 8189 | laptop | 12 | | 8189 | laptop | 13 | | 8191 | laptop | 14 | | 8190 | laptop | 15 | | 10 | laptop | 16 | | 10 | laptop | 17 | +----------+---------+------+ Two Similar Tables 17
  • 47. • Data Distribution: goods_characteristics mysql> select count(*) num_rows, good_id, manufacturer -> from goods_characteristics group by good_id, manufacturer order by num_rows desc; +----------+---------+--------------+ | num_rows | good_id | manufacturer | +----------+---------+--------------+ | 65536 | laptop | Noname | | 8191 | laptop | Samsung | | 8191 | laptop | Acer | | 8189 | laptop | Dell | | 8189 | laptop | HP | | 8189 | laptop | Lenovo | | 8189 | laptop | Toshiba | | 8189 | laptop | Apple | | 8189 | laptop | Asus | | 10 | laptop | Sony | | 10 | laptop | Casper | +----------+---------+--------------+ Two Similar Tables 17
  • 48. • Data Distribution: goods_shops mysql> select count(*) num_rows, good_id, location -> from goods_shops group by good_id, location order by num_rows desc; +----------+---------+---------------+ | num_rows | good_id | location | +----------+---------+---------------+ | 8191 | laptop | New York | | 8191 | laptop | San Francisco | | 8189 | laptop | Paris | | 8189 | laptop | Berlin | | 8189 | laptop | Brussels | | 8189 | laptop | Tokio | | 8189 | laptop | Istanbul | | 8189 | laptop | London | | 10 | laptop | Moscow | | 10 | laptop | Kiev | +----------+---------+---------------+ Two Similar Tables 17
  • 49. • Data Distribution: goods_shops mysql> select count(*) num_rows, good_id, delivery_options -> from goods_shops group by good_id, delivery_options order by num_rows desc; +----------+---------+------------------+ | num_rows | good_id | delivery_options | +----------+---------+------------------+ | 8192 | laptop | DHL | | 8191 | laptop | PTT | | 8190 | laptop | Normal Post | | 8190 | laptop | Tracked | | 8189 | laptop | Fedex | | 8189 | laptop | Gruzovichkof | | 8188 | laptop | Courier | | 8187 | laptop | No delivery | | 10 | laptop | Premium | | 10 | laptop | Urgent | +----------+---------+------------------+ Two Similar Tables 17
  • 50. Histogram statistics are useful primarily for nonindexed columns. Adding an index to a column for which histogram statistics are applicable might also help the optimizer make row estimates. The tradeoffs are: An index must be updated when table data is modified. A histogram is created or updated only on demand, so it adds no overhead when table data is modified. On the other hand, the statistics become progressively more out of date when table modifications occur, until the next time they are updated. MySQL User Reference Manual Optimizer Statistics aka Histograms 18
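The staleness tradeoff in the manual quote can be illustrated with a small sketch. It is not MySQL code: the table is a plain Python list, the column values and row counts are made up, and `fraction_equal` is a hypothetical helper that stands in for "the fraction the histogram recorded when it was built".

```python
def fraction_equal(rows, value):
    """Fraction of rows whose column equals `value`."""
    return rows.count(value) / len(rows)


# Column contents at the moment the histogram is built (made-up data).
snapshot = ["Paris"] * 90 + ["Moscow"] * 10

# The histogram stores this fraction and keeps it until the next update.
stored_estimate = fraction_equal(snapshot, "Moscow")

# The table is modified afterwards; nothing refreshes the stored fraction.
current = snapshot + ["Moscow"] * 100
actual_fraction = fraction_equal(current, "Moscow")

print(f"stored estimate: {stored_estimate:.2f}, actual now: {actual_fraction:.2f}")
```

An index would have been maintained on every one of those 100 inserts; the histogram cost nothing at write time, but its estimate is now far from the real distribution until it is rebuilt.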
  • 51. mysql> alter table goods_characteristics stats_sample_pages=5000; Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table goods_shops stats_sample_pages=5000; Query OK, 0 rows affected (0.05 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> analyze table goods_characteristics, goods_shops; +----------------------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +----------------------------+---------+----------+----------+ | test.goods_characteristics | analyze | status | OK | | test.goods_shops | analyze | status | OK | +----------------------------+---------+----------+----------+ 2 rows in set (0.35 sec) Index Statistics is More than Good 19
  • 52. • The query mysql> select count(*) from goods_shops join goods_characteristics using (good_id) -> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer') -> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent')); ^C^C -- query aborted ERROR 1317 (70100): Query execution was interrupted Performance? 20
  • 53. • Handlers mysql> show status like ’Handler%’; +----------------------------+-------------+ | Variable_name | Value | +----------------------------+-------------+ | Handler_commit | 0 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_external_lock | 4 | | Handler_mrr_init | 0 | | Handler_prepare | 0 | | Handler_read_first | 1 | | Handler_read_key | 13043 | | Handler_read_last | 0 | | Handler_read_next | 854,767,916 | ... Performance? 20
  • 54. • Table order mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id) -> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer') -> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent')); +----+-----------------------+-------+---------+--------+----------+--------------------------+ | id | table | type | key | rows | filtered | Extra | +----+-----------------------+-------+---------+--------+----------+--------------------------+ | 1 | goods_characteristics | index | good_id | 131072 | 25.00 | Using where; Using index | | 1 | goods_shops | ref | good_id | 65536 | 36.00 | Using where; Using index | +----+-----------------------+-------+---------+--------+----------+--------------------------+ 2 rows in set, 1 warning (0.00 sec) Performance? 20
  • 55. • Table order matters mysql> explain select count(*) from goods_shops straight_join goods_characteristics -> using (good_id) -> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer') -> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent')); +----+-----------------------+-------+---------+--------+----------+--------------------------+ | id | table | type | key | rows | filtered | Extra | +----+-----------------------+-------+---------+--------+----------+--------------------------+ | 1 | goods_shops | index | good_id | 65536 | 36.00 | Using where; Using index | | 1 | goods_characteristics | ref | good_id | 131072 | 25.00 | Using where; Using index | +----+-----------------------+-------+---------+--------+----------+--------------------------+ 2 rows in set, 1 warning (0.00 sec) Performance? 20
  • 56. • Table order matters mysql> select count(*) from goods_shops straight_join goods_characteristics using (good_id) -> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer') -> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent')); +----------+ | count(*) | +----------+ | 816640 | +----------+ 1 row in set (2.11 sec) mysql> show status like 'Handler_read_next'; +-------------------+-----------+ | Variable_name | Value | +-------------------+-----------+ | Handler_read_next | 5,308,416 | +-------------------+-----------+ 1 row in set (0.00 sec) Performance? 20
  • 57. mysql> analyze table goods_shops update histogram on location, delivery_options; +-------------+-----------+----------+-----------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +-------------+-----------+----------+-----------------------------------------------------+ | goods_shops | histogram | status | Histogram statistics created... 'delivery_options'. | | goods_shops | histogram | status | Histogram statistics created for column 'location'. | +-------------+-----------+----------+-----------------------------------------------------+ 2 rows in set (0.18 sec) mysql> analyze table goods_characteristics update histogram on size, manufacturer ; +-----------------------+-----------+----------+-------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------------+-----------+----------+-------------------------------------------------+ | goods_characteristics | histogram | status | Histogram statistics created... 'manufacturer'. | | goods_characteristics | histogram | status | Histogram statistics created for column 'size'. | +-----------------------+-----------+----------+-------------------------------------------------+ 2 rows in set (0.23 sec) Histograms to Rescue 21
  • 58. • The query mysql> select count(*) from goods_shops join goods_characteristics using (good_id) -> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer') -> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent')); +----------+ | count(*) | +----------+ | 816640 | +----------+ 1 row in set (2.16 sec) mysql> show status like 'Handler_read_next'; +-------------------+-----------+ | Variable_name | Value | +-------------------+-----------+ | Handler_read_next | 5,308,418 | +-------------------+-----------+ 1 row in set (0.00 sec) Histograms to Rescue 21
  • 59. • Filtering effect mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id) where s +----+-----------------------+-------+---------+--------+----------+--------------------------+ | id | table | type | key | rows | filtered | Extra | +----+-----------------------+-------+---------+--------+----------+--------------------------+ | 1 | goods_shops | index | good_id | 65536 | 0.06 | Using where; Using index | | 1 | goods_characteristics | ref | good_id | 131072 | 15.63 | Using where; Using index | +----+-----------------------+-------+---------+--------+----------+--------------------------+ 2 rows in set, 1 warning (0.00 sec) Histograms to Rescue 21
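To see why the new `filtered` values change the join order, it may help to multiply them out. The sketch below is a simplification rather than the optimizer's real cost model: it treats `rows * filtered / 100` as the number of rows a table hands to the next one, and `rows_after_filter` is a hypothetical helper name. The last part checks the 0.06 figure against the data distribution slides, under the added assumption that `location` and `delivery_options` are independent.

```python
def rows_after_filter(rows, filtered_percent):
    """Rows expected to survive the WHERE conditions on one table."""
    return rows * filtered_percent / 100.0


# (rows, filtered) pairs copied from the EXPLAIN output on the slides.
without_histogram = {
    "goods_characteristics": (131072, 25.00),
    "goods_shops": (65536, 36.00),
}
with_histogram = {
    "goods_shops": (65536, 0.06),
    "goods_characteristics": (131072, 15.63),
}

for label, plan in (("without", without_histogram), ("with", with_histogram)):
    for table, (rows, filtered) in plan.items():
        survivors = rows_after_filter(rows, filtered)
        print(f"{label} histogram: {table} passes on about {survivors:.0f} rows")

# Cross-check against the distribution slides: 20 rows match the location
# list and 20 rows match the delivery_options list, out of 65536.
p_location = 20 / 65536
p_delivery = 20 / 65536
# Assumption: the two conditions are independent, so OR combines as below.
p_either = p_location + p_delivery - p_location * p_delivery
print(f"expected filtered for goods_shops: {p_either * 100:.2f}%")
```

With the histogram, goods_shops is expected to pass on about 39 rows instead of more than 23,000, which is consistent with the optimizer now choosing it as the first table.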
  • 61. [Chart with axis ticks 1 to 10 and 0 to 800] Indexes: Number of Items with Same Value 23
  • 62. [Chart with axis ticks 1 to 10 and 0 to 800] Indexes: Cardinality 24
  • 63. [Chart with axis ticks 1 to 10 and 0 to 800] Histograms: Number of Values in Each Bucket 25
  • 64. [Chart with axis ticks 1 to 10 and 0 to 1] Histograms: Data in the Histogram 26
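The contrast these four charts draw can be worked through with the `location` counts from the goods_shops distribution slide. This is a rough sketch, assuming that an estimate based on cardinality alone amounts to "total rows divided by number of distinct values"; the real optimizer has more inputs than that, so treat the numbers as an illustration only.

```python
# Row counts per location, copied from the goods_shops distribution slide.
location_counts = {
    "New York": 8191, "San Francisco": 8191, "Paris": 8189, "Berlin": 8189,
    "Brussels": 8189, "Tokio": 8189, "Istanbul": 8189, "London": 8189,
    "Moscow": 10, "Kiev": 10,
}

total_rows = sum(location_counts.values())
cardinality = len(location_counts)  # number of distinct values

# Cardinality alone gives the same expectation for every value.
average_rows_per_value = total_rows / cardinality

# A per-value frequency (what a histogram stores) keeps the skew visible.
frequencies = {name: count / total_rows for name, count in location_counts.items()}

for name in ("New York", "Moscow"):
    print(
        f"{name}: average-based estimate {average_rows_per_value:.1f} rows, "
        f"frequency-based estimate {frequencies[name] * total_rows:.0f} rows"
    )
```

For Moscow the average-based figure is off by a factor of several hundred, which is the kind of skew the earlier support-ticket slides describe.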
  • 70. ↓ sql/sql_planner.cc ↓ calculate_condition_filter ↓ Item_func_*::get_filtering_effect • get_histogram_selectivity • Seen as a percent of filtered rows in EXPLAIN Low Level 28
  • 71. • Example data mysql> create table example(f1 int) engine=innodb; mysql> insert into example values(1),(1),(1),(2),(3); mysql> select f1, count(f1) from example group by f1; +------+-----------+ | f1 | count(f1) | +------+-----------+ | 1 | 3 | | 2 | 1 | | 3 | 1 | +------+-----------+ 3 rows in set (0.00 sec) Filtered Rows 29
  • 72. • Without a histogram mysql> explain select * from example where f1 > 0\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 73. • Without a histogram mysql> explain select * from example where f1 > 1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 74. • Without a histogram mysql> explain select * from example where f1 > 2\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 75. • Without a histogram mysql> explain select * from example where f1 > 3\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 76. • With the histogram mysql> analyze table example update histogram on f1 with 3 buckets; +-----------------+-----------+----------+------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------+-----------+----------+------------------------------+ | hist_ex.example | histogram | status | Histogram statistics created for column 'f1'. | +-----------------+-----------+----------+------------------------------+ 1 row in set (0.03 sec) Filtered Rows 29
  • 77. • With the histogram mysql> select * from information_schema.column_statistics -> where table_name='example'\G *************************** 1. row *************************** SCHEMA_NAME: hist_ex TABLE_NAME: example COLUMN_NAME: f1 HISTOGRAM: {"buckets": [[1, 0.6], [2, 0.8], [3, 1.0]], "data-type": "int", "null-values": 0.0, "collation-id": 8, "last-updated": "2018-11-07 09:07:19.791470", "sampling-rate": 1.0, "histogram-type": "singleton", "number-of-buckets-specified": 3} 1 row in set (0.00 sec) Filtered Rows 29
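The `buckets` array above can be reproduced by hand from the five example rows. The sketch below is not how the server builds histograms internally; it only assumes that each singleton bucket pairs a distinct value with the cumulative fraction of rows up to and including that value, which is what the numbers on the slide suggest. `singleton_buckets` is a hypothetical helper name.

```python
from collections import Counter


def singleton_buckets(values):
    """Pair each distinct value with its cumulative fraction of all rows."""
    counts = Counter(values)
    total = len(values)
    running = 0
    buckets = []
    for value in sorted(counts):
        running += counts[value]
        buckets.append([value, running / total])
    return buckets


# Same rows as: insert into example values(1),(1),(1),(2),(3);
example_rows = [1, 1, 1, 2, 3]
buckets = singleton_buckets(example_rows)
print(buckets)
```

Three of five rows hold the value 1, so the first bucket ends at 0.6; adding the single row with 2 brings the running fraction to 0.8, and the last bucket closes at 1.0.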
  • 78. • With the histogram mysql> explain select * from example where f1 > 0\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 100.00 -- all rows Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 79. • With the histogram mysql> explain select * from example where f1 > 1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 40.00 -- 2 rows Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 80. • With the histogram mysql> explain select * from example where f1 > 2\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 -- one row Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 81. • With the histogram mysql> explain select * from example where f1 > 3\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 -- one row Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
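The four `filtered` values with the histogram can be derived from the bucket list shown on the column_statistics slide. This is a sketch under two assumptions that are not stated on the slides: the fraction for `f1 > bound` is one minus the cumulative fraction of the last bucket not above the bound, and the displayed value is never allowed to fall below one row's worth of the table (which would explain 20.00 for `f1 > 3`, where no row actually matches). The function names are hypothetical.

```python
# Buckets as shown in information_schema.column_statistics on the slide.
buckets = [[1, 0.6], [2, 0.8], [3, 1.0]]
TABLE_ROWS = 5


def fraction_greater_than(buckets, bound):
    """Fraction of rows with a value strictly greater than `bound`."""
    cumulative = 0.0
    for value, cumulative_fraction in buckets:
        if value > bound:
            break
        cumulative = cumulative_fraction
    return 1.0 - cumulative


def filtered_percent(buckets, bound, table_rows):
    """Value comparable to the `filtered` column in EXPLAIN."""
    fraction = fraction_greater_than(buckets, bound)
    # Assumption: the estimate is kept at or above one row of the table.
    fraction = max(fraction, 1.0 / table_rows)
    return round(fraction * 100, 2)


results = [filtered_percent(buckets, bound, TABLE_ROWS) for bound in (0, 1, 2, 3)]
print(results)
```

Without a histogram the slides show the same 33.33 for every bound, because the optimizer has no per-value information and falls back to a fixed guess for the inequality.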
  • 83. • CREATE INDEX • Metadata lock • Can be blocked by any query • UPDATE HISTOGRAM • Backup lock • Can be blocked only by a backup • Can be created any time without fear Locking 30
  • 84. • Helps if query plan can be changed • Not a replacement for the index: • GROUP BY • ORDER BY • Query on a single table ∗ Outcome 31
  • 85. • Data distribution is uniform • Range optimization can be used • Full table scan is fast When Are Histograms Not Helpful? 32
  • 86. • Index statistics are collected by the engine • The optimizer calculates cardinality each time it accesses the statistics • Indexes do not always improve performance • Histograms can help • Still a new feature • Histograms do not replace other optimizations! Conclusion 33
  • 87. • MySQL User Reference Manual • Blog by Erik Froseth • Blog by Frederic Descamps • Talk by Oystein Grovlen @Fosdem • Talk by Sergei Petrunia @PerconaLive • WL #8707 More information 34