SlideShare ist ein Scribd-Unternehmen logo
1 von 58
Downloaden Sie, um offline zu lesen
MySQL best practices
Design, use and optimization
Iván López ivan@trovit.com 2014
1
Table design
Database engine (table format)
2
InnoDB table format
• Use InnoDB by default
• InnoDB vs MyISAM pros:
• Row locking, allows concurrent writes
• ACID
• Non-blocking backups
• Better data recovery after a crash
• InnoDB cons:
• Lack of instant row count
3
What about MyISAM?
• Never use MyISAM for concurrent applications in
real time
• Possible use cases for MyISAM:
• No writes
• Batch writes only
• Mostly full scan selects
4
Table design
Segmentation and specialisation
5
Table segmentation
• For very big tables or tables that will grow forever,
find some criteria to segment data into smaller
tables
• For instance:
• one table per month for log recording
• one table per country for users, so you can
shard them and put them on different servers
closer to the users
6
Table specialisation
• Don’t use the same table for heterogeneous
records that don’t share the same fields, it will
increase table size and affect performance when
using indexes
• For instance:
• In a database for classified ads for homes,
cars and jobs, use one table for each type,
because they don’t share fields like number of
rooms, engine power or salary.
7
Table design
Primary Key in InnoDB
8
Clustered index
1 1 2 A B … … …
Primary key
1 1 1 A A … … …
1 1 1A A
Row data
Primary keySecondary key 1
1 2 2A A
1 1 2A B
1 2 1 A B … … …
9
1 2 2 A A … … …
1 2 1A B
Clustered index
• You must always define a primary key, if there’s no
natural PK for the table, define an auto incremental PK
• Records are physically ordered in table by the PK
• Accessing a row using the PK is the fastest way,
because the row data is on the same page where the
index search leads
• Don’t include fields in the PK that could be modified
after insertion, it will delete and insert the record again at
the right position if you update the PK, affecting
performance
10
Secondary indexes
• All indexes other than the clustered index are known as secondary
indexes
• Each record in a secondary index contains the primary key columns
for the row, as well as the columns specified for the secondary index
• InnoDB uses this primary key value to search for the row in the
clustered index
• You could take profit from this design for paging by PK in selects
that use a secondary index
• If the primary key is long, the secondary indexes use more space,
so it is better to have a small auto increment PK and define a
unique key with the fields that would be the natural PK
11
Long Primary Key
CREATE  TABLE  `jobs_tbl_trovit_stats`  (  
    `s_query`  varchar(255)  NOT  NULL,  
    `d_date`  date  NOT  NULL,  
    `fk_i_id_tbl_type_dates`  int(10)  unsigned  NOT  NULL,  
    `i_num_ads`  int(10)  DEFAULT  NULL,  
    `i_num_ads_salary`  int(10)  DEFAULT  NULL,  
    `i_num_ads_region`  int(10)  DEFAULT  NULL,  
    `i_num_ads_city`  int(10)  DEFAULT  NULL,  
    `i_num_ads_company`  int(10)  DEFAULT  NULL,  
    `i_num_ads_experience`  int(10)  DEFAULT  NULL,  
    `i_num_ads_contract`  int(10)  DEFAULT  NULL,  
    `f_avg_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_region_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_city_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_company_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_experience_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_contract_salary`  int(10)  DEFAULT  NULL,  
    `s_similar_hash`  varchar(255)  NOT  NULL,  
    PRIMARY  KEY  (`s_query`,`fk_i_id_tbl_type_dates`,`d_date`),  
    KEY  `s_query_i_num_ads_salary_d_date`  (`s_query`,`i_num_ads_salary`,`d_date`),  
    KEY  `s_similar_hash_d_date`  (`s_similar_hash`,`d_date`),  
    KEY  `d_date`  (`d_date`)  
)  ENGINE=InnoDB  DEFAULT  CHARSET=utf8  
5MM  records  =  2600MB  disk  space
12
Auto increment Primary Key
CREATE  TABLE  `jobs_tbl_trovit_stats_NEW`  (  
    `i_id`  int(10)  unsigned  NOT  NULL  AUTO_INCREMENT,  
    `s_query`  varchar(255)  NOT  NULL,  
    `d_date`  date  NOT  NULL,  
    `fk_i_id_tbl_type_dates`  int(10)  unsigned  NOT  NULL,  
    `i_num_ads`  int(10)  DEFAULT  NULL,  
    `i_num_ads_salary`  int(10)  DEFAULT  NULL,  
    `i_num_ads_region`  int(10)  DEFAULT  NULL,  
    `i_num_ads_city`  int(10)  DEFAULT  NULL,  
    `i_num_ads_company`  int(10)  DEFAULT  NULL,  
    `i_num_ads_experience`  int(10)  DEFAULT  NULL,  
    `i_num_ads_contract`  int(10)  DEFAULT  NULL,  
    `f_avg_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_region_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_city_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_company_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_experience_salary`  int(10)  DEFAULT  NULL,  
    `f_avg_contract_salary`  int(10)  DEFAULT  NULL,  
    `s_similar_hash`  varchar(255)  NOT  NULL,  
    PRIMARY  KEY  (`i_id`),  
    UNIQUE  KEY  `unique_key`  (`s_query`,`fk_i_id_tbl_type_dates`,`d_date`),  
    KEY  `s_query_i_num_ads_salary_d_date`  (`s_query`,`i_num_ads_salary`,`d_date`),  
    KEY  `s_similar_hash_d_date`  (`s_similar_hash`,`d_date`),  
    KEY  `d_date`  (`d_date`)  
)  ENGINE=InnoDB  AUTO_INCREMENT=6462849  DEFAULT  CHARSET=utf8  
5MM  records  =  2100MB  -­‐20%  disk  space
13
Using indexes
14
Pros of indexes
1. Filter: access only the records you need, without
considering more records than necessary. Applies
to SELECT, UPDATE, REPLACE and DELETE
2. Sor/group: avoid using temporary tables
3. Cover: if all fields in a SELECT are included in the
index used (despite its order), data is retrieved
directly from this index, saving extra reads
15
Cons of indexes
1. Writes: the more indexes a table has, the slower
writes are going to be
2. Size: each index is going to increase total table
size
16
Rules for using indexes
• Only one index is used for each table in a query
• In case of using “OR” in a WHERE condition, it works as many different
queries, and each one could use its own index
• Fields use order in index is from left to right, always beginning with the first
one
• Its not mandatory to use all fields in an index, but every field used must be
consecutive
• Fields in WHERE condition go first, next GROUP BY and ORDER BY last
• Order of fields in WHERE condition doesn’t matter
• A range condition in WHERE or using GROUP BY or ORDER BY will prevent
using the next fields in the index
17
Rules for using indexes
fields use order
18
You can’t skip previous fields if you want to filter using the index for
rightmost fields
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
SELECT  *  FROM  table  
WHERE  fk_c_id_tbl_countries  =  ‘es’  
AND  d_date  =  ‘2014-­‐10-­‐10’  
AND  fk_i_id_tbl_vertical  =  1;  
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
SELECT  *  FROM  table  
WHERE  d_date  =  ‘2014-­‐10-­‐10’  
AND  fk_i_id_tbl_vertical  =  1;  
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
SELECT  *  FROM  table  
WHERE  fk_c_id_tbl_countries  =  ‘es’  
AND  fk_i_id_tbl_vertical  =  1;  
Rules for using indexes
fields order in ranges, groups and sorts
19
None of this examples could filter using the index for the field
fk_i_id_tbl_vertical
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
SELECT  *  FROM  table  
WHERE  fk_c_id_tbl_countries  =  ‘es’  
AND  d_date  <  ‘2014-­‐10-­‐10’  
AND  fk_i_id_tbl_vertical  =  1;  
SELECT  *  FROM  table  
WHERE  fk_c_id_tbl_countries  =  ‘es’  
AND  fk_i_id_tbl_vertical  =  1,  
GROUP  BY  d_date;  
SELECT  *  FROM  table  
WHERE  fk_c_id_tbl_countries  =  ‘es’  
AND  fk_i_id_tbl_vertical  =  1  
ORDER  BY  d_date;
Rules for using indexes
covering indexes and fields order
20
This query will use the covering index for all the requested fields, also uses
the index for filtering the first field, but can’t use it for filtering the next fields
because the WHERE condition is skipping the second field in the index
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical  FROM  table  
WHERE  fk_c_id_tbl_countries  =  ‘es’  
AND  fk_i_id_tbl_vertical  =  1;  
This query will use the covering index for all the requested fields, but can’t
use it for filtering because the WHERE condition is skipping the first field in
the index
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical  FROM  table  
WHERE  d_date  <  ‘2014-­‐10-­‐10’  
AND  fk_i_id_tbl_vertical  =  1;  
Rules for using indexes
covering indexes and not indexed fields
21
You can’t use covering index optimization if any of the
requested fields is not included in the index
KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  
SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical,  s_query  FROM  table  
WHERE  fk_c_id_tbl_countries  =  ‘es’  
AND  d_date  =  ‘2014-­‐10-­‐10’  
AND  fk_i_id_tbl_vertical  =  1;  
Good index design
22
Duplicated indexes
• Avoid duplicating fields in different indexes, it
will affect write performance and increase table
size
• Think about the most frequent uses of the table, so
you can design the table itself and order the fields
in indexes smartly to avoid duplicate fields
23
Promote covering indexes
• Consider adding frequently requested fields at
the end of an index, even if they aren’t used in
WHERE, GROUP BY or ORDER BY
• If the major part of queries on a table are optimised
to use use covering indexes, is the most
important performance boost you can get
24
Index cardinality
• Cardinality is how many unique values an index has
• The more cardinality, the more efficient an index is
filtering records
• MySQL maintains approximate statistics about
cardinality
• Each MySQL version gives different cardinality
values, and they can become out of date under high
write load
25
Index cardinality
26
mysql>  analyze  table  users;  
mysql>  show  index  from  users;  
+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+  
|  Table  |  Non_unique  |  Key_name  |  Seq  |  Column_name  |  Cardinality  |  
+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+  
|  users  |                    0  |  PRIMARY    |      1  |  id                    |          9728224  |  
|  users  |                    1  |  age_sex    |      1  |  age                  |                  192  |  
|  users  |                    1  |  age_sex    |      2  |  sex                  |                  406  |  
|  users  |                    1  |  sex_age    |      1  |  sex                  |                      2  |  
|  users  |                    1  |  sex_age    |      2  |  age                  |                  406  |  
|  users  |                    1  |  name          |      1  |  name                |              38149  |  
|  users  |                    1  |  active      |      1  |  active            |                      2  |  
+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+
Cardinality and distribution
27
Cardinality
version 5.5
Cardinality
version 5.6
count(distinct) Distribution
Filter
efficiency
PRIMARY 10.000.267 9.728.224 10.000.000 única optimum
name 18,868 38,149 10,001 ±0,01%  c/u very good
age 18 192 101 ±1%  c/u good
sex 18 2 2 50%  c/u fair
active 18 2 2
‘0’:    0,1%  
‘1’:  99,9%
‘0’ very good
‘1’ very bad
Optimising queries
28
Using EXPLAIN
29
EXPLAIN fields
id Id of the SELECT, if there are more than one
select_type Query type (SIMPLE, UNION, SUBQUERY…)
table Table name
type Record search strategy
possible_keys Indexes to consider
key Index used
key_len Used length of the index
ref Fields matched with the index
rows Approximate record number to consider
filtered Percentage of filtered records by WHERE
Extra Additional info
30
Search strategies
from most to less optimal
• const: just one record, by primary key or unique index
• eq_ref: just one record for each other record in a JOIN
• ref: multiple records filtering by index
• index_merge: multiple records filtering by more than one index
• range: multiple records filtering by range using index
• index: read all records in the index file (index scan)
• ALL: read all records in the data file (full scan)
Information in the “Extra” field
• Using where: records are filtered after reading them, using the
WHERE condition (an index wasn’t able to filter all of them)
• Using index: all field data is read directly from the index, without
accessing the data file (covering index)
• Using where, Using index: as with “Using where”, records are
filtered after reading them, but data comes from an index
• Using filesort: extra step to sort after filtering records, when
records were not read in order from an index, using a temporary file
• Using temporary: temporary tables are needed to complete some
steps and satisfy the query (in memory or disk)
Guide to understand EXPLAIN
• “Using index” will boost query performance (covering index),
specially with lots of results
• Avoid “Using filesort” and “Using temporary”, specially with lots
of results
• “Using where” in queries of the types “ALL” o “index” means
that is not possible to discard any record directly from the index,
and they will be filtered while reading all the records
• Watch how may records are going to be read approximately
according to the field “rows”
• Watch the effective used length of the index according to the field
“key_len”
33
Just one record by Primary Key
mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  
        -­‐>  WHERE  s_query  =  'account  executive'  
        -­‐>  AND  fk_i_id_tbl_type_dates  =  1  
        -­‐>  AND  d_date  =  '2009-­‐09-­‐28'G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats  
                  type:  const  
possible_keys:  PRIMARY,s_query_i_num_ads_salary_d_date,d_date  
                    key:  PRIMARY  
            key_len:  774  
                    ref:  const,const,const  
                  rows:  1
34
Just one record by UNIQUE KEY
mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW  
        -­‐>  WHERE  s_query  =  'account  executive'  
        -­‐>  AND  fk_i_id_tbl_type_dates  =  1  
        -­‐>  AND  d_date  =  '2009-­‐09-­‐28'G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats_NEW  
                  type:  const  
possible_keys:  unique_key,s_query_i_num_ads_salary_d_date,d_date  
                    key:  unique_key  
            key_len:  774  
                    ref:  const,const,const  
                  rows:  1
35
JOIN of just one record
mysql>  EXPLAIN  SELECT  *  FROM  tbl_users  LEFT  JOIN  tbl_countries  
        -­‐>  ON  tbl_users.fk_c_id_tbl_countries  =  tbl_countries.c_idG  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  tbl_users  
                  type:  ALL  
possible_keys:  NULL  
                    key:  NULL  
            key_len:  NULL  
                    ref:  NULL  
                  rows:  20986861  
***************************  2.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  tbl_countries  
                  type:  eq_ref  
possible_keys:  PRIMARY  
                    key:  PRIMARY  
            key_len:  6  
                    ref:  trovit_global.tbl_users.fk_c_id_tbl_countries  
                  rows:  1
36
Many records using a partial
Unique Key
mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW  
        -­‐>  WHERE  s_query  =  'account  executive'G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats_NEW  
                  type:  ref  
possible_keys:  unique_key,s_query_i_num_ads_salary_d_date  
                    key:  unique_key  
            key_len:  767  
                    ref:  const  
                  rows:  1073
37
Many records using index
mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  
        -­‐>  WHERE  d_date  =  '2012-­‐10-­‐15'G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats  
                  type:  ref  
possible_keys:  d_date  
                    key:  d_date  
            key_len:  3  
                    ref:  const  
                  rows:  10908
38
OR condition and indexes
mysql>  EXPLAIN  SELECT  a  FROM  t  
        -­‐>  WHERE  b  =  1  OR  c  =  1G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  t  
                  type:  index_merge  
possible_keys:  b,c,b_a  
                    key:  b,c  
            key_len:  4,4  
                    ref:  NULL  
                  rows:  4863128  
                Extra:  Using  union(b,c);  Using  where
39
Range query using index
mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW  
        -­‐>  WHERE  s_query  =  'account  executive’  
        -­‐>  AND  fk_i_id_tbl_type_dates  =  1  
        -­‐>  AND  d_date  <  ‘2011-­‐04-­‐25'G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_bl_trovit_stats  
                  type:  range  
possible_keys:  unique_key,s_query_i_num_ads_salary_d_date,d_date  
                    key:  unique_key  
            key_len:  774  
                    ref:  NULL  
                  rows:  539
40
Condition is not using index,
but using covering index
mysql>  EXPLAIN  SELECT  s_query  FROM  jobs_tbl_trovit_stats  
        -­‐>  WHERE  i_num_ads_salary  =  10G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats  
                  type:  index  
possible_keys:  NULL  
                    key:  s_query_i_num_ads_salary_d_date  
            key_len:  775  
                    ref:  NULL  
                  rows:  5543853  
                Extra:  Using  where;  Using  index
41
Full scan query, filter condition
doesn’t use index
mysql>  EXPLAIN  SELECT  i_num_ads  FROM  jobs_tbl_trovit_stats  
        -­‐>  WHERE  i_num_ads_salary  =  10G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats  
                  type:  ALL  
possible_keys:  NULL  
                    key:  NULL  
            key_len:  NULL  
                    ref:  NULL  
                  rows:  5543853  
                Extra:  Using  where
42
Full scan query, retrieve all
records
mysql>  EXPLAIN  SELECT  i_num_ads  FROM  jobs_tbl_trovit_statsG  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats  
                  type:  ALL  
possible_keys:  NULL  
                    key:  NULL  
            key_len:  NULL  
                    ref:  NULL  
                  rows:  5543853  
                Extra:  NULL
43
EXPLAIN and the
MySQL query optimizer
44
45
KEY  `a_b_c_d`  (`a`,`b`,`c`,`d`)  +  id  
KEY  `a_b_c_d`  (`a`,`b`,`c`,`d`)  +  id  
mysql>  EXPLAIN  SELECT  a,  b,  d  FROM  t  
        -­‐>  WHERE  a  =  1  ORDER  BY  idG  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  t  
                  type:  ref  
possible_keys:  a,a_b,a_b_c_d  
                    key:  a_b_c_d  
            key_len:  4  
                    ref:  const  
                  rows:  199704  
                Extra:  Using  where;  Using  index;  Using  filesort  
200000  rows  in  set  (0.28  sec)
The importance of “using index”
query optimizer choses wisely and favours the performance boost of
“using index” even if it’s forced to “using filesort”
46
KEY  `a`  (`a`)  +  id  
mysql>  EXPLAIN  SELECT  a,  b,  d  FROM  t  
        -­‐>  FORCE  KEY  (a)  
        -­‐>  WHERE  a  =  1  ORDER  BY  idG  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  t  
                  type:  ref  
possible_keys:  a  
                    key:  a  
            key_len:  4  
                    ref:  const  
                  rows:  199704  
                Extra:  Using  where  
200000  rows  in  set  (0.53  sec)
The importance of “using index”
if we choose to force an index we think it would be better for performance (in this case the
bad extra “using filesort” is not used), we might be mistaken and the query could be slower
47
mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  
        -­‐>  ORDER  BY  d_dateG  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  jobs_tbl_trovit_stats  
                  type:  ALL  
possible_keys:  NULL  
                    key:  NULL  
            key_len:  NULL  
                    ref:  NULL  
                  rows:  5543853  
                Extra:  Using  filesort  
5066959  rows  in  set  (3  min  27.96  sec)
But sometimes query optimizer is wrong
full scan query, it prefers to read only the data file and do a filesort, instead of using an
index to first: read keys directly in order and second: access the data file to retrieve fields
48
mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  
        -­‐>  FORCE  KEY  (d_date)  
        -­‐>  ORDER  BY  d_dateG  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  homes_tbl_my_searches  
                  type:  index  
possible_keys:  NULL  
                    key:  d_date  
            key_len:  3  
                    ref:  NULL  
                  rows:  5543853  
5066959  rows  in  set  (18.11  sec)
But sometimes query optimizer is wrong
same query, but this time is an index scan, with millions of records, is much better to force
the use of an index that satisfies the sorting of records, even if we must do a second read
from the data file to retrieve the fields for every record
Optimising queries
Best practices
49
Denormalize dates
don’t use DATE_FORMAT in WHERE, GROUP BY or ORDER BY
mysql>  EXPLAIN  SELECT  SUM(f_revenue_in_euros*f_revenue_share/100)  AS  revenue,  
        -­‐>  DATE_FORMAT(d_date,  “%Y-­‐%m”)  FROM  tbl_publishers_stats  
        -­‐>  WHERE  fk_i_id_tbl_publishers  =  ‘1658'  
        -­‐>  GROUP  BY  YEAR(d_date),  MONTH(d_date)G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  tbl_publishers_stats  
                  type:  ALL  
possible_keys:  NULL  
                    key:  NULL  
            key_len:  NULL  
                    ref:  NULL  
                  rows:  323826  
                Extra:  Using  where;  Using  temporary;  Using  filesort
50
Denormalize dates
use specific indexed fields for year, month, day, etc
mysql>  EXPLAIN  SELECT  SUM(f_revenue_in_euros*f_revenue_share/100)  AS  revenue,  
        -­‐>  i_year,  i_month  FROM  tbl_publishers_stats  
        -­‐>  WHERE  fk_i_id_tbl_publishers  =  ‘1658'  
        -­‐>  GROUP  BY  i_year,  i_monthG  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  tbl_publishers_stats_NEW  
                  type:  ref  
possible_keys:  publisher_year_month  
                    key:  publisher_year_month  
            key_len:  5  
                    ref:  const  
                  rows:  236  
                Extra:  Using  where
51
Avoid using “*”
• Avoid using “*” in the field list of a SELECT
• Put only the fields you need in order to save unneeded disk reads
• Big extra optimisation if all the fields belong to the index used in
the query (covering index)
• Exception for using “*”: count(*) must always be used instead of
count(field_name):
• To avoid causing confusion to the query optimiser
• Avoids future errors if the query is modified and the field
inside the count() function is no longer indexed
52
Page queries
• Avoid running any long executing query
• Long queries block the execution of MySQL internal
maintenance processes, like the purge history
growing several gigabytes that would never shrink
again
• Long INSERT, UPDATE and DELETE, in addition to the
above, will delay replication to slaves
• Use LIMIT in queries of any type to page in smaller
and faster blocks
Using LIMIT the right way
• LIMIT filters final results of the query, only after
WHERE, GROUP BY and ORDER BY are
processed
• Beware of GROUP BY and ORDER BY “using
filesort”, all records are going to be file sorted
before LIMIT could take effect
• Even when using an index, the LIMIT
<offset>,<row_count> syntax will run slower
and slower as the offset increments
54
Using LIMIT the right way
always avoid “using filesort”
55
mysql>  EXPLAIN  SELECT  *  FROM  homes_tbl_my_searches  
        -­‐>  ORDER  BY  s_where  
        -­‐>  LIMIT  10G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  homes_tbl_my_searches  
                  type:  ALL  
possible_keys:  NULL  
                    key:  NULL  
            key_len:  NULL  
                    ref:  NULL  
                  rows:  746475  
                Extra:  Using  filesort  
10  rows  in  set  (4.00  sec)
Using LIMIT the right way
always avoid “using filesort”
56
mysql>  EXPLAIN  SELECT  *  FROM  homes_tbl_my_searches  
        -­‐>  ORDER  BY  s_what  
        -­‐>  LIMIT  10G  
***************************  1.  row  ***************************  
                      id:  1  
    select_type:  SIMPLE  
                table:  homes_tbl_my_searches  
                  type:  index  
possible_keys:  NULL  
                    key:  what_where  
            key_len:  NULL  
                    ref:  NULL  
                  rows:  10  
10  rows  in  set  (0.00  sec)
#  Slower  as  offset  increments  
mysql>  SELECT  i_id  FROM  jobs_tbl_trovit_stats_NEW  
        -­‐>  LIMIT  5000000,100;  
100  rows  in  set  (0.92  sec)  
      #  Always  fast  
mysql>  SELECT  i_id  FROM  jobs_tbl_trovit_stats_NEW  
        -­‐>  WHERE  i_id  >  $last_id  
        -­‐>  LIMIT  100;  
100  rows  in  set  (0.00  sec)
Using LIMIT the right way
• When paging, avoid LIMIT <offset>,<row_count>
it’s better to filter using a primary or unique key
57
LIMIT for long UPDATE and DELETE
• Divide long UPDATE and DELETE queries in several
shorter executions using LIMIT
• If possible, use indexed fields to find records
58
#  Crontab  to  purge  old  records  
Do  
mysql>  DELETE  FROM  homes_tbl_my_searches  
        -­‐>  WHERE  dt_date  <  ‘2013-­‐12-­‐31’  
        -­‐>  LIMIT  1000;  
While(rows  affected  >  0)  
#  One-­‐time  UPDATE,  not  worth  to  create  an  index  just  for  this  time  
Do  
mysql>  UPDATE  homes_tbl_my_searches  SET  i_active  =  0  
        -­‐>  WHERE  i_active  !=  0  AND  s_what  =  ‘offensive  stopword’  
        -­‐>  LIMIT  1000;  
While(rows  changed  >  0)

Weitere ähnliche Inhalte

Ähnlich wie MySQL best practices at Trovit

Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
paulguerin
 
How to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinarHow to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinar
oysteing
 

Ähnlich wie MySQL best practices at Trovit (20)

MySQL Indexing : Improving Query Performance Using Index (Covering Index)
MySQL Indexing : Improving Query Performance Using Index (Covering Index)MySQL Indexing : Improving Query Performance Using Index (Covering Index)
MySQL Indexing : Improving Query Performance Using Index (Covering Index)
 
15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
 
3 indexes
3 indexes3 indexes
3 indexes
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query Optimizations
 
How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15
 
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
 
Migration from mysql to elasticsearch
Migration from mysql to elasticsearchMigration from mysql to elasticsearch
Migration from mysql to elasticsearch
 
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
 
Part5 sql tune
Part5 sql tunePart5 sql tune
Part5 sql tune
 
Don't Do This [FOSDEM 2023]
Don't Do This [FOSDEM 2023]Don't Do This [FOSDEM 2023]
Don't Do This [FOSDEM 2023]
 
MySQL Performance Optimization
MySQL Performance OptimizationMySQL Performance Optimization
MySQL Performance Optimization
 
Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
 
MySQL Performance Optimization
MySQL Performance OptimizationMySQL Performance Optimization
MySQL Performance Optimization
 
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
 
Confoo 2021 - MySQL Indexes & Histograms
Confoo 2021 - MySQL Indexes & HistogramsConfoo 2021 - MySQL Indexes & Histograms
Confoo 2021 - MySQL Indexes & Histograms
 
Presentation interpreting execution plans for sql statements
Presentation    interpreting execution plans for sql statementsPresentation    interpreting execution plans for sql statements
Presentation interpreting execution plans for sql statements
 
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory optionStar Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
 
How to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinarHow to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinar
 
MariaDB Optimizer
MariaDB OptimizerMariaDB Optimizer
MariaDB Optimizer
 

Kürzlich hochgeladen

%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Kürzlich hochgeladen (20)

%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 

MySQL best practices at Trovit

  • 1. MySQL best practices Design, use and optimization Iván López ivan@trovit.com 2014 1
  • 2. Table design Database engine (table format) 2
  • 3. InnoDB table format • Use InnoDB by default • InnoDB vs MyISAM pros: • Row locking, allows concurrent writes • ACID • Non-blocking backups • Better data recovery after a crash • InnoDB cons: • Lack of instant row count 3
  • 4. What about MyISAM? • Never use MyISAM for concurrent applications in real time • Possible use cases for MyISAM: • No writes • Batch writes only • Mostly full scan selects 4
  • 5. Table design Segmentation and specialisation 5
  • 6. Table segmentation • For very big tables or tables that will grow forever, find some criteria to segment data into smaller tables • For instance: • one table per month for log recording • one table per country for users, so you can shard them and put them on different servers closer to the users 6
  • 7. Table specialisation • Don’t use the same table for heterogeneous records that don’t share the same fields, it will increase table size and affect performance when using indexes • For instance: • In a database for classified ads for homes, cars and jobs, use one table for each type, because they don’t share fields like number of rooms, engine power or salary. 7
  • 9. Clustered index 1 1 2 A B … … … Primary key 1 1 1 A A … … … 1 1 1A A Row data Primary keySecondary key 1 1 2 2A A 1 1 2A B 1 2 1 A B … … … 9 1 2 2 A A … … … 1 2 1A B
  • 10. Clustered index • You must always define a primary key, if there’s no natural PK for the table, define an auto incremental PK • Records are physically ordered in table by the PK • Accessing a row using the PK is the fastest way, because the row data is on the same page where the index search leads • Don’t include fields in the PK that could be modified after insertion, it will delete and insert the record again at the right position if you update the PK, affecting performance 10
  • 11. Secondary indexes • All indexes other than the clustered index are known as secondary indexes • Each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index • InnoDB uses this primary key value to search for the row in the clustered index • You could take profit from this design for paging by PK in selects that use a secondary index • If the primary key is long, the secondary indexes use more space, so it is better to have a small auto increment PK and define a unique key with the fields that would be the natural PK 11
  • 12. Long Primary Key CREATE  TABLE  `jobs_tbl_trovit_stats`  (      `s_query`  varchar(255)  NOT  NULL,      `d_date`  date  NOT  NULL,      `fk_i_id_tbl_type_dates`  int(10)  unsigned  NOT  NULL,      `i_num_ads`  int(10)  DEFAULT  NULL,      `i_num_ads_salary`  int(10)  DEFAULT  NULL,      `i_num_ads_region`  int(10)  DEFAULT  NULL,      `i_num_ads_city`  int(10)  DEFAULT  NULL,      `i_num_ads_company`  int(10)  DEFAULT  NULL,      `i_num_ads_experience`  int(10)  DEFAULT  NULL,      `i_num_ads_contract`  int(10)  DEFAULT  NULL,      `f_avg_salary`  int(10)  DEFAULT  NULL,      `f_avg_region_salary`  int(10)  DEFAULT  NULL,      `f_avg_city_salary`  int(10)  DEFAULT  NULL,      `f_avg_company_salary`  int(10)  DEFAULT  NULL,      `f_avg_experience_salary`  int(10)  DEFAULT  NULL,      `f_avg_contract_salary`  int(10)  DEFAULT  NULL,      `s_similar_hash`  varchar(255)  NOT  NULL,      PRIMARY  KEY  (`s_query`,`fk_i_id_tbl_type_dates`,`d_date`),      KEY  `s_query_i_num_ads_salary_d_date`  (`s_query`,`i_num_ads_salary`,`d_date`),      KEY  `s_similar_hash_d_date`  (`s_similar_hash`,`d_date`),      KEY  `d_date`  (`d_date`)   )  ENGINE=InnoDB  DEFAULT  CHARSET=utf8   5MM  records  =  2600MB  disk  space 12
  • 13. Auto increment Primary Key CREATE  TABLE  `jobs_tbl_trovit_stats_NEW`  (      `i_id`  int(10)  unsigned  NOT  NULL  AUTO_INCREMENT,      `s_query`  varchar(255)  NOT  NULL,      `d_date`  date  NOT  NULL,      `fk_i_id_tbl_type_dates`  int(10)  unsigned  NOT  NULL,      `i_num_ads`  int(10)  DEFAULT  NULL,      `i_num_ads_salary`  int(10)  DEFAULT  NULL,      `i_num_ads_region`  int(10)  DEFAULT  NULL,      `i_num_ads_city`  int(10)  DEFAULT  NULL,      `i_num_ads_company`  int(10)  DEFAULT  NULL,      `i_num_ads_experience`  int(10)  DEFAULT  NULL,      `i_num_ads_contract`  int(10)  DEFAULT  NULL,      `f_avg_salary`  int(10)  DEFAULT  NULL,      `f_avg_region_salary`  int(10)  DEFAULT  NULL,      `f_avg_city_salary`  int(10)  DEFAULT  NULL,      `f_avg_company_salary`  int(10)  DEFAULT  NULL,      `f_avg_experience_salary`  int(10)  DEFAULT  NULL,      `f_avg_contract_salary`  int(10)  DEFAULT  NULL,      `s_similar_hash`  varchar(255)  NOT  NULL,      PRIMARY  KEY  (`i_id`),      UNIQUE  KEY  `unique_key`  (`s_query`,`fk_i_id_tbl_type_dates`,`d_date`),      KEY  `s_query_i_num_ads_salary_d_date`  (`s_query`,`i_num_ads_salary`,`d_date`),      KEY  `s_similar_hash_d_date`  (`s_similar_hash`,`d_date`),      KEY  `d_date`  (`d_date`)   )  ENGINE=InnoDB  AUTO_INCREMENT=6462849  DEFAULT  CHARSET=utf8   5MM  records  =  2100MB  -­‐20%  disk  space 13
  • 15. Pros of indexes 1. Filter: access only the records you need, without considering more records than necessary. Applies to SELECT, UPDATE, REPLACE and DELETE 2. Sor/group: avoid using temporary tables 3. Cover: if all fields in a SELECT are included in the index used (despite its order), data is retrieved directly from this index, saving extra reads 15
  • 16. Cons of indexes 1. Writes: the more indexes a table has, the slower writes are going to be 2. Size: each index is going to increase total table size 16
  • 17. Rules for using indexes • Only one index is used for each table in a query • In case of using “OR” in a WHERE condition, it works as many different queries, and each one could use its own index • Fields use order in index is from left to right, always beginning with the first one • Its not mandatory to use all fields in an index, but every field used must be consecutive • Fields in WHERE condition go first, next GROUP BY and ORDER BY last • Order of fields in WHERE condition doesn’t matter • A range condition in WHERE or using GROUP BY or ORDER BY will prevent using the next fields in the index 17
  • 18. Rules for using indexes fields use order 18 You can’t skip previous fields if you want to filter using the index for rightmost fields KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   SELECT  *  FROM  table   WHERE  fk_c_id_tbl_countries  =  ‘es’   AND  d_date  =  ‘2014-­‐10-­‐10’   AND  fk_i_id_tbl_vertical  =  1;   KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   SELECT  *  FROM  table   WHERE  d_date  =  ‘2014-­‐10-­‐10’   AND  fk_i_id_tbl_vertical  =  1;   KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   SELECT  *  FROM  table   WHERE  fk_c_id_tbl_countries  =  ‘es’   AND  fk_i_id_tbl_vertical  =  1;  
  • 19. Rules for using indexes fields order in ranges, groups and sorts 19 None of this examples could filter using the index for the field fk_i_id_tbl_vertical KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   SELECT  *  FROM  table   WHERE  fk_c_id_tbl_countries  =  ‘es’   AND  d_date  <  ‘2014-­‐10-­‐10’   AND  fk_i_id_tbl_vertical  =  1;   SELECT  *  FROM  table   WHERE  fk_c_id_tbl_countries  =  ‘es’   AND  fk_i_id_tbl_vertical  =  1,   GROUP  BY  d_date;   SELECT  *  FROM  table   WHERE  fk_c_id_tbl_countries  =  ‘es’   AND  fk_i_id_tbl_vertical  =  1   ORDER  BY  d_date;
  • 20. Rules for using indexes covering indexes and fields order 20 This query will use the covering index for all the requested fields, also uses the index for filtering the first field, but can’t use it for filtering the next fields because the WHERE condition is skipping the second field in the index KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical  FROM  table   WHERE  fk_c_id_tbl_countries  =  ‘es’   AND  fk_i_id_tbl_vertical  =  1;   This query will use the covering index for all the requested fields, but can’t use it for filtering because the WHERE condition is skipping the first field in the index KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical  FROM  table   WHERE  d_date  <  ‘2014-­‐10-­‐10’   AND  fk_i_id_tbl_vertical  =  1;  
  • 21. Rules for using indexes covering indexes and not indexed fields 21 You can’t use covering index optimization if any of the requested fields is not included in the index KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)   SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical,  s_query  FROM  table   WHERE  fk_c_id_tbl_countries  =  ‘es’   AND  d_date  =  ‘2014-­‐10-­‐10’   AND  fk_i_id_tbl_vertical  =  1;  
  • 23. Duplicated indexes • Avoid duplicating fields in different indexes, it will affect write performance and increase table size • Think about the most frequent uses of the table, so you can design the table itself and order the fields in indexes smartly to avoid duplicate fields 23
  • 24. Promote covering indexes • Consider adding frequently requested fields at the end of an index, even if they aren’t used in WHERE, GROUP BY or ORDER BY • If the major part of queries on a table are optimised to use use covering indexes, is the most important performance boost you can get 24
  • 25. Index cardinality • Cardinality is how many unique values an index has • The more cardinality, the more efficient an index is filtering records • MySQL maintains approximate statistics about cardinality • Each MySQL version gives different cardinality values, and they can become out of date under high write load 25
  • 26. Index cardinality 26 mysql>  analyze  table  users;   mysql>  show  index  from  users;   +-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+   |  Table  |  Non_unique  |  Key_name  |  Seq  |  Column_name  |  Cardinality  |   +-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+   |  users  |                    0  |  PRIMARY    |      1  |  id                    |          9728224  |   |  users  |                    1  |  age_sex    |      1  |  age                  |                  192  |   |  users  |                    1  |  age_sex    |      2  |  sex                  |                  406  |   |  users  |                    1  |  sex_age    |      1  |  sex                  |                      2  |   |  users  |                    1  |  sex_age    |      2  |  age                  |                  406  |   |  users  |                    1  |  name          |      1  |  name                |              38149  |   |  users  |                    1  |  active      |      1  |  active            |                      2  |   +-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+
  • 27. Cardinality and distribution 27 Cardinality version 5.5 Cardinality version 5.6 count(distinct) Distribution Filter efficiency PRIMARY 10.000.267 9.728.224 10.000.000 única optimum name 18,868 38,149 10,001 ±0,01%  c/u very good age 18 192 101 ±1%  c/u good sex 18 2 2 50%  c/u fair active 18 2 2 ‘0’:    0,1%   ‘1’:  99,9% ‘0’ very good ‘1’ very bad
  • 30. EXPLAIN fields id Id of the SELECT, if there are more than one select_type Query type (SIMPLE, UNION, SUBQUERY…) table Table name type Record search strategy possible_keys Indexes to consider key Index used key_len Used length of the index ref Fields matched with the index rows Approximate record number to consider filtered Percentage of filtered records by WHERE Extra Additional info 30
  • 31. Search strategies from most to less optimal • const: just one record, by primary key or unique index • eq_ref: just one record for each other record in a JOIN • ref: multiple records filtering by index • index_merge: multiple records filtering by more than one index • range: multiple records filtering by range using index • index: read all records in the index file (index scan) • ALL: read all records in the data file (full scan)
  • 32. Information in the “Extra” field • Using where: records are filtered after reading them, using the WHERE condition (an index wasn’t able to filter all of them) • Using index: all field data is read directly from the index, without accessing the data file (covering index) • Using where, Using index: as with “Using where”, records are filtered after reading them, but data comes from an index • Using filesort: extra step to sort after filtering records, when records were not read in order from an index, using a temporary file • Using temporary: temporary tables are needed to complete some steps and satisfy the query (in memory or disk)
  • 33. Guide to understand EXPLAIN • “Using index” will boost query performance (covering index), specially with lots of results • Avoid “Using filesort” and “Using temporary”, specially with lots of results • “Using where” in queries of the types “ALL” o “index” means that is not possible to discard any record directly from the index, and they will be filtered while reading all the records • Watch how may records are going to be read approximately according to the field “rows” • Watch the effective used length of the index according to the field “key_len” 33
  • 34. Just one record by Primary Key mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats          -­‐>  WHERE  s_query  =  'account  executive'          -­‐>  AND  fk_i_id_tbl_type_dates  =  1          -­‐>  AND  d_date  =  '2009-­‐09-­‐28'G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats                    type:  const   possible_keys:  PRIMARY,s_query_i_num_ads_salary_d_date,d_date                      key:  PRIMARY              key_len:  774                      ref:  const,const,const                    rows:  1 34
  • 35. Just one record by UNIQUE KEY mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW          -­‐>  WHERE  s_query  =  'account  executive'          -­‐>  AND  fk_i_id_tbl_type_dates  =  1          -­‐>  AND  d_date  =  '2009-­‐09-­‐28'G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats_NEW                    type:  const   possible_keys:  unique_key,s_query_i_num_ads_salary_d_date,d_date                      key:  unique_key              key_len:  774                      ref:  const,const,const                    rows:  1 35
  • 36. JOIN of just one record mysql>  EXPLAIN  SELECT  *  FROM  tbl_users  LEFT  JOIN  tbl_countries          -­‐>  ON  tbl_users.fk_c_id_tbl_countries  =  tbl_countries.c_idG   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  tbl_users                    type:  ALL   possible_keys:  NULL                      key:  NULL              key_len:  NULL                      ref:  NULL                    rows:  20986861   ***************************  2.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  tbl_countries                    type:  eq_ref   possible_keys:  PRIMARY                      key:  PRIMARY              key_len:  6                      ref:  trovit_global.tbl_users.fk_c_id_tbl_countries                    rows:  1 36
  • 37. Many records using a partial Unique Key mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW          -­‐>  WHERE  s_query  =  'account  executive'G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats_NEW                    type:  ref   possible_keys:  unique_key,s_query_i_num_ads_salary_d_date                      key:  unique_key              key_len:  767                      ref:  const                    rows:  1073 37
  • 38. Many records using index mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats          -­‐>  WHERE  d_date  =  '2012-­‐10-­‐15'G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats                    type:  ref   possible_keys:  d_date                      key:  d_date              key_len:  3                      ref:  const                    rows:  10908 38
  • 39. OR condition and indexes mysql>  EXPLAIN  SELECT  a  FROM  t          -­‐>  WHERE  b  =  1  OR  c  =  1G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  t                    type:  index_merge   possible_keys:  b,c,b_a                      key:  b,c              key_len:  4,4                      ref:  NULL                    rows:  4863128                  Extra:  Using  union(b,c);  Using  where 39
  • 40. Range query using index mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW          -­‐>  WHERE  s_query  =  'account  executive’          -­‐>  AND  fk_i_id_tbl_type_dates  =  1          -­‐>  AND  d_date  <  ‘2011-­‐04-­‐25'G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_bl_trovit_stats                    type:  range   possible_keys:  unique_key,s_query_i_num_ads_salary_d_date,d_date                      key:  unique_key              key_len:  774                      ref:  NULL                    rows:  539 40
  • 41. Condition is not using index, but using covering index mysql>  EXPLAIN  SELECT  s_query  FROM  jobs_tbl_trovit_stats          -­‐>  WHERE  i_num_ads_salary  =  10G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats                    type:  index   possible_keys:  NULL                      key:  s_query_i_num_ads_salary_d_date              key_len:  775                      ref:  NULL                    rows:  5543853                  Extra:  Using  where;  Using  index 41
  • 42. Full scan query, filter condition doesn’t use index mysql>  EXPLAIN  SELECT  i_num_ads  FROM  jobs_tbl_trovit_stats          -­‐>  WHERE  i_num_ads_salary  =  10G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats                    type:  ALL   possible_keys:  NULL                      key:  NULL              key_len:  NULL                      ref:  NULL                    rows:  5543853                  Extra:  Using  where 42
  • 43. Full scan query, retrieve all records mysql>  EXPLAIN  SELECT  i_num_ads  FROM  jobs_tbl_trovit_statsG   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats                    type:  ALL   possible_keys:  NULL                      key:  NULL              key_len:  NULL                      ref:  NULL                    rows:  5543853                  Extra:  NULL 43
  • 44. EXPLAIN and the MySQL query optimizer 44
  • 45. 45 KEY  `a_b_c_d`  (`a`,`b`,`c`,`d`)  +  id   KEY  `a_b_c_d`  (`a`,`b`,`c`,`d`)  +  id   mysql>  EXPLAIN  SELECT  a,  b,  d  FROM  t          -­‐>  WHERE  a  =  1  ORDER  BY  idG   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  t                    type:  ref   possible_keys:  a,a_b,a_b_c_d                      key:  a_b_c_d              key_len:  4                      ref:  const                    rows:  199704                  Extra:  Using  where;  Using  index;  Using  filesort   200000  rows  in  set  (0.28  sec) The importance of “using index” query optimizer choses wisely and favours the performance boost of “using index” even if it’s forced to “using filesort”
  • 46. 46 KEY  `a`  (`a`)  +  id   mysql>  EXPLAIN  SELECT  a,  b,  d  FROM  t          -­‐>  FORCE  KEY  (a)          -­‐>  WHERE  a  =  1  ORDER  BY  idG   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  t                    type:  ref   possible_keys:  a                      key:  a              key_len:  4                      ref:  const                    rows:  199704                  Extra:  Using  where   200000  rows  in  set  (0.53  sec) The importance of “using index” if we choose to force an index we think it would be better for performance (in this case the bad extra “using filesort” is not used), we might be mistaken and the query could be slower
  • 47. 47 mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats          -­‐>  ORDER  BY  d_dateG   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  jobs_tbl_trovit_stats                    type:  ALL   possible_keys:  NULL                      key:  NULL              key_len:  NULL                      ref:  NULL                    rows:  5543853                  Extra:  Using  filesort   5066959  rows  in  set  (3  min  27.96  sec) But sometimes query optimizer is wrong full scan query, it prefers to read only the data file and do a filesort, instead of using an index to first: read keys directly in order and second: access the data file to retrieve fields
  • 48. 48 mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats          -­‐>  FORCE  KEY  (d_date)          -­‐>  ORDER  BY  d_dateG   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  homes_tbl_my_searches                    type:  index   possible_keys:  NULL                      key:  d_date              key_len:  3                      ref:  NULL                    rows:  5543853   5066959  rows  in  set  (18.11  sec) But sometimes query optimizer is wrong same query, but this time is an index scan, with millions of records, is much better to force the use of an index that satisfies the sorting of records, even if we must do a second read from the data file to retrieve the fields for every record
  • 50. Denormalize dates don’t use DATE_FORMAT in WHERE, GROUP BY or ORDER BY mysql>  EXPLAIN  SELECT  SUM(f_revenue_in_euros*f_revenue_share/100)  AS  revenue,          -­‐>  DATE_FORMAT(d_date,  “%Y-­‐%m”)  FROM  tbl_publishers_stats          -­‐>  WHERE  fk_i_id_tbl_publishers  =  ‘1658'          -­‐>  GROUP  BY  YEAR(d_date),  MONTH(d_date)G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  tbl_publishers_stats                    type:  ALL   possible_keys:  NULL                      key:  NULL              key_len:  NULL                      ref:  NULL                    rows:  323826                  Extra:  Using  where;  Using  temporary;  Using  filesort 50
  • 51. Denormalize dates use specific indexed fields for year, month, day, etc mysql>  EXPLAIN  SELECT  SUM(f_revenue_in_euros*f_revenue_share/100)  AS  revenue,          -­‐>  i_year,  i_month  FROM  tbl_publishers_stats          -­‐>  WHERE  fk_i_id_tbl_publishers  =  ‘1658'          -­‐>  GROUP  BY  i_year,  i_monthG   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  tbl_publishers_stats_NEW                    type:  ref   possible_keys:  publisher_year_month                      key:  publisher_year_month              key_len:  5                      ref:  const                    rows:  236                  Extra:  Using  where 51
  • 52. Avoid using “*” • Avoid using “*” in the field list of a SELECT • Put only the fields you need in order to save unneeded disk reads • Big extra optimisation if all the fields belong to the index used in the query (covering index) • Exception for using “*”: count(*) must always be used instead of count(field_name): • To avoid causing confusion to the query optimiser • Avoids future errors if the query is modified and the field inside the count() function is no longer indexed 52
  • 53. Page queries • Avoid running any long executing query • Long queries block the execution of MySQL internal maintenance processes, like the purge history growing several gigabytes that would never shrink again • Long INSERT, UPDATE and DELETE, in addition to the above, will delay replication to slaves • Use LIMIT in queries of any type to page in smaller and faster blocks
  • 54. Using LIMIT the right way • LIMIT filters final results of the query, only after WHERE, GROUP BY and ORDER BY are processed • Beware of GROUP BY and ORDER BY “using filesort”, all records are going to be file sorted before LIMIT could take effect • Even when using an index, the LIMIT <offset>,<row_count> syntax will run slower and slower as the offset increments 54
  • 55. Using LIMIT the right way always avoid “using filesort” 55 mysql>  EXPLAIN  SELECT  *  FROM  homes_tbl_my_searches          -­‐>  ORDER  BY  s_where          -­‐>  LIMIT  10G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  homes_tbl_my_searches                    type:  ALL   possible_keys:  NULL                      key:  NULL              key_len:  NULL                      ref:  NULL                    rows:  746475                  Extra:  Using  filesort   10  rows  in  set  (4.00  sec)
  • 56. Using LIMIT the right way always avoid “using filesort” 56 mysql>  EXPLAIN  SELECT  *  FROM  homes_tbl_my_searches          -­‐>  ORDER  BY  s_what          -­‐>  LIMIT  10G   ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  homes_tbl_my_searches                    type:  index   possible_keys:  NULL                      key:  what_where              key_len:  NULL                      ref:  NULL                    rows:  10   10  rows  in  set  (0.00  sec)
  • 57. #  Slower  as  offset  increments   mysql>  SELECT  i_id  FROM  jobs_tbl_trovit_stats_NEW          -­‐>  LIMIT  5000000,100;   100  rows  in  set  (0.92  sec)        #  Always  fast   mysql>  SELECT  i_id  FROM  jobs_tbl_trovit_stats_NEW          -­‐>  WHERE  i_id  >  $last_id          -­‐>  LIMIT  100;   100  rows  in  set  (0.00  sec) Using LIMIT the right way • When paging, avoid LIMIT <offset>,<row_count> it’s better to filter using a primary or unique key 57
  • 58. LIMIT for long UPDATE and DELETE • Divide long UPDATE and DELETE queries in several shorter executions using LIMIT • If possible, use indexed fields to find records 58 #  Crontab  to  purge  old  records   Do   mysql>  DELETE  FROM  homes_tbl_my_searches          -­‐>  WHERE  dt_date  <  ‘2013-­‐12-­‐31’          -­‐>  LIMIT  1000;   While(rows  affected  >  0)   #  One-­‐time  UPDATE,  not  worth  to  create  an  index  just  for  this  time   Do   mysql>  UPDATE  homes_tbl_my_searches  SET  i_active  =  0          -­‐>  WHERE  i_active  !=  0  AND  s_what  =  ‘offensive  stopword’          -­‐>  LIMIT  1000;   While(rows  changed  >  0)