16. MySQLD
Many think they run yum install / apt-get install
and they’re done.
• No, we’re just beginning!
• Logging
• Memory Usage:
• Global
• Session
• etc
21. MySQLD
( memory usage )
• Memory is allocated...
• Globally - server wide, usually allocated
only at server startup and exists only “once”
• Session - per connection or event
40. InnoDB
Many options; only a few are covered
• innodb_buffer_pool_size
- caches both indexes and data
- upwards of 85% available RAM
• innodb_flush_log_at_trx_commit
- 0 - written and flushed to disk once per second
- 1 - written and flushed to disk at each
transaction commit
- 2 - written to disk at each commit, flushed once a
second
41. InnoDB
• innodb_log_buffer_size
- Size of the buffer used to write to the ib_logfile files
- The larger, the less disk I/O
• innodb_log_file_size
- The larger the file, the more InnoDB can do
- < 5.5 the larger, the longer crash recovery time
• innodb_doublewrite (the doublewrite buffer)
- data integrity checking
• innodb_file_per_table
- data and indexes written to their own .ibd file
46. MySQLD
( in depth 2/11 )
Values without a timeframe are
meaningless.
10:21:08 rderoo@mysql09:mysql [1169]> SHOW GLOBAL STATUS LIKE 'Uptime';
+---------------+----------+
| Variable_name | Value |
+---------------+----------+
| Uptime | 12973903 |
+---------------+----------+
1 row in set (0.00 sec)
That’s about 150 days. :)
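The conversion from the Uptime counter to days can be checked with a few lines of Python. This is only a sketch of the arithmetic; `uptime_days` is a hypothetical helper name, and the input is the Uptime value shown above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def uptime_days(uptime_seconds):
    """Convert the Uptime status counter (seconds) into days."""
    return uptime_seconds / SECONDS_PER_DAY


# Uptime value taken from the SHOW GLOBAL STATUS output above.
print(round(uptime_days(12973903), 1))  # 150.2
```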
47. MySQLD
( in depth 3/11 )
By using mysqladmin we can observe
changes over short periods of time:
$ mysqladmin -u rderoo -p -c 2 -i 10 -r extended-status > review.txt
The output has the same format as SHOW GLOBAL STATUS,
with two copies appearing in review.txt:
the first has the values since the last server
restart / FLUSH STATUS; the second contains
the deltas between the first and a sample taken
10 seconds later.
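If you collect two absolute snapshots yourself instead of using the -r flag, the same deltas can be computed by hand. The sketch below assumes each snapshot has already been loaded into a dict of counter name to integer value; the function names and the counter values are made up for illustration.

```python
def status_delta(first, second):
    """Per-counter change between two SHOW GLOBAL STATUS snapshots."""
    return {name: second[name] - first[name] for name in first if name in second}


def per_second(delta, interval_seconds):
    """Turn raw deltas into per-second rates for the sampling interval."""
    return {name: change / interval_seconds for name, change in delta.items()}


# Hypothetical counter values sampled 10 seconds apart (not from the slides).
first = {"Questions": 1000, "Created_tmp_tables": 40}
second = {"Questions": 1600, "Created_tmp_tables": 45}

delta = status_delta(first, second)
print(delta)
print(per_second(delta, 10))
```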
48. MySQLD
( in depth 4/11 )
Temporary Tables:
How big?
17:54:33 rderoo@mysql09:event [1185]> SHOW GLOBAL VARIABLES LIKE '%_table_size';
+---------------------+----------+
| Variable_name | Value |
+---------------------+----------+
| max_heap_table_size | 67108864 |
| tmp_table_size | 67108864 |
+---------------------+----------+
2 rows in set (0.00 sec)
How many?
17:55:09 rderoo@mysql09:event [1186]> SHOW GLOBAL STATUS LIKE '%_tmp%tables';
+-------------------------+-----------+
| Variable_name | Value |
+-------------------------+-----------+
| Created_tmp_disk_tables | 156 |
| Created_tmp_tables | 278190736 |
+-------------------------+-----------+
2 rows in set (0.00 sec)
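To put the two counters in proportion, the share of temporary tables that went to disk can be worked out as below. This is a sketch of the arithmetic only; the helper name is hypothetical and the inputs are the values shown above.

```python
def disk_tmp_table_pct(created_tmp_disk_tables, created_tmp_tables):
    """Percentage of temporary tables that ended up on disk."""
    return 100.0 * created_tmp_disk_tables / created_tmp_tables


# Values from the SHOW GLOBAL VARIABLES / SHOW GLOBAL STATUS output above.
tmp_table_size_mb = 67108864 / (1024 * 1024)
pct = disk_tmp_table_pct(156, 278190736)

print(f"tmp_table_size = {tmp_table_size_mb:.0f} MB")
print(f"on-disk temporary tables: {pct:.6f}%")
```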
49. MySQLD
( in depth 6/11 )
table_cache (renamed table_open_cache in MySQL 5.1.3)
| Open_tables | 4094 |
| Opened_tables | 12639 |
| Uptime | 12980297 |
• This yields a rate of ~84 table opens per day, a bit high...
20:52:22 rderoo@mysql09:mysql [1190]> SHOW GLOBAL VARIABLES LIKE 'table_cache';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| table_cache | 4096 |
+---------------+-------+
• Increasing the table_cache would be advisable.
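The table-open rate quoted on this slide comes from dividing Opened_tables by the uptime in days. A minimal sketch of that calculation, with a hypothetical helper name and the counter values shown above:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def table_opens_per_day(opened_tables, uptime_seconds):
    """Average number of table opens per day since the server started."""
    return opened_tables / (uptime_seconds / SECONDS_PER_DAY)


# Opened_tables and Uptime as shown above.
rate = table_opens_per_day(12639, 12980297)
print(f"{rate:.1f} table opens per day")
```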
50. MySQLD
( in depth 7/11 )
thread_cache_size
• To determine the effectiveness of the thread_cache
as a hit ratio use the following formula:
100-((Threads_created/Connections)*100)
| Connections | 131877233 |
| Threads_created | 20014624 |
100 - ( ( 20014624 / 131877233 ) * 100 ) = 84.8%
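The same formula written as a small Python function, in case you want to recompute it from fresh counters. The function name is hypothetical; the inputs are the Threads_created and Connections values shown above.

```python
def thread_cache_hit_ratio(threads_created, connections):
    """Thread cache hit ratio in percent: 100 - ((Threads_created / Connections) * 100)."""
    return 100 - (threads_created / connections) * 100


ratio = thread_cache_hit_ratio(20014624, 131877233)
print(f"{ratio:.1f}%")  # 84.8%
```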
51. MySQLD
( in depth 8/11 )
read_buffer_size
• The effectiveness can be found by examining the
Select_scan :
| Select_scan | 304864263 |
• This should be set to a small value globally and
increased on a per session basis
• Increase when performing full table scans
52. MySQLD
( in depth 9/11)
read_rnd_buffer_size
• Used when reading sorted results after a key sort has
taken place
• This should be set to a small value globally and
increased on a per session basis
• Usually should be set ( per session ) no higher than 8M
53. MySQLD
( in depth 10/11)
sort_buffer_size
• The effectiveness can be found by examining the
Sort_merge_passes :
| Sort_merge_passes | 7 |
• This should be set to a small value globally and
increased on a per session basis
• Increase to improve GROUP BY, ORDER BY, SELECT
DISTINCT and UNION DISTINCT performance
54. MySQLD
( in depth 11/11 )
join_buffer_size
• The effectiveness can be found by examining the
Select_full_join :
| Select_full_join | 5369 |
• A crutch used when proper indexing is not in place
• Will help in cases of full table scans
• This should be set to a small value globally and
increased on a per session basis
• Increase GROUP BY and ORDER BY performance
56. Tuning your queries is the single biggest
performance improvement you can make to
your server
• efficient resource usage
• results returned quicker
Per QUERY!
61. Slow query log
• A log of all the “slow” queries
• Define “slow”
• took longer than long_query_time
to run AND at least
min_examined_row_limit rows
are looked at
• log_queries_not_using_indexes
• log_slow_admin_statements
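The basic definition of “slow” above can be written as a simple predicate. This is a simplified sketch, not the server's actual logic: it ignores log_queries_not_using_indexes and log_slow_admin_statements, the function name is hypothetical, and the default arguments mirror the MySQL defaults for the two variables.

```python
def is_logged_as_slow(query_time, rows_examined,
                      long_query_time=10.0, min_examined_row_limit=0):
    """True if a statement meets both conditions from the slide."""
    return (query_time > long_query_time
            and rows_examined >= min_examined_row_limit)


# Hypothetical thresholds: 1 second and 100 rows.
print(is_logged_as_slow(3, 4452238, long_query_time=1, min_examined_row_limit=100))
print(is_logged_as_slow(3, 50, long_query_time=1, min_examined_row_limit=100))
```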
62. Turning it on
• Option is slow_query_log
• no argument or value of 1 - enabled
• argument of 0 - disabled (default)
63. Output Types
• Use log_output
• write to a file
• write to a table in the mysql database
64. slow log at startup
• Place the slow_query_log and
optional log_output options in
my.cnf
• log_output acceptable values
• TABLE, FILE, NONE
• Can also use slow_query_log_file
to specify a log file name
65. slow log at runtime
• All the below global variables can be
changed at run time
• log_output
• slow_query_log
• slow_query_log_file
SET GLOBAL slow_query_log = 1;
66. /usr/sbin/mysqld, Version: 5.0.67-0ubuntu6.1-log ((Ubuntu)). started with:
Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sock
Time Id Command Argument
# Time: 110328 13:15:05
# User@Host: homer[homer] @ localhost []
# Query_time: 3 Lock_time: 0 Rows_sent: 0 Rows_examined: 4452238
use confs;
select
distinct u.ID as user_id,
t.event_id,
u.username,
u.full_name
from
user u,
talks t,
talk_speaker ts
where
u.ID <> 6198 and
u.ID = ts.speaker_id and
t.ID = ts.talk_id and
t.event_id in (
select
distinct t.event_id
from
talk_speaker ts,
talks t
where
ts.speaker_id = 6198 and
t.ID = ts.talk_id
)
order by rand()
limit 15;
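If you want to pull the numbers out of the raw log yourself, the "# Query_time" header line is easy to parse. The sketch below handles only the header format shown in this excerpt; the regular expression and function name are assumptions made for illustration and other server versions may add fields.

```python
import re

# Matches the header line format shown in the excerpt above.
HEADER = re.compile(
    r"# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_header(line):
    """Return the four header values as a dict, or None if the line does not match."""
    m = HEADER.match(line)
    if m is None:
        return None
    return {
        "query_time": float(m.group("query_time")),
        "lock_time": float(m.group("lock_time")),
        "rows_sent": int(m.group("rows_sent")),
        "rows_examined": int(m.group("rows_examined")),
    }


line = "# Query_time: 3 Lock_time: 0 Rows_sent: 0 Rows_examined: 4452238"
print(parse_header(line))
```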
68. There are multiple ways of getting
information out of the slow query log.
• just open it up to see the raw data
• usually more helpful to summarize/
aggregate the raw data
69. Tools for aggregating
the slow log data
• mk-query-digest
• third party
• part of the maatkit utilities
• mysqldumpslow
• comes with MySQL in the bin dir
70. Macintosh-7:joindin-for-lig ligaya$ perl -laF ~/mysql_installs/maatkit/mk-query-digest mysql-slow.log
# 23s user time, 240ms system time, 17.53M rss, 86.51M vsz
# Current date: Mon May 16 11:08:24 2011
# Hostname: Macintosh-7.local
# Files: mysql-slow.log
# Overall: 73.72k total, 14 unique, 0.02 QPS, 0.04x concurrency __________
# Time range: 2011-03-28 13:15:05 to 2011-05-13 02:33:52
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 160130s 0 180s 2s 3s 819ms 2s
# Lock time 11s 0 1s 149us 0 12ms 0
# Rows sent 387.64k 0 255 5.38 14.52 6.86 0
# Rows examine 338.66G 0 5.35M 4.70M 4.93M 387.82k 4.70M
# Query size 30.19M 17 8.35k 429.47 420.77 29.33 420.77
# Profile
# Rank Query ID Response time Calls R/Call Apdx V/M Item
# ==== ================== ================= ===== ======= ==== ===== =====
# 1 0x2AAA5EC420D7E2A2 159878.0000 99.8% 73682 2.1698 0.50 0.12 SELECT user talks talk_speaker talks
# 3 0x0A2584C03C8614A9 50.0000 0.0% 16 3.1250 0.44 0.87 SELECT wp_comments
# MISC 0xMISC 202.0000 0.1% 20 10.1000 NS 0.0 <12 ITEMS>
71. Count: 1 Time=180.00s (180s) Lock=0.00s (0s) Rows=1.0 (1), XXXXX[xxxxx]@localhost
SELECT SLEEP(N)
Count: 16 Time=3.12s (50s) Lock=0.00s (0s) Rows=0.9 (14), XXXXX[xxxxx]@localhost
SELECT comment_date_gmt FROM wp_comments WHERE comment_author_IP = 'S' OR
comment_author_email = 'S' ORDER BY comment_date DESC LIMIT N
Count: 5 Time=2.80s (14s) Lock=0.00s (0s) Rows=0.0 (0), XXXXX[xxxxx]@localhost
SELECT comment_ID FROM wp_comments WHERE comment_post_ID = 'S' AND ( comment_author
= 'S' OR comment_author_email = 'S' ) AND comment_content = 'S' LIMIT N
Count: 73682 Time=2.17s (159867s) Lock=0.00s (11s) Rows=5.4 (396223), XXXXX[xxxxx]@localhost
select
distinct u.ID as user_id,
t.event_id,
u.username,
u.full_name
from
user u,
talks t,
talk_speaker ts
where
u.ID <> N and
u.ID = ts.speaker_id and
t.ID = ts.talk_id and
t.event_id in (
select
distinct t.event_id
from
talk_speaker ts,
talks t
where
ts.speaker_id = N and
t.ID = ts.talk_id
)
order by rand()
limit N
72. Tuning a query
• Tables
• SHOW CREATE TABLE `talks`;
• SHOW TABLE STATUS LIKE
'talks';
• Indexes
• SHOW INDEXES FROM `talks`;
• EXPLAIN
74. EXPLAIN Basics
• Syntax: EXPLAIN [EXTENDED]
SELECT select_options
• Displays information from the
optimizer about the query execution
plan
• Works only with SELECT statements
75. EXPLAIN Output
• Each row provides information on one
table
• Output columns:
id, select_type, table, type, possible_keys, key,
key_len, ref, rows, filtered (new to 5.1), Extra
77. id
• Select identifier
• Only if there are subqueries, derived
tables or unions is this incremented
• Number reflects the order that the
SELECT/FROM was done in
82. key
• Index used for the query or NULL
• Look at key and possible_keys to
consider your index strategy
83. key_len
• shows number of bytes MySQL will use
from the index
• Possible that only part of the index will
be used
NOTE: a character set may use more than
one byte per character
84. ref
• Very different than the access type ‘ref’
• Shows which columns or constants
are compared to the index for the
access
85. rows
• Number of rows MySQL expects to find
based on statistics
• statistics can be updated with
ANALYZE TABLE
86. filtered
• New in 5.1.12
• Estimated percentage of rows filtered by
the table condition
• Shows up if you use EXPLAIN
EXTENDED
87. Extra
• All the other stuff
• Using index - using a covering index
• Using filesort - an extra sort pass is
needed (rows are not read in index order)
• Using temporary - a temporary table
was made.
• Using where - filtering outside the
storage engine.
89. Query we are working
with:
EXPLAIN select distinct u.ID as user_id,
t.event_id,
u.username,
u.full_name
from user u,
talks t,
talk_speaker ts
where u.ID <> 11945 and
u.ID = ts.speaker_id and
t.ID = ts.talk_id and
t.event_id in (
select distinct t.event_id
from talk_speaker ts,
talks t
where ts.speaker_id = 11945 and
t.ID = ts.talk_id
)
order by rand()
limit 15\G
90. Full EXPLAIN plan
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: t
type: index
possible_keys: PRIMARY
key: idx_event
key_len: 9
ref: NULL
rows: 3119
Extra: Using where; Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: ts
type: ref
possible_keys: talk_id
key: talk_id
key_len: 5
ref: joindin.t.ID
rows: 1
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: PRIMARY
table: u
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: joindin.ts.speaker_id
rows: 1
Extra:
91. Full EXPLAIN plan
(cont’d)
*************************** 4. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: t
type: ref
possible_keys: PRIMARY,idx_event
key: idx_event
key_len: 5
ref: func
rows: 24
Extra: Using where; Using index; Using temporary
*************************** 5. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: ts
type: ref
possible_keys: talk_id
key: talk_id
key_len: 5
ref: joindin.t.ID
rows: 1
Extra: Using where
5 rows in set (0.00 sec)
93. Hint
EXPLAIN select distinct u.ID as user_id,
t.event_id,
u.username,
u.full_name
from user u,
talks t,
talk_speaker ts
where u.ID <> 11945 and
u.ID = ts.speaker_id and
t.ID = ts.talk_id and
t.event_id in (
select distinct t.event_id
from talk_speaker ts,
talks t
where ts.speaker_id = 11945 and
t.ID = ts.talk_id
)
order by rand()
limit 15\G
*************************** 4. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: t
type: ref
possible_keys: PRIMARY,idx_event
key: idx_event
key_len: 5
ref: func
rows: 24
Extra: Using where; Using index; Using temporary
*************************** 5. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: ts
type: ref
possible_keys: talk_id
key: talk_id
key_len: 5
ref: joindin.t.ID
rows: 1
Extra: Using where
5 rows in set (0.00 sec)
94. Solution
1:
SELECT DISTINCT u.ID as user_id,
t.event_id,
u.username,
u.full_name
FROM user u
JOIN talk_speaker ts
ON u.ID = ts.speaker_id
JOIN talks t
ON t.ID = ts.talk_id
JOIN
-- find the events the speaker has talks in
(
SELECT t1.event_id
FROM talk_speaker ts1
JOIN talks t1
ON t1.ID = ts1.talk_id
WHERE ts1.speaker_id = 11945
) as e
ON t.event_id = e.event_id
WHERE u.ID <> 11945
LIMIT 15\G
98. Solution
1:
SELECT DISTINCT u.ID as user_id,
t.event_id,
u.username,
u.full_name
FROM user u
JOIN talk_speaker ts
ON u.ID = ts.speaker_id
JOIN talks t
ON t.ID = ts.talk_id
JOIN
-- find the events the speaker has talks in
(
SELECT t1.event_id
FROM talk_speaker ts1
JOIN talks t1
ON t1.ID = ts1.talk_id
WHERE ts1.speaker_id = 11945
) as e
ON t.event_id = e.event_id
WHERE u.ID <> 11945
LIMIT 15\G
reminder:
CREATE TABLE `talk_speaker` (
`talk_id` int(11) DEFAULT NULL,
`speaker_name` varchar(200) DEFAULT NULL,
`ID` int(11) NOT NULL AUTO_INCREMENT,
`speaker_id` int(11) DEFAULT NULL,
`status` varchar(10) DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `talk_id` (`talk_id`) USING BTREE
) ENGINE=MyISAM AUTO_INCREMENT=3763
DEFAULT CHARSET=utf8;
2:
ALTER TABLE `talk_speaker`
ADD INDEX (speaker_id);
OR
ALTER TABLE `talk_speaker`
ADD INDEX (speaker_id, talk_id);
Depending on the context, good computer performance may involve one or more of the following:
- Short response time
- High throughput (rate of processing work)
- Low utilization of computing resource(s)
- High availability
- High bandwidth / short data transmission time

http://en.wikipedia.org/wiki/Performance_%28Computer%29

So the first thing you need to do is define exactly what you want to improve.

We will obviously be working on improving the performance of the database. But because things cross over a lot, we will be talking about more than just the MySQL server along the way.

Some are easier to use than others.
A common tool used in the MySQL world is sysbench. I have also used mysqlslap for fast testing of load.

- Generally speaking, MySQL plays better on Linux. However, there has been a big push on improving how we work on Windows, so if that is your preferred system, there is no reason not to use it.
- There is no real difference between distros. But you want to use the latest version from mysql.com, not what is in the distro repositories. There have been problems with distros placing files in odd places, adding my.cnf files or altering scripts, which can be problematic (e.g. Debian altering the start script to check all tables, which can take a lot of time if you have a large number of tables: http://www.mysqlperformanceblog.com/2009/01/28/the-perils-of-innodb-with-debian-and-startup-scripts/)

CAVEAT: These settings are for a dedicated DB server.
- cfq (completely fair queuing): should not be used for a database, but would be appropriate for a machine that has the full LAMP stack on it.
- Noop scheduler (noop): the simplest I/O scheduler for the Linux kernel, based on the FIFO queue concept (may reorder but does not wait). Only merges requests.
- Deadline scheduler (deadline): attempts to guarantee a start service time for a request. Prefers readers, as long as the deadline for a write request hasn't passed.

IO scheduler: controls the way the kernel commits reads and writes to the disks. The main goal is to optimise disk access times.

http://www.cyberciti.biz/faq/linux-change-io-scheduler-for-harddisk/
http://dom.as/2008/02/05/linux-io-schedulers/
http://www.wlug.org.nz/LinuxIoScheduler

- Generally speaking, with database servers you want as much RAM as you can get. Optimally you want to be able to fit the data into RAM. Lots of RAM is good. If you can’t fit all the data into RAM, then you want fast disks, since you will be reading data in and out of them as you need it.
- RAID 10, battery-backed write cache
- Since MySQL is single-process and multi-threaded, the general advice is that you want faster CPUs rather than more of them. More is not necessarily better than faster. This, however, is changing from 5.1 + the plugin and 5.5.

There are a couple of basic Linux commands that you want to become familiar with (in terms of performance).

- top helps you watch how things are going at a pretty general level: CPU, load, memory
- iostat is good for watching your IO: r/s, w/s, await (how long it took to respond to requests), svctm (how long the request actually took), %util (utilization)
- vmstat is good for seeing how memory, swap, processes and CPU are being used (amongst other things).

If you are not already familiar with these commands, you need to learn about them.

dstat is a python script that combines top, iostat and vmstat (apt-get/yum install dstat)

- The binary log should always be turned on. Yes, it is most closely associated with replication, but it should also be used for backups and for point-in-time recovery.
- Setting the path so the general log *can* be turned on, but having it disabled at start, can be a huge win in development/troubleshooting time. The general query log logs queries as they come into the server.
- Slow logging should be on, always, with long_query_time (which now supports microseconds) at a high value in the beginning.
- “not_using” should only be used with “examined_limit”. This helps when you have small tables with only a small number of rows.

- “once” in that there is only one key_buffer_pool, but there can be many named buffer pools.
- Be careful with per-session sizes: a 2M join_buffer with 1k connections is 2GB.
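The per-session warning above is simple multiplication, sketched here in Python. The helper name is hypothetical, and the calculation is a simplification that assumes every connection allocates the buffer exactly once at the same moment.

```python
MIB = 1024 * 1024


def worst_case_session_bytes(per_session_buffer_bytes, connections):
    """Memory needed if every connection allocates the buffer at the same time."""
    return per_session_buffer_bytes * connections


# 2M join_buffer and 1k connections, as in the note above.
total = worst_case_session_bytes(2 * MIB, 1000)
print(f"{total / 1024**3:.2f} GiB")  # about 1.95 GiB, i.e. roughly the 2GB quoted
```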

- connect_timeout: number of seconds that the mysqld server waits for a connect packet before responding with “Bad handshake”.
- wait_timeout: number of seconds the server waits for activity on a noninteractive connection before closing it. As non-intuitive as it seems, most PHP connections will not be considered interactive connections. So we do not want connections hanging around taking up resources longer than they have to.
- expire_logs_days: make sure you have a backup before you expire the logs

- query_cache_type: Yes, we want you to completely turn it off. Historically the query cache has been used for small sites or to help out until you can implement higher-level caching (memcached for example). Talk about how the query cache works.

Explain that if they are going to use the query cache, it is best to use ON DEMAND (2) to dictate which queries to cache. End by explaining that ultimately caching is best done outside the database, to allow the database to do what it does best: store and retrieve the data.
- query_cache_size: will be allocated even if query_cache_type = 0

- SHOW GLOBAL STATUS LIKE 'Opened_tables'
- thread_cache_size can be very helpful for PHP apps, nearly useless for pool managers
- skip_name_resolve: Do not resolve host names when checking client connections. Use only IP addresses. This removes DNS from the connection process, where it tends to be the slowest thing.
- performance_schema: This is either on or off. Most developers will have no need of the information provided in the performance schema. The information available within the performance schema is at a low level (mutexes, spins, system calls, function calls).

- *_buffer_size: noted here in the global scope, but should be controlled on a per-session basis. Global values should be small.
- I will get into more detail on these later.

This is no longer the default engine. As of 5.5, InnoDB is the default. There is a reason for that. If you can, you want to use InnoDB tables rather than MyISAM. Admittedly there are situations where you may be stuck using MyISAM (spatial data for example), but that is not the standard user. Oh, and for those that argue that MyISAM has full text indexing... you should really be looking at alternate solutions (like Sphinx). While ours does work, it is not nearly as fast or robust. Just sayin’

This information is provided in case you cannot, for whatever reason, move to InnoDB. This is not to encourage you to use the MyISAM engine.

- Used to hold the MyISAM index blocks. The data is held in the OS file system cache. 25% of the RAM would be a lot.
- Max size is 4G. But it does allow for multiple key caches if you need to go higher than that, or if you have too much contention for the control structures that manage access to the key cache buffers.
- The formulas show how efficient the reads and writes are.
Key_*_requests are the total number of requests to do X. Key_* are the actual number of times it physically went to disk to do X.

MyISAM uses table-level locking.
- concurrent_insert: to reduce contention between readers and writers for a given table. If a MyISAM table has no holes in the data file (deleted rows in the middle), an INSERT statement can be executed to add rows to the end of the table at the same time that SELECT statements are reading rows from the table. If there are multiple INSERT statements, they are queued and performed in sequence, concurrently with the SELECT statements.
0 - never allow concurrent inserts
1 - allow concurrent inserts only if there are no holes in the data
2 - for a table with a hole, new rows are inserted at the end of the table if the table is in use by another thread. Otherwise, MySQL acquires a normal write lock and inserts the row into the hole.
If 2 is used, then OPTIMIZE TABLE should be run more frequently, which means the table will be locked while it runs.

With larger buffer pool sizes, the reallocation of the space can take a while. Also, you want it large enough to handle your data, but not waste space (Innodb_buffer_pool_pages_free consistently available).

flush log at trx: only set it to 0 or 2 if you have a battery-backed write cache (this removes the ACID compliance, since it moves the durability from the DB to the hardware)

doublewrite buffer: leave on unless you can control that with your battery-backed write unit
file per table: allows you to more easily reclaim unused space back to the OS. Ask if people understand how the shared tablespace works in terms of free space. Then explain how you reclaim any unused space (dump, stop server, remove ibdata & ib_log & frm files, restart, reload). With file per table, the lost space can be regained with an OPTIMIZE.

If you have very large transactions, you need to pay attention to your log buffer size and your log file size. You want to be able to hold at least a single transaction in the log buffer, and hopefully a few transactions within the log file size.

InnoDB buffer pool hit ratio = 1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)
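The hit ratio formula above, written as a small Python function. The function name is hypothetical and the counter values in the example are made up, not taken from the server discussed in the slides.

```python
def buffer_pool_hit_ratio(reads, read_requests):
    """1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)."""
    if read_requests == 0:
        return None  # no logical reads yet, so the ratio is undefined
    return 1 - reads / read_requests


# Hypothetical counter values for illustration only.
ratio = buffer_pool_hit_ratio(reads=12_500, read_requests=5_000_000)
print(f"{ratio:.2%}")
```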

OK, let's now look at a couple of examples.

These are only to get you into the right ballpark.

Also:
1. Maatkit's mk-variable-advisor - http://www.maatkit.org/doc/mk-variable-advisor.html
2. Sheeri Cabral's fork of mysqltuner - https://launchpad.net/mysqltuner/
3. The original mysqltuner - http://blog.mysqltuner.com/

- GROUP BY / ORDER BY will create tmp_tables
- Remind: if max_heap < tmp_table, then tmp_table is still limited to max_heap

The max temporary table size is 64MB.
In 150 days, 278 million temp tables were created,
with only 156 going to disk.
This is well tuned!

Could we have actually reduced the value? Yes, but after some testing on the server in question, we found a significant increase in the number of temporary tables going to disk at about 45MB. We could have worked to further refine things to find the “sweet spot”, but in this case we chose not to. The time that would have been used to find the perfect point was not worth it. 64MB is good enough for now.

We are basically at max table_cache, so bump it up some.

- Knowing about your environment/workload cycle is key. Generally speaking, you have lulls in activity and peaks in activity. I personally prefer my thread cache to hit around the 80% mark.

Each request that performs a sequential scan of a table allocates a read buffer (variable read_buffer_size).

- Increase when using ORDER BY and GROUP BY / ORDER BY

Efficient resource usage:
- less IO, since we are working with indexes rather than all the data. Potentially fitting all the relevant data in RAM.
- less CPU, again because we are using indexes
- since things are supposedly happening faster, we are holding any locks for less time
All this allows us to potentially have more concurrency with the same response times.

- entries are not added until after they are executed and all locks released
- time to acquire locks is not counted in execution time
- long_query_time can be in microseconds as of 5.1.21
log_slow_query is deprecated and should not be used.\n
- default is file\n- microsecond time for long_query_time is only supported when we are going to a file. It is ignored when logging to tables.\n- logging to a table uses significantly more server overhead. However the table option does provide you with the power of the SQL query language - be careful though. There are limitations on what you can and can&#x2019;t do with the table. If you are thinking of using the table format, be sure to look over what can and can not be done in the manual[1]. \n- table uses the CVS engine so you can open the data in anything that can read that.\n\n[1] http://dev.mysql.com/doc/refman/5.5/en/log-destinations.html\n
- NONE if present overrides everything else\n- default is file if no value given\n- multiple values can be given, separated by a comma\n- if no log file name and path is given it will default to host_name-slow.log in the data directory\n
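A sketch of the destination options described above, assuming a server where log_output can be changed at runtime (MySQL 5.1 and later). The column names are those of the standard mysql.slow_log table; the LIMIT of 10 is arbitrary.

```sql
-- Multiple destinations are given as a comma-separated list.
SET GLOBAL log_output = 'FILE,TABLE';

-- With TABLE enabled, slow statements can be queried with plain SQL.
SELECT start_time, query_time, lock_time, rows_examined, sql_text
FROM mysql.slow_log
ORDER BY query_time DESC
LIMIT 10;

-- Go back to file-only logging (the default) to avoid the table overhead.
SET GLOBAL log_output = 'FILE';
```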
The slow query log can be turned on dynamically, with no restart needed. It is possible, for example, to turn it on only during a period when you have repeated problems.\n
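A minimal sketch of switching the log on for a problem window and off again afterwards. The half-second threshold is a hypothetical value, not one taken from these notes.

```sql
-- Log statements that run longer than 0.5 seconds (hypothetical threshold).
-- A new global long_query_time only applies to connections opened afterwards.
SET GLOBAL long_query_time = 0.5;

-- Turn the slow query log on without restarting the server.
SET GLOBAL slow_query_log = 'ON';

-- Confirm the state and the file the log is written to.
SHOW GLOBAL VARIABLES LIKE 'slow_query_log%';

-- Turn it off again once the problem window has passed.
SET GLOBAL slow_query_log = 'OFF';
```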
This is the output of the slow query log in its raw form. I just opened up the .log file and looked at what was inside. Queries are not formatted automatically for you.\n\nPlease note the time, the query itself, Query_time, Lock_time and Rows_examined.\n
\n
\n
- groups queries that are similar except for the particular values of number and string data values. It “abstracts” these values to N and 'S' when displaying summary output.\n
General summary information from the file. After this comes information on the specific queries.\n
By default mysqldumpslow sorts by average query time. You can also sort by lock time, rows sent, or the count. We are going to work on the query with the highest count.\n
In order to tune a query we need a number of things. Each of them provides information to help you out.\n\nCREATE TABLE - structure of the table: indexes available, datatypes, storage engine\nTABLE STATUS - number of rows, storage engine, collation, create options, comments\nSHOW INDEXES - key name, sequence in the index, collation, cardinality, index type\n\nEXPLAIN - the execution plan. We will go into greater detail on this, since it tells you how MySQL will handle the query. Used with the other information, you can then work to optimize the query.\n\n
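The four pieces of information listed above can be gathered as sketched here. The table name talk_speaker and the columns talk_id and speaker_id come from later notes in this deck; the SELECT and the value 42 are hypothetical stand-ins for whatever query is being tuned.

```sql
-- Table structure: columns, datatypes, indexes, storage engine.
-- (\G is a mysql command-line client feature for vertical output.)
SHOW CREATE TABLE talk_speaker\G

-- Row count estimate, engine, collation, create options, comments.
SHOW TABLE STATUS LIKE 'talk_speaker'\G

-- Key name, sequence in the index, collation, cardinality, index type.
SHOW INDEXES FROM talk_speaker;

-- Execution plan for the (hypothetical) query under investigation.
EXPLAIN SELECT talk_id FROM talk_speaker WHERE speaker_id = 42;
```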
\n
EXPLAIN EXTENDED works with SHOW WARNINGS\nUPDATE/DELETE statements can be rewritten as SELECTs that work with EXPLAIN.\n
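A short sketch of both points, assuming a MySQL 5.5-era server as referenced elsewhere in these notes. The query against talk_speaker and the value 42 are hypothetical.

```sql
-- EXPLAIN EXTENDED adds extra information; SHOW WARNINGS, run right
-- after it, displays the query as rewritten by the optimizer.
EXPLAIN EXTENDED
SELECT talk_id FROM talk_speaker WHERE speaker_id = 42;
SHOW WARNINGS;

-- Statement to tune (not executed here):
--   DELETE FROM talk_speaker WHERE speaker_id = 42;
-- The same WHERE clause as a SELECT, which EXPLAIN accepts:
EXPLAIN SELECT * FROM talk_speaker WHERE speaker_id = 42;
```

From MySQL 5.6 onwards EXPLAIN also accepts UPDATE and DELETE statements directly, so the rewrite is only needed on older servers.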
\n
\n
A derived table is a subquery in the FROM clause that creates a temporary table.\n\n
SIMPLE - Normal\nPRIMARY - Outer SELECT\nDERIVED - subquery in the FROM clause that makes a temp table\nUNION - self-explanatory\nSUBQUERY - not in a FROM clause\n(UNION and SUBQUERY can be DEPENDENT if they are linked to the outer query)\n
point 2 - regardless of what order they are in for the query\npoint 3 - there are no indexes on derived tables.\n\nNOTE: it may be better to explicitly make a temp table and index it than to use a derived table\n\n
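The NOTE above can be sketched as follows. The tables talk and talk_speaker, the column title, the temporary table name and the index name are all hypothetical and only illustrate the pattern; whether it beats a derived table has to be measured on your own data.

```sql
-- Materialize the subquery result explicitly instead of as a derived table.
CREATE TEMPORARY TABLE tmp_speaker_talks
SELECT DISTINCT talk_id
FROM talk_speaker
WHERE speaker_id = 42;

-- Unlike a derived table, an explicit temporary table can be indexed.
ALTER TABLE tmp_speaker_talks ADD INDEX idx_talk_id (talk_id);

-- Join the indexed temporary table into the outer query.
SELECT t.talk_id, t.title
FROM talk AS t
JOIN tmp_speaker_talks AS st ON st.talk_id = t.talk_id;

-- Clean up; the table would otherwise live until the session ends.
DROP TEMPORARY TABLE tmp_speaker_talks;
```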
system/const - only one row, from a MEMORY table or a PRIMARY index\neq_ref - index lookup with one row returned\nref - similar to eq_ref but can return more than 1 row\nref_or_null - similar to ref but allows NULL values or conditions\nindex_merge - allows you to use 2 indexes on 1 table\nrange - for range values (<, >, <=, >=, LIKE, IN, BETWEEN)\nindex - full index scan\nALL - full table scan\n
\n
If you have a list of possible keys but the key chosen is NULL, it may be that you have hit the threshold where it is better to do a sequential scan rather than the random IO for the amount of data to be read.\n
- range types are known to potentially only use part of an index\na varchar(32) can be seen as 96 bytes because utf8 uses up to 3 bytes per character\n
Ex: what column a join is being done on\n
each engine updates its own statistics with varying levels of accuracy \n\n
\n
Seeing values here is not by definition a “bad thing”.\n\nUsing filesort - does not mean the sort has to be on disk; it can be in RAM\nUsing where - examples:\n1) a column you are restricting on can be NULL; that would then potentially be filtered at the server level\n2) the WHERE condition includes a restriction that can’t be done through the index used\n
\n
ambiguous table aliases\nkeywords not in upper case\ntheta join instead of ANSI join\nsubquery\nORDER BY RAND()\nDISTINCT in the inner query - is it really needed?\n
For the sample query.\n
\n
Steps that we took to optimize query.\n
Low hanging fruit.\nWe have dependent subqueries... and to confuse the situation even more we do not know if it is a correlated subquery since we use the same table aliases within it and without. This is bad!!!\n
We are moving the subquery from the WHERE clause to the FROM clause. With the subquery in the WHERE clause we potentially cause it to be run for each matching record (Rows_examined in the slow query log was ~4.5 million... OUCH). By moving it to the FROM clause, we will only be running the subquery once.\n\nI am also removing the ORDER BY RAND(). I want to do this for a couple of reasons, including the fact that it has to generate a random number for each of the ~4.5 million rows (keep in mind collisions). This is a lot of processing that has to be done just to get 15 rows.\n\nI can get away with removing the ORDER BY RAND() in this specific case because of the derived table. It will be semi-randomizing the data with the JOINing, which in this case is adequate for our needs. This may or may not be something you can do in your queries. If it isn’t, google it. In my research I found a number of interesting solutions you can explore to see what may work best in your situation.\n
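The shape of this rewrite can be sketched as below. This is not the sample query from the slides: the tables talk and talk_speaker, the column title, the alias picked and the value 42 are hypothetical and only illustrate moving a subquery from WHERE to FROM.

```sql
-- Before: subquery in the WHERE clause. On older MySQL versions it can be
-- executed as a dependent subquery, once per candidate row of the outer table.
SELECT t.talk_id, t.title
FROM talk AS t
WHERE t.talk_id IN (SELECT ts.talk_id
                    FROM talk_speaker AS ts
                    WHERE ts.speaker_id = 42)
ORDER BY RAND()
LIMIT 15;

-- After: the subquery becomes a derived table that is evaluated once,
-- with distinct aliases inside and outside, and no ORDER BY RAND().
SELECT t.talk_id, t.title
FROM talk AS t
JOIN (SELECT DISTINCT ts.talk_id
      FROM talk_speaker AS ts
      WHERE ts.speaker_id = 42) AS picked
  ON picked.talk_id = t.talk_id
LIMIT 15;
```

Without ORDER BY RAND() the 15 rows are no longer random in any strict sense, so this form only fits cases where that is acceptable.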
But now we are using the speaker_id column for JOINing and querying on. We need to make sure these are indexed.\n\n
Looking at the CREATE TABLE for talk_speaker, speaker_id does not seem to be indexed, so we need to add an index for it. We do that with an ALTER statement.\n\nAgain we can handle this a couple of ways. We can just add the speaker_id index, or we can see if a composite index of speaker_id and talk_id would be more beneficial. This would have to be tested to see which index helps us out more.\n\nThis is just a single example of what can be done with a query. With all these changes we would want to look again at the EXPLAIN plan of the new query to make sure that things are happening as we desire. If they are, we would then want to review the EXPLAIN plan to see what (if anything) can additionally be done to speed things up. We could work on tuning the subquery itself, or test to see if it would be more performant to explicitly create a temporary table, index it, and then JOIN that into the outer query (derived tables do not have indexes).\n\n
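The two index options mentioned above might look like this. The table and column names come from these notes; the index names and the query used for re-checking are hypothetical. The options are alternatives to compare, not statements to run together.

```sql
-- Option 1: index speaker_id on its own.
ALTER TABLE talk_speaker ADD INDEX idx_speaker (speaker_id);

-- Option 2: composite index, which can also satisfy lookups of talk_id
-- for a given speaker_id from the index alone.
-- ALTER TABLE talk_speaker ADD INDEX idx_speaker_talk (speaker_id, talk_id);

-- After either change, re-check the plan to confirm the index is used.
EXPLAIN SELECT talk_id FROM talk_speaker WHERE speaker_id = 42;
```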