SlideShare ist ein Scribd-Unternehmen logo
1 von 62
Downloaden Sie, um offline zu lesen
Parallel Replication in MySQL 5.7 and 8.0
by Booking.com
Presented at Pre-FOSDEM MySQL Day on Friday February 2nd, 2018
Eduardo Ortega (MySQL Database Engineer)
eduardo DTA ortega AT booking.com
Jean-François Gagné (System Engineer)
jeanfrancois DOT gagne AT booking.com
2
Booking.com
● Based in Amsterdam since 1996
● Online Hotel and Accommodation (Travel) Agent (OTA):
● +1.636.000 properties in 229 countries
● +1.555.000 room nights reserved daily
● +40 languages (website and customer service)
● +15.000 people working in 198 offices worldwide
● Part of the Priceline Group
● And we use MySQL:
● Thousands (1000s) of servers
3
Booking.com’
● And we are hiring !
● MySQL Engineer / DBA
● System Administrator
● System Engineer
● Site Reliability Engineer
● Developer / Designer
● Technical Team Lead
● Product Owner
● Data Scientist
● And many more…
● https://workingatbooking.com/ 4
Session Summary
1. Introducing Parallel Replication (// Replication)
2. MySQL 5.7: Logical Clock and Intervals
3. MySQL 5.7: Tuning Intervals
4. Write Set in MySQL 8.0
5. Benchmark results from Booking.com with MySQL 8.0
5
// Replication
● Relatively new because it is hard
● It is hard because of data consistency
● Running trx in // must give the same result on all slaves (= the master)
● Why is it important ?
● Computers have many Cores, using a single one for writes is a waste
● Some computer resources can give more throughput when used in parallel
(RAID1 has 2 disks  we can do 2 Read IOs in parallel)
(SSDs can serve many Read and/or Write IOs in parallel)
6
Reminder
● MySQL 5.6 has support for schema based parallel replication
● MySQL 5.7 adds support for logical clock parallel replication
● In early version, the logical clock is group commit based
● In current version, the logical clock is interval based
● MySQL 8.0 adds support for Write Set parallelism identification
● Write Set can also be found in MySQL 5.7 in Group replication
7
MySQL 5.7: LOGICAL CLOCK
● MySQL 5.7 has two slave_parallel_type:
● both need “SET GLOBAL slave_parallel_workers = N;” (with N > 1)
● DATABASE: the schema based // replication from 5.6 (not what we are talking about here)
● LOGICAL_CLOCK: “Transactions that are part of the same binary log group commit on a
master are applied in parallel on a slave.” (from the doc. but not exact: Bug#85977)
● the LOGICAL_CLOCK type is implemented by putting interval information in the binary logs
● LOGICAL_CLOCK is limited by the following:
● Problems with long/big transactions
● Problems with intermediate masters (IM)
● And it is optimized by slowing down the master to speedup the slave:
● binlog_group_commit_sync_delay
● binlog_group_commit_sync_no_delay_count
8
MySQL 5.7: LOGICAL CLOCK’
● Long transactions can block the parallel execution pipeline
● On the master: ---------------- Time --------------->
T1: B-------------------------C
T2: B--C
T3: B--C
● On the slaves: T1: B-------------------------C
T2: B-- . . . . . . . . . . . C
T3: B-- . . . . . . . . . . . C
 Try reducing as much as possible the number of big transactions:
• Easier said than done: 10 ms is big compared to 1 ms
 Avoid monster transactions (LOAD DATA, unbounded UPDATE or DELETE, …)
9
MySQL 5.7: LOGICAL CLOCK’’
● Replicating through intermediate masters (IM) shorten intervals
● Four transactions on X, Y and Z:
+---+
| X |
+---+
|
V
+---+
| Y |
+---+
|
V
+---+
| Z |
+---+
● To get maximum replication speed, replace IM by Binlog Servers (or use MySQL 8.0)
● More details at http://blog.booking.com/better_parallel_replication_for_mysql.html
10
On Y:
----Time---->
B---C
B---C
B-------C
B-------C
On Z:
----Time--------->
B---C
B---C
B-------C
B-------C
On X:
----Time---->
T1 B---C
T2 B---C
T3 B-------C
T4 B-------C
MySQL 5.7: LOGICAL CLOCK’’’
● By default, MySQL 5.7 in logical clock does out-of-order commit:
 There will be gaps (“START SLAVE UNTIL SQL_AFTER_MTS_GAPS;”)
● Not replication crash safe without GTIDs
http://jfg-mysql.blogspot.com/2016/01/replication-crash-safety-with-mts.html
● And also be careful about these:
binary logs content, SHOW SLAVE STATUS, skipping transactions, backups, …
● Using slave_preserve_commit_order = 1 does what you expect:
● This configuration does not generate gap
● But it needs log_slave_updates (feature request to remove this limitation: Bug#75396)
● Still it is not replication crash safe (surprising because no gap): Bug#80103 & Bug#81840
● And it can hang if slave_transaction_retries is too low: Bug#89247
11
MySQL // Replication Guts: Intervals
● In MySQL (5.7 and higher), each transaction is tagged with two (2) numbers:
● sequence_number: increasing id for each trx (not to confuse with GTID)
● last_committed: sequence_number of the latest trx on which this trx depends
(This can be understood as the “write view” of the current transaction)
● The last_committed / sequence_number pair is the parallelism interval
● Here an example of intervals for MySQL 5.7:
...
#170206 20:08:33 ... last_committed=6201 sequence_number=6203
#170206 20:08:33 ... last_committed=6203 sequence_number=6204
#170206 20:08:33 ... last_committed=6203 sequence_number=6205
#170206 20:08:33 ... last_committed=6203 sequence_number=6206
#170206 20:08:33 ... last_committed=6205 sequence_number=6207
... 12
MySQL 5.7 – Intervals Generation
MySQL 5.7 leverages parallelism on the master to generate intervals:
● sequence_number is an increasing id for each trx (not GTID)
(Reset to 1 at the beginning of each new binary log)
● last_committed is (in MySQL 5.7) the sequence number of the most recently
committed transaction when the current transaction gets its last lock
(Reset to 0 at the beginning of each new binary log)
...
#170206 20:08:33 ... last_committed=6201 sequence_number=6203
#170206 20:08:33 ... last_committed=6203 sequence_number=6204
#170206 20:08:33 ... last_committed=6203 sequence_number=6205
#170206 20:08:33 ... last_committed=6203 sequence_number=6206
#170206 20:08:33 ... last_committed=6205 sequence_number=6207
... 13
MySQL – Intervals Quality
● For measuring parallelism identification quality with MySQL,
we have a metric: the Average Modified Interval Length (AMIL)
● If we prefer to think in terms of group commit size, the AMIL can be mapped
to a pseudo-group commit size by multiplying the AMIL by 2 and subtracting one
● For a group commit of size n, the sum of the intervals length is n*(n+1) / 2
#170206 20:08:33 ... last_committed=6203 sequence_number=6204
#170206 20:08:33 ... last_committed=6203 sequence_number=6205
#170206 20:08:33 ... last_committed=6203 sequence_number=6206
14
MySQL – Intervals Quality
● For measuring parallelism identification quality with MySQL,
we have a metric: the Average Modified Interval Length (AMIL)
● If we prefer to think in terms of group commit size, the AMIL can be mapped
to a pseudo-group commit size by multiplying the AMIL by 2 and subtracting one
● For a group commit of size n, the sum of the intervals length is n*(n+1)/2
 AMIL = (n+1)/2 (after dividing by n), algebra gives us n = AMIL * 2 - 1
● This mapping could give a hint for slave_parallel_workers
(http://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html)
15
MySQL – Intervals Quality’
● Why do we need to “modify” the interval length ?
● Because of a limitation in the current MTS applier which will only start trx 93136
once 93131 is completed  last_committed=93124 is modified to 93131
#170206 21:19:31 ... last_committed=93124 sequence_number=93131
#170206 21:19:31 ... last_committed=93131 sequence_number=93132
#170206 21:19:31 ... last_committed=93131 sequence_number=93133
#170206 21:19:31 ... last_committed=93131 sequence_number=93134
#170206 21:19:31 ... last_committed=93131 sequence_number=93135
#170206 21:19:31 ... last_committed=93124 sequence_number=93136
#170206 21:19:31 ... last_committed=93131 sequence_number=93137
#170206 21:19:31 ... last_committed=93131 sequence_number=93138
#170206 21:19:31 ... last_committed=93132 sequence_number=93139
#170206 21:19:31 ... last_committed=93138 sequence_number=93140
16
MySQL – Intervals Quality’’
● Script to compute the Average Modified Interval Length:
file=my_binlog_index_file;
echo _first_binlog_to_analyse_ > $file;
mysqlbinlog --stop-never -R --host 127.0.0.1 $(cat $file) |
grep "^#" | grep -e last_committed -e "Rotate to" |
awk -v file=$file -F "[ t]*|=" '$11 == "last_committed" {
if (length($2) == 7) {$2 = "0" $2;}
if ($12 < max) {$12 = max;} else {max = $12;}
print $1, $2, $14 - $12;}
$10 == "Rotate"{print $12 > file; close(file); max=0;}' |
awk -F " |:" '{my_h = $2 ":" $3 ":" $4;}
NR == 1 {d=$1; h=my_h; n=0; sum=0; sum2=0;}
d != $1 || h < my_h {print d, h, n, sum, sum2; d=$1; h=my_h;}
{n++; sum += $5; sum2 += $5 * $5;}'
(https://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html)
17
MySQL – Intervals Quality’’’
● Computing the AMIL needs parsing the binary logs
● This is complicated and needs to handle many special cases
● Exposing counters for computing the AMIL would be better:
● Bug#85965: Expose, on the master, counters for monitoring // information quality.
● Bug#85966: Expose, on slaves, counters for monitoring // information quality.
(https://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html)
18
MySQL 5.7 – Tuning
● AMIL without and with tuning (delay) on four (4) Booking.com masters:
(speed-up the slaves by increasing binlog_group_commit_sync_delay)
19
MySQL 8.0 – Write Set
● MySQL 8.0.1 introduced a new way to identify parallelism
● Instead of setting last_committed to “the seq. number of the most
recently committed transaction when the current trx gets its last lock”…
● MySQL 8.0.1 uses “the sequence number of the last transaction that
updated the same rows as the current transaction”
● To do that, MySQL 8.0 remembers which rows (tuples) are modified by each
transaction: this is the Write Set
● Write Set are not put in the binary logs, they allow to “widen” the intervals
20
MySQL 8.0 – Write Set’
● MySQL 8.0.1 introduces new global variables to control Write Set:
● transaction_write_set_extraction = [ OFF | XXHASH64 ]
● binlog_transaction_dependency_history_size (default to 25000)
● binlog_transaction_dependency_tracking = [ COMMIT_ORDER | WRITESET_SESSION | WRITESET ]
● WRITESET_SESSION: no two updates from the same session can be reordered
● WRITESET: any transactions which write different tuples can be parallelized
● WRITESET_SESSION will not work well for cnx recycling (Cnx Pools or Proxies):
● Recycling a connection with WRITESET_SESSION impedes parallelism identification
● Unless using the function reset_connection (with Bug#86063 fixed in 8.0.4)
21
MySQL 8.0 – Write Set’’
● To use Write Set on a Master:
● transaction_write_set_extraction = XXHASH64
● binlog_transaction_dependency_tracking = [ WRITESET_SESSION | WRITESET ]
(if WRITESET, slave_preserve_commit_order can avoid temporary inconsistencies)
● To use Write Set on an Intermediate Master (even single-threaded):
● transaction_write_set_extraction = XXHASH64
● binlog_transaction_dependency_tracking = WRITESET
(slave_preserve_commit_order can avoid temporary inconsistencies)
● To stop using Write Set:
● binlog_transaction_dependency_tracking = COMMIT_ORDER
● transaction_write_set_extraction = OFF
22
MySQL 8.0 – Write Set’’’
● Result for single-threaded Booking.com Intermediate Master (before and after):
#170409 3:37:13 [...] last_committed=6695 sequence_number=6696 [...]
#170409 3:37:14 [...] last_committed=6696 sequence_number=6697 [...]
#170409 3:37:14 [...] last_committed=6697 sequence_number=6698 [...]
#170409 3:37:14 [...] last_committed=6698 sequence_number=6699 [...]
#170409 3:37:14 [...] last_committed=6699 sequence_number=6700 [...]
#170409 3:37:14 [...] last_committed=6700 sequence_number=6701 [...]
#170409 3:37:14 [...] last_committed=6700 sequence_number=6702 [...]
#170409 3:37:14 [...] last_committed=6700 sequence_number=6703 [...]
#170409 3:37:14 [...] last_committed=6700 sequence_number=6704 [...]
#170409 3:37:14 [...] last_committed=6704 sequence_number=6705 [...]
#170409 3:37:14 [...] last_committed=6700 sequence_number=6706 [...]
23
MySQL 8.0 – Write Set’’’ ’
#170409 3:37:17 [...] last_committed=6700 sequence_number=6766 [...]
#170409 3:37:17 [...] last_committed=6752 sequence_number=6767 [...]
#170409 3:37:17 [...] last_committed=6753 sequence_number=6768 [...]
#170409 3:37:17 [...] last_committed=6700 sequence_number=6769 [...]
[...]
#170409 3:37:18 [...] last_committed=6700 sequence_number=6783 [...]
#170409 3:37:18 [...] last_committed=6768 sequence_number=6784 [...]
#170409 3:37:18 [...] last_committed=6784 sequence_number=6785 [...]
#170409 3:37:18 [...] last_committed=6785 sequence_number=6786 [...]
#170409 3:37:18 [...] last_committed=6785 sequence_number=6787 [...]
[...]
#170409 3:37:22 [...] last_committed=6785 sequence_number=6860 [...]
#170409 3:37:22 [...] last_committed=6842 sequence_number=6861 [...]
#170409 3:37:22 [...] last_committed=6843 sequence_number=6862 [...]
#170409 3:37:22 [...] last_committed=6785 sequence_number=6863
MySQL 8.0 – AMIL of Write Set
25
● AMIL on a single-threaded 8.0.1 Intermediate Master (IM) without/with Write Set:
MySQL 8.0 – Write Set vs Delay
● AMIL on Booking.com masters with delay vs Write Set on Intermediate Master:
26
MySQL 8.0 – Write Set’’’ ’’
● Write Set advantages:
● No need to slowdown the master (maybe not true in all cases)
● Will work even at low concurrency on the master
● Allows to test without upgrading the master (works on an intermediate master)
(however, this sacrifices session consistency, which might give optimistic results)
● Mitigate the problem of losing parallelism via intermediate masters
(only with binlog_transaction_dependency_tracking = WRITESET)
( the best solution is still Binlog Servers)
27
MySQL 8.0 – Write Set’’’ ’’’
● Write Set limitations:
● Needs Row-Based-Replication on the master (or intermediate master)
● Not working for trx updating tables without PK and trx updating tables having FK
(it will fall back to COMMIT_ORDER for those transactions)
● Barrier at each DDL (Bug#86060 for adding counters)
● Barrier at each binary log rotation: no transactions in different binlogs can be run in //
● With WRITESET_SESSION, does not play well with connection recycling
(Could use COM_RESET_CONNECTION if Bug#86063 is fixed)
● Write Set drawbacks:
● Slowdown the master ? Consume more RAM ?
● New technology: not fully mastered yet and there are bugs (still 1st DMR release)
28
MySQL 8.0 – Write Set @ B.com
● Tests on eight (8) real Booking.com environments (different workloads):
● A is MySQL 5.6 and 5.7 masters (1 and 7), some are SBR (4) some are RBR (4)
● B is MySQL 8.0.3 Intermediate Master with Write Set (RBR)
set global transaction_write_set_extraction = XXHASH64;
set global binlog_transaction_dependency_tracking = WRITESET;
● C is a slave with local SSD storage
+---+ +---+ +---+
| A | -------> | B | -------> | C |
+---+ +---+ +---+
● Run with 0, 2, 4, 8, 16, 32, 64, 128 and 256 workers,
with High Durability (HD - sync_binlog = 1 & trx commit = 1) and No Durability (ND – 0 & 2),
without and with slave_preserve_commit_order (NO and WO)
with and without log_slave_updates (IM and SB)
MySQL 8.0 – Write Set Speedups
30
E5 IM-HD Single-Threaded: 6138 seconds
E5 IM-ND Single-Threaded: 2238 seconds
MySQL 8.0 – Write Set Speedups’
31
MySQL 8.0 – Write Set Speedups’’
32
MySQL 8.0 – Write Set Speedups’’’
33
MySQL 8.0 – Speedup Summary
● No thrashing when too many threads !
● For the environments with High Durability:
● Two (2) “interesting” speedups: 1.6, 1.7
● One (1) good: 2.7
● Four (4) very good speedups: 4.4, 4.5, 5.6, and 5.8
● One (1) great speedups: 10.8 !
● For the environments without durability (ND):
● Three (3) good speedups: 1.3, 1.5 and 1.8
● Three (3) very good speedups: 2.4, 2.8 and 2.9
● Two (2) great speedups: 3.7 and 4.8 !
● All that without tuning MySQL or the application
MySQL 8.0 – Looking at low speedups
35
MySQL 8.0 – Looking at low speedups’
36
MySQL 8.0 – Looking at good speedups
37
MySQL 8.0 – Looking at good speedups’
38
MySQL 8.0 – Looking at great speedups
39
How do we tune slave_parallel_workers?
Goal of tuning:
find a value that gives good speedup
without wasting too much resources
We can use:
- Commit rate
- Replication catch-up speed
But these are workload - dependent, making optimization harder:
- Workload variations over time
- Hardware differences
Wouldn’t it be cool to have a workload - independent
metric to help optimize slave_parallel_workers?
Such a metric would:
● Have a good value when all of your applier
threads are doing work
● Be sensitive to having idle threads.
Sounds like a resource allocation fairness
problem...
In a fair world...
● slave_parallel_workers = n
● Each worker performs an equal amount of “work”
Say n = 5
Stuff that would not be good:
● No actual parallelism
In an unfair world...
● Poor parallelism
How do we define “work”?
Inspired by Percona blog post by Stephane Combaudon
(https://www.percona.com/blog/2016/02/10/estimating-potential-for-mysql-5-7-parallel-replication/)
● Percentage of commits applied per worker thread
Easy to measure from performance schema:
USE performance_schema;
-- SET UP PS instrumentation
UPDATE setup_consumers SET enabled = 'YES' WHERE NAME LIKE 'events_transactions%';
UPDATE setup_instruments SET enabled = 'YES', timed = 'YES' WHERE NAME = 'transaction';
-- Get information on commits per applier thread
SELECT T.thread_id, T.count_star AS COUNT_STAR
FROM events_transactions_summary_by_thread_by_event_name T
WHERE T.thread_id IN (SELECT thread_id FROM replication_applier_status_by_worker);
What does it give?
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63581 | 394 |
| 63582 | 292 |
| 63583 | 272 |
| 63584 | 260 |
| 63585 | 251 |
| 63586 | 237 |
[...]
| 63606 | 108 |
| 63607 | 106 |
| 63608 | 104 |
| 63609 | 100 |
| 63610 | 98 |
| 63611 | 94 |
| 63612 | 89 |
+-----------+------------+
32 rows in set (0.00 sec)
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63627 | 749 |
| 63628 | 504 |
| 63629 | 457 |
| 63630 | 433 |
| 63631 | 405 |
| 63632 | 384 |
[...]
| 63684 | 73 |
| 63685 | 73 |
| 63686 | 71 |
| 63687 | 68 |
| 63688 | 65 |
| 63689 | 65 |
| 63690 | 62 |
+-----------+------------+
64 rows in set (0.00 sec)
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63704 | 711 |
| 63705 | 468 |
| 63706 | 420 |
| 63707 | 378 |
| 63708 | 359 |
| 63709 | 340 |
[...]
| 63825 | 3 |
| 63826 | 3 |
| 63827 | 3 |
| 63828 | 3 |
| 63829 | 3 |
| 63830 | 2 |
| 63831 | 2 |
+-----------+------------+
128 rows in set (0.00 sec)
What does it give?
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63581 | 394 |
| 63582 | 292 |
| 63583 | 272 |
| 63584 | 260 |
| 63585 | 251 |
| 63586 | 237 |
[...]
| 63606 | 108 |
| 63607 | 106 |
| 63608 | 104 |
| 63609 | 100 |
| 63610 | 98 |
| 63611 | 94 |
| 63612 | 89 |
+-----------+------------+
32 rows in set (0.00 sec)
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63627 | 749 |
| 63628 | 504 |
| 63629 | 457 |
| 63630 | 433 |
| 63631 | 405 |
| 63632 | 384 |
[...]
| 63684 | 73 |
| 63685 | 73 |
| 63686 | 71 |
| 63687 | 68 |
| 63688 | 65 |
| 63689 | 65 |
| 63690 | 62 |
+-----------+------------+
64 rows in set (0.00 sec)
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63704 | 711 |
| 63705 | 468 |
| 63706 | 420 |
| 63707 | 378 |
| 63708 | 359 |
| 63709 | 340 |
[...]
| 63825 | 3 |
| 63826 | 3 |
| 63827 | 3 |
| 63828 | 3 |
| 63829 | 3 |
| 63830 | 2 |
| 63831 | 2 |
+-----------+------------+
128 rows in set (0.00 sec)
How can we understand those numbers: Jain’s index
Jain’s fairness index:
● 𝑛: number of applier threads.
● 𝑖: the i-th applier thread.
● 𝑥𝑖 :amount of work performed by thread
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63627 | 749 |
| 63628 | 504 |
| 63629 | 457 |
| 63630 | 433 |
| 63631 | 405 |
| 63632 | 384 |
[...]
| 63684 | 73 |
| 63685 | 73 |
| 63686 | 71 |
| 63687 | 68 |
| 63688 | 65 |
| 63689 | 65 |
| 63690 | 62 |
+-----------+------------+
64 rows in set (0.00 sec)
What does Jain’s index mean?
- 0 ≤ J ≤ 1
- For n = 1  J = 1 (but no parallelism)
- If n > 1 and J ≅ 1, all workers are doing similar work
- If n > 1 and J < 1, some workers are doing less work
- If n > 1 and J << 1, some workers are idle
Avoid J ≅ 1
And avoid J << 1
0
1
Putting the idea to the test
5.7 Master
5.7 replica
8.0 Intermediate
master
8.0 replicaGroup commit sync delay = 100 ms
Group commit sync no delay count=50
Writeset enabled
Replication stopped for 24 h
Applier threads: 1 ≤ n ≤ 128
Replication stopped for 24 h
Applier threads: 1 ≤ n ≤ 128
First, something familiar to compare with:
Optimal values for n:
• For 5.7, n = 8 or 16
• For 8.0, n = 32 or 64
Higher values of n would be
wasting resources on
applier threads that have no
work to do.
0
1000
2000
3000
4000
5000
6000
0 20 40 60 80 100 120 140
Commit rate vs n
C / s (5.7) C / s (8.0)
What does Jain´s index have to say?
0
0,2
0,4
0,6
0,8
1
1,2
0 20 40 60 80 100 120 140
J vs n
J (5.7) J (8.0)
Caveats...
● What happens when not catching up but just keeping up with current replication traffic?
Not enough work means idle threads
 Optimal number of workers depends on catching up or keeping up
Another way to look at the numbers?
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63581 | 394 |
| 63582 | 292 |
| 63583 | 272 |
| 63584 | 260 |
| 63585 | 251 |
| 63586 | 237 |
[...]
| 63606 | 108 |
| 63607 | 106 |
| 63608 | 104 |
| 63609 | 100 |
| 63610 | 98 |
| 63611 | 94 |
| 63612 | 89 |
+-----------+------------+
32 rows in set (0.00 sec)
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63627 | 749 |
| 63628 | 504 |
| 63629 | 457 |
| 63630 | 433 |
| 63631 | 405 |
| 63632 | 384 |
[...]
| 63684 | 73 |
| 63685 | 73 |
| 63686 | 71 |
| 63687 | 68 |
| 63688 | 65 |
| 63689 | 65 |
| 63690 | 62 |
+-----------+------------+
64 rows in set (0.00 sec)
> SELECT [...];
+-----------+------------+
| thread_id | COUNT_STAR |
+-----------+------------+
| 63704 | 711 |
| 63705 | 468 |
| 63706 | 420 |
| 63707 | 378 |
| 63708 | 359 |
| 63709 | 340 |
[...]
| 63825 | 3 |
| 63826 | 3 |
| 63827 | 3 |
| 63828 | 3 |
| 63829 | 3 |
| 63830 | 2 |
| 63831 | 2 |
+-----------+------------+
128 rows in set (0.00 sec)
MySQL 8.0 – What about commit order ?
55
MySQL 8.0 – What about commit order ?’
56
MySQL 5.7 and 8.0 // Repl. Summary
● Parallel replication in MySQL 5.7 is not simple:
● Need precise tuning
● Long transactions block the parallel replication pipeline
● Care about Intermediate masters
● Write Set in MySQL 8.0 gives very interesting results:
● No problem with Intermediate masters
● Allows to test with Intermediate Master
● Some great speedups and most of them very good
And please
test by yourself
and share results
// Replication: Links
● Replication crash safety with MTS in MySQL 5.6 and 5.7: reality or illusion?
https://jfg-mysql.blogspot.com/2016/01/replication-crash-safety-with-mts.html
● A Metric for Tuning Parallel Replication in MySQL 5.7
https://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html
● Solving MySQL Replication Lag with LOGICAL_CLOCK and Calibrated Delay
https://www.vividcortex.com/blog/solving-mysql-replication-lag-with-logical_clock-and-calibrated-delay
● How to Fix a Lagging MySQL Replication
https://thoughts.t37.net/fixing-a-very-lagging-mysql-replication-db6eb5a6e15d
● Binlog Servers:
● http://blog.booking.com/mysql_slave_scaling_and_more.html
● Better Parallel Replication for MySQL: http://blog.booking.com/better_parallel_replication_for_mysql.html
● http://blog.booking.com/abstracting_binlog_servers_and_mysql_master_promotion_wo_reconfiguring_slaves.html
59
// Replication: Links’
● An update on Write Set (parallel replication) bug fix in MySQL 8.0
https://jfg-mysql.blogspot.com/2018/01/an-update-on-write-set-parallel-replication-bug-fix-in-mysql-8-0.html
● Write Set in MySQL 5.7: Group Replication
https://jfg-mysql.blogspot.com/2018/01/write-set-in-mysql-5-7-group-replication.html
● More Write Set in MySQL: Group Replication Certification
https://jfg-mysql.blogspot.com/2018/01/more-write-set-in-mysql-5-7-group-replication-certification.html
60
// Replication: Links’’
● Bugs/feature requests:
● The doc. of slave-parallel-type=LOGICAL_CLOCK wrongly reference Group Commit: Bug#85977
● Allow slave_preserve_commit_order without log-slave-updates: Bug#75396
● MTS with slave_preserve_commit_order not repl. crash safe: Bug#80103
● Automatic Repl. Recovery Does Not Handle Lost Relay Log Events: Bug#81840
● Expose, on the master/slave, counters for monitoring // info. quality: Bug#85965 & Bug#85966
● Expose counters for monitoring Write Set barriers: Bug#86060
● Deadlock with slave_preserve_commit_order=ON with Bug#86078: Bug#86079 & Bug#89247
● Fixed bugs:
● Message after MTS crash misleading: Bug#80102 (and Bug#77496)
● Replication position lost after crash on MTS configured slave: Bug#77496
● Full table scan bug in InnoDB: MDEV-10649, Bug#82968 and Bug#82969
● The function reset_connection does not reset Write Set in WRITESET_SESSION: Bug#86063
● Bad Write Set tracking with UNIQUE KEY on a DELETE followed by an INSERT: Bug#86078
Thanks
Eduardo Ortega (MySQL Database Engineer)
eduardo DOT ortega AT booking.com
Jean-François Gagné (System Engineer)
jeanfrancois DOT gagne AT booking.com

Weitere ähnliche Inhalte

Was ist angesagt?

MariaDB 10.11 key features overview for DBAs
MariaDB 10.11 key features overview for DBAsMariaDB 10.11 key features overview for DBAs
MariaDB 10.11 key features overview for DBAsFederico Razzoli
 
MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting Mydbops
 
MariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash CourseMariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash CourseSeveralnines
 
MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바NeoClova
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScaleMariaDB plc
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability Mydbops
 
binary log と 2PC と Group Commit
binary log と 2PC と Group Commitbinary log と 2PC と Group Commit
binary log と 2PC と Group CommitTakanori Sejima
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11Kenny Gryp
 
Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication Mydbops
 
Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기NeoClova
 
Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )Mydbops
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PGConf APAC
 
What is new in MariaDB 10.6?
What is new in MariaDB 10.6?What is new in MariaDB 10.6?
What is new in MariaDB 10.6?Mydbops
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagJean-François Gagné
 
M|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScaleM|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScaleMariaDB plc
 
Innodb Deep Talk #2 でお話したスライド
Innodb Deep Talk #2 でお話したスライドInnodb Deep Talk #2 でお話したスライド
Innodb Deep Talk #2 でお話したスライドYasufumi Kinoshita
 
Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceMariaDB plc
 
Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Mydbops
 
MariaDB Server Performance Tuning & Optimization
MariaDB Server Performance Tuning & OptimizationMariaDB Server Performance Tuning & Optimization
MariaDB Server Performance Tuning & OptimizationMariaDB plc
 
Best practices for MySQL/MariaDB Server/Percona Server High Availability
Best practices for MySQL/MariaDB Server/Percona Server High AvailabilityBest practices for MySQL/MariaDB Server/Percona Server High Availability
Best practices for MySQL/MariaDB Server/Percona Server High AvailabilityColin Charles
 

Was ist angesagt? (20)

MariaDB 10.11 key features overview for DBAs
MariaDB 10.11 key features overview for DBAsMariaDB 10.11 key features overview for DBAs
MariaDB 10.11 key features overview for DBAs
 
MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting MySQL GTID Concepts, Implementation and troubleshooting
MySQL GTID Concepts, Implementation and troubleshooting
 
MariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash CourseMariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash Course
 
MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScale
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability
 
binary log と 2PC と Group Commit
binary log と 2PC と Group Commitbinary log と 2PC と Group Commit
binary log と 2PC と Group Commit
 
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
MySQL Database Architectures - MySQL InnoDB ClusterSet 2021-11
 
Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication
 
Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기
 
Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
 
What is new in MariaDB 10.6?
What is new in MariaDB 10.6?What is new in MariaDB 10.6?
What is new in MariaDB 10.6?
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lag
 
M|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScaleM|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScale
 
Innodb Deep Talk #2 でお話したスライド
Innodb Deep Talk #2 でお話したスライドInnodb Deep Talk #2 でお話したスライド
Innodb Deep Talk #2 でお話したスライド
 
Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performance
 
Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0
 
MariaDB Server Performance Tuning & Optimization
MariaDB Server Performance Tuning & OptimizationMariaDB Server Performance Tuning & Optimization
MariaDB Server Performance Tuning & Optimization
 
Best practices for MySQL/MariaDB Server/Percona Server High Availability
Best practices for MySQL/MariaDB Server/Percona Server High AvailabilityBest practices for MySQL/MariaDB Server/Percona Server High Availability
Best practices for MySQL/MariaDB Server/Percona Server High Availability
 

Ähnlich wie MySQL Parallel Replication by Booking.com

MySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitationsMySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitationsJean-François Gagné
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsJean-François Gagné
 
MySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitationsMySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitationsJean-François Gagné
 
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBWebinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBSeveralnines
 
MySQL Timeout Variables Explained
MySQL Timeout Variables Explained MySQL Timeout Variables Explained
MySQL Timeout Variables Explained Mydbops
 
RedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in Redis
RedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in RedisRedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in Redis
RedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in RedisRedis Labs
 
Webinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera ClusterWebinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera ClusterSeveralnines
 
More on gdb for my sql db as (fosdem 2016)
More on gdb for my sql db as (fosdem 2016)More on gdb for my sql db as (fosdem 2016)
More on gdb for my sql db as (fosdem 2016)Valeriy Kravchuk
 
Upgrade to MySQL 5.6 without downtime
Upgrade to MySQL 5.6 without downtimeUpgrade to MySQL 5.6 without downtime
Upgrade to MySQL 5.6 without downtimeOlivier DASINI
 
Riding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication StreamRiding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication StreamJean-François Gagné
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesInMobi Technology
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentJean-François Gagné
 
IT Tage 2019 MariaDB 10.4 New Features
IT Tage 2019 MariaDB 10.4 New FeaturesIT Tage 2019 MariaDB 10.4 New Features
IT Tage 2019 MariaDB 10.4 New FeaturesFromDual GmbH
 
MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8Frederic Descamps
 
Lightweight Transactions at Lightning Speed
Lightweight Transactions at Lightning SpeedLightweight Transactions at Lightning Speed
Lightweight Transactions at Lightning SpeedScyllaDB
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellEmily Ikuta
 
MariaDB 10.2 New Features
MariaDB 10.2 New FeaturesMariaDB 10.2 New Features
MariaDB 10.2 New FeaturesFromDual GmbH
 
PL22 - Backup and Restore Performance.pptx
PL22 - Backup and Restore Performance.pptxPL22 - Backup and Restore Performance.pptx
PL22 - Backup and Restore Performance.pptxVinicius M Grippa
 
Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...
Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...
Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...Mydbops
 
SamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationSamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationYi Pan
 

Ähnlich wie MySQL Parallel Replication by Booking.com (20)

MySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitationsMySQL/MariaDB Parallel Replication: inventory, use-case and limitations
MySQL/MariaDB Parallel Replication: inventory, use-case and limitations
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitations
 
MySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitationsMySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitations
 
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBWebinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
 
MySQL Timeout Variables Explained
MySQL Timeout Variables Explained MySQL Timeout Variables Explained
MySQL Timeout Variables Explained
 
RedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in Redis
RedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in RedisRedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in Redis
RedisConf18 - Fail-Safe Starvation-Free Durable Priority Queues in Redis
 
Webinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera ClusterWebinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera Cluster
 
More on gdb for my sql db as (fosdem 2016)
More on gdb for my sql db as (fosdem 2016)More on gdb for my sql db as (fosdem 2016)
More on gdb for my sql db as (fosdem 2016)
 
Upgrade to MySQL 5.6 without downtime
Upgrade to MySQL 5.6 without downtimeUpgrade to MySQL 5.6 without downtime
Upgrade to MySQL 5.6 without downtime
 
Riding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication StreamRiding the Binlog: an in Deep Dissection of the Replication Stream
Riding the Binlog: an in Deep Dissection of the Replication Stream
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major Features
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
IT Tage 2019 MariaDB 10.4 New Features
IT Tage 2019 MariaDB 10.4 New FeaturesIT Tage 2019 MariaDB 10.4 New Features
IT Tage 2019 MariaDB 10.4 New Features
 
MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8
 
Lightweight Transactions at Lightning Speed
Lightweight Transactions at Lightning SpeedLightweight Transactions at Lightning Speed
Lightweight Transactions at Lightning Speed
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a Nutshell
 
MariaDB 10.2 New Features
MariaDB 10.2 New FeaturesMariaDB 10.2 New Features
MariaDB 10.2 New Features
 
PL22 - Backup and Restore Performance.pptx
PL22 - Backup and Restore Performance.pptxPL22 - Backup and Restore Performance.pptx
PL22 - Backup and Restore Performance.pptx
 
Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...
Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...
Migrate your EOL MySQL servers to HA Complaint GR Cluster / InnoDB Cluster Wi...
 
SamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationSamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentation
 

Mehr von Jean-François Gagné

Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorAlmost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorJean-François Gagné
 
Autopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation DisasterAutopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation DisasterJean-François Gagné
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyJean-François Gagné
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyJean-François Gagné
 
The two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.comThe two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.comJean-François Gagné
 

Mehr von Jean-François Gagné (6)

Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and OrchestratorAlmost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator
 
Autopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation DisasterAutopsy of a MySQL Automation Disaster
Autopsy of a MySQL Automation Disaster
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash Safety
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash Safety
 
The two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.comThe two little bugs that almost brought down Booking.com
The two little bugs that almost brought down Booking.com
 
Autopsy of an automation disaster
Autopsy of an automation disasterAutopsy of an automation disaster
Autopsy of an automation disaster
 

Kürzlich hochgeladen

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Kürzlich hochgeladen (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

MySQL Parallel Replication by Booking.com

  • 1. Parallel Replication in MySQL 5.7 and 8.0 by Booking.com Presented at Pre-FOSDEM MySQL Day on Friday February 2nd, 2018 Eduardo Ortega (MySQL Database Engineer) eduardo DTA ortega AT booking.com Jean-François Gagné (System Engineer) jeanfrancois DOT gagne AT booking.com
  • 2. 2
  • 3. Booking.com ● Based in Amsterdam since 1996 ● Online Hotel and Accommodation (Travel) Agent (OTA): ● +1.636.000 properties in 229 countries ● +1.555.000 room nights reserved daily ● +40 languages (website and customer service) ● +15.000 people working in 198 offices worldwide ● Part of the Priceline Group ● And we use MySQL: ● Thousands (1000s) of servers 3
  • 4. Booking.com’ ● And we are hiring ! ● MySQL Engineer / DBA ● System Administrator ● System Engineer ● Site Reliability Engineer ● Developer / Designer ● Technical Team Lead ● Product Owner ● Data Scientist ● And many more… ● https://workingatbooking.com/ 4
  • 5. Session Summary 1. Introducing Parallel Replication (// Replication) 2. MySQL 5.7: Logical Clock and Intervals 3. MySQL 5.7: Tuning Intervals 4. Write Set in MySQL 8.0 5. Benchmark results from Booking.com with MySQL 8.0 5
  • 6. // Replication ● Relatively new because it is hard ● It is hard because of data consistency ● Running trx in // must give the same result on all slaves (= the master) ● Why is it important ? ● Computers have many Cores, using a single one for writes is a waste ● Some computer resources can give more throughput when used in parallel (RAID1 has 2 disks  we can do 2 Read IOs in parallel) (SSDs can serve many Read and/or Write IOs in parallel) 6
  • 7. Reminder ● MySQL 5.6 has support for schema based parallel replication ● MySQL 5.7 adds support for logical clock parallel replication ● In early version, the logical clock is group commit based ● In current version, the logical clock is interval based ● MySQL 8.0 adds support for Write Set parallelism identification ● Write Set can also be found in MySQL 5.7 in Group replication 7
  • 8. MySQL 5.7: LOGICAL CLOCK ● MySQL 5.7 has two slave_parallel_type: ● both need “SET GLOBAL slave_parallel_workers = N;” (with N > 1) ● DATABASE: the schema based // replication from 5.6 (not what we are talking about here) ● LOGICAL_CLOCK: “Transactions that are part of the same binary log group commit on a master are applied in parallel on a slave.” (from the doc. but not exact: Bug#85977) ● the LOGICAL_CLOCK type is implemented by putting interval information in the binary logs ● LOGICAL_CLOCK is limited by the following: ● Problems with long/big transactions ● Problems with intermediate masters (IM) ● And it is optimized by slowing down the master to speedup the slave: ● binlog_group_commit_sync_delay ● binlog_group_commit_sync_no_delay_count 8
  • 9. MySQL 5.7: LOGICAL CLOCK’ ● Long transactions can block the parallel execution pipeline ● On the master: ---------------- Time ---------------> T1: B-------------------------C T2: B--C T3: B--C ● On the slaves: T1: B-------------------------C T2: B-- . . . . . . . . . . . C T3: B-- . . . . . . . . . . . C  Try reducing as much as possible the number of big transactions: • Easier said than done: 10 ms is big compared to 1 ms  Avoid monster transactions (LOAD DATA, unbounded UPDATE or DELETE, …) 9
  • 10. MySQL 5.7: LOGICAL CLOCK’’ ● Replicating through intermediate masters (IM) shorten intervals ● Four transactions on X, Y and Z: +---+ | X | +---+ | V +---+ | Y | +---+ | V +---+ | Z | +---+ ● To get maximum replication speed, replace IM by Binlog Servers (or use MySQL 8.0) ● More details at http://blog.booking.com/better_parallel_replication_for_mysql.html 10 On Y: ----Time----> B---C B---C B-------C B-------C On Z: ----Time---------> B---C B---C B-------C B-------C On X: ----Time----> T1 B---C T2 B---C T3 B-------C T4 B-------C
  • 11. MySQL 5.7: LOGICAL CLOCK’’’ ● By default, MySQL 5.7 in logical clock does out-of-order commit:  There will be gaps (“START SLAVE UNTIL SQL_AFTER_MTS_GAPS;”) ● Not replication crash safe without GTIDs http://jfg-mysql.blogspot.com/2016/01/replication-crash-safety-with-mts.html ● And also be careful about these: binary logs content, SHOW SLAVE STATUS, skipping transactions, backups, … ● Using slave_preserve_commit_order = 1 does what you expect: ● This configuration does not generate gap ● But it needs log_slave_updates (feature request to remove this limitation: Bug#75396) ● Still it is not replication crash safe (surprising because no gap): Bug#80103 & Bug#81840 ● And it can hang if slave_transaction_retries is too low: Bug#89247 11
  • 12. MySQL // Replication Guts: Intervals ● In MySQL (5.7 and higher), each transaction is tagged with two (2) numbers: ● sequence_number: increasing id for each trx (not to confuse with GTID) ● last_committed: sequence_number of the latest trx on which this trx depends (This can be understood as the “write view” of the current transaction) ● The last_committed / sequence_number pair is the parallelism interval ● Here an example of intervals for MySQL 5.7: ... #170206 20:08:33 ... last_committed=6201 sequence_number=6203 #170206 20:08:33 ... last_committed=6203 sequence_number=6204 #170206 20:08:33 ... last_committed=6203 sequence_number=6205 #170206 20:08:33 ... last_committed=6203 sequence_number=6206 #170206 20:08:33 ... last_committed=6205 sequence_number=6207 ... 12
  • 13. MySQL 5.7 – Intervals Generation MySQL 5.7 leverages parallelism on the master to generate intervals: ● sequence_number is an increasing id for each trx (not GTID) (Reset to 1 at the beginning of each new binary log) ● last_committed is (in MySQL 5.7) the sequence number of the most recently committed transaction when the current transaction gets its last lock (Reset to 0 at the beginning of each new binary log) ... #170206 20:08:33 ... last_committed=6201 sequence_number=6203 #170206 20:08:33 ... last_committed=6203 sequence_number=6204 #170206 20:08:33 ... last_committed=6203 sequence_number=6205 #170206 20:08:33 ... last_committed=6203 sequence_number=6206 #170206 20:08:33 ... last_committed=6205 sequence_number=6207 ... 13
  • 14. MySQL – Intervals Quality ● For measuring parallelism identification quality with MySQL, we have a metric: the Average Modified Interval Length (AMIL) ● If we prefer to think in terms of group commit size, the AMIL can be mapped to a pseudo-group commit size by multiplying the AMIL by 2 and subtracting one ● For a group commit of size n, the sum of the intervals length is n*(n+1) / 2 #170206 20:08:33 ... last_committed=6203 sequence_number=6204 #170206 20:08:33 ... last_committed=6203 sequence_number=6205 #170206 20:08:33 ... last_committed=6203 sequence_number=6206 14
  • 15. MySQL – Intervals Quality ● For measuring parallelism identification quality with MySQL, we have a metric: the Average Modified Interval Length (AMIL) ● If we prefer to think in terms of group commit size, the AMIL can be mapped to a pseudo-group commit size by multiplying the AMIL by 2 and subtracting one ● For a group commit of size n, the sum of the intervals length is n*(n+1)/2  AMIL = (n+1)/2 (after dividing by n), algebra gives us n = AMIL * 2 - 1 ● This mapping could give a hint for slave_parallel_workers (http://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html) 15
  • 16. MySQL – Intervals Quality’ ● Why do we need to “modify” the interval length ? ● Because of a limitation in the current MTS applier which will only start trx 93136 once 93131 is completed  last_committed=93124 is modified to 93131 #170206 21:19:31 ... last_committed=93124 sequence_number=93131 #170206 21:19:31 ... last_committed=93131 sequence_number=93132 #170206 21:19:31 ... last_committed=93131 sequence_number=93133 #170206 21:19:31 ... last_committed=93131 sequence_number=93134 #170206 21:19:31 ... last_committed=93131 sequence_number=93135 #170206 21:19:31 ... last_committed=93124 sequence_number=93136 #170206 21:19:31 ... last_committed=93131 sequence_number=93137 #170206 21:19:31 ... last_committed=93131 sequence_number=93138 #170206 21:19:31 ... last_committed=93132 sequence_number=93139 #170206 21:19:31 ... last_committed=93138 sequence_number=93140 16
  • 17. MySQL – Intervals Quality’’ ● Script to compute the Average Modified Interval Length: file=my_binlog_index_file; echo _first_binlog_to_analyse_ > $file; mysqlbinlog --stop-never -R --host 127.0.0.1 $(cat $file) | grep "^#" | grep -e last_committed -e "Rotate to" | awk -v file=$file -F "[ t]*|=" '$11 == "last_committed" { if (length($2) == 7) {$2 = "0" $2;} if ($12 < max) {$12 = max;} else {max = $12;} print $1, $2, $14 - $12;} $10 == "Rotate"{print $12 > file; close(file); max=0;}' | awk -F " |:" '{my_h = $2 ":" $3 ":" $4;} NR == 1 {d=$1; h=my_h; n=0; sum=0; sum2=0;} d != $1 || h < my_h {print d, h, n, sum, sum2; d=$1; h=my_h;} {n++; sum += $5; sum2 += $5 * $5;}' (https://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html) 17
  • 18. MySQL – Intervals Quality’’’ ● Computing the AMIL needs parsing the binary logs ● This is complicated and needs to handle many special cases ● Exposing counters for computing the AMIL would be better: ● Bug#85965: Expose, on the master, counters for monitoring // information quality. ● Bug#85966: Expose, on slaves, counters for monitoring // information quality. (https://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html) 18
  • 19. MySQL 5.7 – Tuning ● AMIL without and with tuning (delay) on four (4) Booking.com masters: (speed-up the slaves by increasing binlog_group_commit_sync_delay) 19
  • 20. MySQL 8.0 – Write Set ● MySQL 8.0.1 introduced a new way to identify parallelism ● Instead of setting last_committed to “the seq. number of the most recently committed transaction when the current trx gets its last lock”… ● MySQL 8.0.1 uses “the sequence number of the last transaction that updated the same rows as the current transaction” ● To do that, MySQL 8.0 remembers which rows (tuples) are modified by each transaction: this is the Write Set ● Write Set are not put in the binary logs, they allow to “widen” the intervals 20
  • 21. MySQL 8.0 – Write Set’ ● MySQL 8.0.1 introduces new global variables to control Write Set: ● transaction_write_set_extraction = [ OFF | XXHASH64 ] ● binlog_transaction_dependency_history_size (default to 25000) ● binlog_transaction_dependency_tracking = [ COMMIT_ORDER | WRITESET_SESSION | WRITESET ] ● WRITESET_SESSION: no two updates from the same session can be reordered ● WRITESET: any transactions which write different tuples can be parallelized ● WRITESET_SESSION will not work well for cnx recycling (Cnx Pools or Proxies): ● Recycling a connection with WRITESET_SESSION impedes parallelism identification ● Unless using the function reset_connection (with Bug#86063 fixed in 8.0.4) 21
  • 22. MySQL 8.0 – Write Set’’ ● To use Write Set on a Master: ● transaction_write_set_extraction = XXHASH64 ● binlog_transaction_dependency_tracking = [ WRITESET_SESSION | WRITESET ] (if WRITESET, slave_preserve_commit_order can avoid temporary inconsistencies) ● To use Write Set on an Intermediate Master (even single-threaded): ● transaction_write_set_extraction = XXHASH64 ● binlog_transaction_dependency_tracking = WRITESET (slave_preserve_commit_order can avoid temporary inconsistencies) ● To stop using Write Set: ● binlog_transaction_dependency_tracking = COMMIT_ORDER ● transaction_write_set_extraction = OFF 22
  • 23. MySQL 8.0 – Write Set’’’ ● Result for single-threaded Booking.com Intermediate Master (before and after): #170409 3:37:13 [...] last_committed=6695 sequence_number=6696 [...] #170409 3:37:14 [...] last_committed=6696 sequence_number=6697 [...] #170409 3:37:14 [...] last_committed=6697 sequence_number=6698 [...] #170409 3:37:14 [...] last_committed=6698 sequence_number=6699 [...] #170409 3:37:14 [...] last_committed=6699 sequence_number=6700 [...] #170409 3:37:14 [...] last_committed=6700 sequence_number=6701 [...] #170409 3:37:14 [...] last_committed=6700 sequence_number=6702 [...] #170409 3:37:14 [...] last_committed=6700 sequence_number=6703 [...] #170409 3:37:14 [...] last_committed=6700 sequence_number=6704 [...] #170409 3:37:14 [...] last_committed=6704 sequence_number=6705 [...] #170409 3:37:14 [...] last_committed=6700 sequence_number=6706 [...] 23
  • 24. MySQL 8.0 – Write Set’’’ ’ #170409 3:37:17 [...] last_committed=6700 sequence_number=6766 [...] #170409 3:37:17 [...] last_committed=6752 sequence_number=6767 [...] #170409 3:37:17 [...] last_committed=6753 sequence_number=6768 [...] #170409 3:37:17 [...] last_committed=6700 sequence_number=6769 [...] [...] #170409 3:37:18 [...] last_committed=6700 sequence_number=6783 [...] #170409 3:37:18 [...] last_committed=6768 sequence_number=6784 [...] #170409 3:37:18 [...] last_committed=6784 sequence_number=6785 [...] #170409 3:37:18 [...] last_committed=6785 sequence_number=6786 [...] #170409 3:37:18 [...] last_committed=6785 sequence_number=6787 [...] [...] #170409 3:37:22 [...] last_committed=6785 sequence_number=6860 [...] #170409 3:37:22 [...] last_committed=6842 sequence_number=6861 [...] #170409 3:37:22 [...] last_committed=6843 sequence_number=6862 [...] #170409 3:37:22 [...] last_committed=6785 sequence_number=6863
  • 25. MySQL 8.0 – AMIL of Write Set 25 ● AMIL on a single-threaded 8.0.1 Intermediate Master (IM) without/with Write Set:
  • 26. MySQL 8.0 – Write Set vs Delay ● AMIL on Booking.com masters with delay vs Write Set on Intermediate Master: 26
  • 27. MySQL 8.0 – Write Set’’’ ’’ ● Write Set advantages: ● No need to slowdown the master (maybe not true in all cases) ● Will work even at low concurrency on the master ● Allows to test without upgrading the master (works on an intermediate master) (however, this sacrifices session consistency, which might give optimistic results) ● Mitigate the problem of losing parallelism via intermediate masters (only with binlog_transaction_dependency_tracking = WRITESET) ( the best solution is still Binlog Servers) 27
  • 28. MySQL 8.0 – Write Set’’’ ’’’ ● Write Set limitations: ● Needs Row-Based-Replication on the master (or intermediate master) ● Not working for trx updating tables without PK and trx updating tables having FK (it will fall back to COMMIT_ORDER for those transactions) ● Barrier at each DDL (Bug#86060 for adding counters) ● Barrier at each binary log rotation: no transactions in different binlogs can be run in // ● With WRITESET_SESSION, does not play well with connection recycling (Could use COM_RESET_CONNECTION if Bug#86063 is fixed) ● Write Set drawbacks: ● Slowdown the master ? Consume more RAM ? ● New technology: not fully mastered yet and there are bugs (still 1st DMR release) 28
  • 29. MySQL 8.0 – Write Set @ B.com ● Tests on eight (8) real Booking.com environments (different workloads): ● A is MySQL 5.6 and 5.7 masters (1 and 7), some are SBR (4) some are RBR (4) ● B is MySQL 8.0.3 Intermediate Master with Write Set (RBR) set global transaction_write_set_extraction = XXHASH64; set global binlog_transaction_dependency_tracking = WRITESET; ● C is a slave with local SSD storage +---+ +---+ +---+ | A | -------> | B | -------> | C | +---+ +---+ +---+ ● Run with 0, 2, 4, 8, 16, 32, 64, 128 and 256 workers, with High Durability (HD - sync_binlog = 1 & trx commit = 1) and No Durability (ND – 0 & 2), without and with slave_preserve_commit_order (NO and WO) with and without log_slave_updates (IM and SB)
  • 30. MySQL 8.0 – Write Set Speedups 30 E5 IM-HD Single-Threaded: 6138 seconds E5 IM-ND Single-Threaded: 2238 seconds
  • 31. MySQL 8.0 – Write Set Speedups’ 31
  • 32. MySQL 8.0 – Write Set Speedups’’ 32
  • 33. MySQL 8.0 – Write Set Speedups’’’ 33
  • 34. MySQL 8.0 – Speedup Summary ● No thrashing when too many threads ! ● For the environments with High Durability: ● Two (2) “interesting” speedups: 1.6, 1.7 ● One (1) good: 2.7 ● Four (4) very good speedups: 4.4, 4.5, 5.6, and 5.8 ● One (1) great speedups: 10.8 ! ● For the environments without durability (ND): ● Three (3) good speedups: 1.3, 1.5 and 1.8 ● Three (3) very good speedups: 2.4, 2.8 and 2.9 ● Two (2) great speedups: 3.7 and 4.8 ! ● All that without tuning MySQL or the application
  • 35. MySQL 8.0 – Looking at low speedups 35
  • 36. MySQL 8.0 – Looking at low speedups’ 36
  • 37. MySQL 8.0 – Looking at good speedups 37
  • 38. MySQL 8.0 – Looking at good speedups’ 38
  • 39. MySQL 8.0 – Looking at great speedups 39
  • 40. How do we tune slave_parallel_workers? Goal of tuning: find a value that gives good speedup without wasting too much resources We can use: - Commit rate - Replication catch-up speed But these are workload - dependent, making optimization harder: - Workload variations over time - Hardware differences
  • 41. Wouldn’t it be cool to have a workload - independent metric to help optimize slave_parallel_workers? Such a metric would: ● Have a good value when all of your applier threads are doing work ● Be sensitive to having idle threads. Sounds like a resource allocation fairness problem...
  • 42. In a fair world... ● slave_parallel_workers = n ● Each worker performs an equal amount of “work” Say n = 5
  • 43. Stuff that would not be good: ● No actual parallelism
  • 44. In an unfair world... ● Poor parallelism
  • 45. How do we define “work”? Inspired by Percona blog post by Stephane Combaudon (https://www.percona.com/blog/2016/02/10/estimating-potential-for-mysql-5-7-parallel-replication/) ● Percentage of commits applied per worker thread Easy to measure from performance schema: USE performance_schema; -- SET UP PS instrumentation UPDATE setup_consumers SET enabled = 'YES' WHERE NAME LIKE 'events_transactions%'; UPDATE setup_instruments SET enabled = 'YES', timed = 'YES' WHERE NAME = 'transaction'; -- Get information on commits per applier thread SELECT T.thread_id, T.count_star AS COUNT_STAR FROM events_transactions_summary_by_thread_by_event_name T WHERE T.thread_id IN (SELECT thread_id FROM replication_applier_status_by_worker);
  • 46. What does it give? > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63581 | 394 | | 63582 | 292 | | 63583 | 272 | | 63584 | 260 | | 63585 | 251 | | 63586 | 237 | [...] | 63606 | 108 | | 63607 | 106 | | 63608 | 104 | | 63609 | 100 | | 63610 | 98 | | 63611 | 94 | | 63612 | 89 | +-----------+------------+ 32 rows in set (0.00 sec) > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63627 | 749 | | 63628 | 504 | | 63629 | 457 | | 63630 | 433 | | 63631 | 405 | | 63632 | 384 | [...] | 63684 | 73 | | 63685 | 73 | | 63686 | 71 | | 63687 | 68 | | 63688 | 65 | | 63689 | 65 | | 63690 | 62 | +-----------+------------+ 64 rows in set (0.00 sec) > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63704 | 711 | | 63705 | 468 | | 63706 | 420 | | 63707 | 378 | | 63708 | 359 | | 63709 | 340 | [...] | 63825 | 3 | | 63826 | 3 | | 63827 | 3 | | 63828 | 3 | | 63829 | 3 | | 63830 | 2 | | 63831 | 2 | +-----------+------------+ 128 rows in set (0.00 sec)
  • 47. What does it give? > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63581 | 394 | | 63582 | 292 | | 63583 | 272 | | 63584 | 260 | | 63585 | 251 | | 63586 | 237 | [...] | 63606 | 108 | | 63607 | 106 | | 63608 | 104 | | 63609 | 100 | | 63610 | 98 | | 63611 | 94 | | 63612 | 89 | +-----------+------------+ 32 rows in set (0.00 sec) > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63627 | 749 | | 63628 | 504 | | 63629 | 457 | | 63630 | 433 | | 63631 | 405 | | 63632 | 384 | [...] | 63684 | 73 | | 63685 | 73 | | 63686 | 71 | | 63687 | 68 | | 63688 | 65 | | 63689 | 65 | | 63690 | 62 | +-----------+------------+ 64 rows in set (0.00 sec) > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63704 | 711 | | 63705 | 468 | | 63706 | 420 | | 63707 | 378 | | 63708 | 359 | | 63709 | 340 | [...] | 63825 | 3 | | 63826 | 3 | | 63827 | 3 | | 63828 | 3 | | 63829 | 3 | | 63830 | 2 | | 63831 | 2 | +-----------+------------+ 128 rows in set (0.00 sec)
  • 48. How can we understand those numbers: Jain’s index Jain’s fairness index: ● 𝑛: number of applier threads. ● 𝑖: the i-th applier thread. ● 𝑥𝑖 :amount of work performed by thread > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63627 | 749 | | 63628 | 504 | | 63629 | 457 | | 63630 | 433 | | 63631 | 405 | | 63632 | 384 | [...] | 63684 | 73 | | 63685 | 73 | | 63686 | 71 | | 63687 | 68 | | 63688 | 65 | | 63689 | 65 | | 63690 | 62 | +-----------+------------+ 64 rows in set (0.00 sec)
  • 49. What does Jain’s index mean? - 0 ≤ J ≤ 1 - For n = 1  J = 1 (but no parallelism) - If n > 1 and J ≅ 1, all workers are doing similar work - If n > 1 and J < 1, some workers are doing less work - If n > 1 and J << 1, some workers are idle Avoid J ≅ 1 And avoid J << 1 0 1
  • 50. Putting the idea to the test 5.7 Master 5.7 replica 8.0 Intermediate master 8.0 replicaGroup commit sync delay = 100 ms Group commit sync no delay count=50 Writeset enabled Replication stopped for 24 h Applier threads: 1 ≤ n ≤ 128 Replication stopped for 24 h Applier threads: 1 ≤ n ≤ 128
  • 51. First, something familiar to compare with: Optimal values for n: • For 5.7, n = 8 or 16 • For 8.0, n = 32 or 64 Higher values of n would be wasting resources on applier threads that have no work to do. 0 1000 2000 3000 4000 5000 6000 0 20 40 60 80 100 120 140 Commit rate vs n C / s (5.7) C / s (8.0)
  • 52. What does Jain´s index have to say? 0 0,2 0,4 0,6 0,8 1 1,2 0 20 40 60 80 100 120 140 J vs n J (5.7) J (8.0)
  • 53. Caveats... ● What happens when not catching up but just keeping up with current replication traffic? Not enough work means idle threads  Optimal number of workers depends on catching up or keeping up
  • 54. Another way to look at the numbers? > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63581 | 394 | | 63582 | 292 | | 63583 | 272 | | 63584 | 260 | | 63585 | 251 | | 63586 | 237 | [...] | 63606 | 108 | | 63607 | 106 | | 63608 | 104 | | 63609 | 100 | | 63610 | 98 | | 63611 | 94 | | 63612 | 89 | +-----------+------------+ 32 rows in set (0.00 sec) > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63627 | 749 | | 63628 | 504 | | 63629 | 457 | | 63630 | 433 | | 63631 | 405 | | 63632 | 384 | [...] | 63684 | 73 | | 63685 | 73 | | 63686 | 71 | | 63687 | 68 | | 63688 | 65 | | 63689 | 65 | | 63690 | 62 | +-----------+------------+ 64 rows in set (0.00 sec) > SELECT [...]; +-----------+------------+ | thread_id | COUNT_STAR | +-----------+------------+ | 63704 | 711 | | 63705 | 468 | | 63706 | 420 | | 63707 | 378 | | 63708 | 359 | | 63709 | 340 | [...] | 63825 | 3 | | 63826 | 3 | | 63827 | 3 | | 63828 | 3 | | 63829 | 3 | | 63830 | 2 | | 63831 | 2 | +-----------+------------+ 128 rows in set (0.00 sec)
  • 55. MySQL 8.0 – What about commit order ? 55
  • 56. MySQL 8.0 – What about commit order ?’ 56
  • 57. MySQL 5.7 and 8.0 // Repl. Summary ● Parallel replication in MySQL 5.7 is not simple: ● Need precise tuning ● Long transactions block the parallel replication pipeline ● Care about Intermediate masters ● Write Set in MySQL 8.0 gives very interesting results: ● No problem with Intermediate masters ● Allows to test with Intermediate Master ● Some great speedups and most of them very good
  • 58. And please test by yourself and share results
  • 59. // Replication: Links ● Replication crash safety with MTS in MySQL 5.6 and 5.7: reality or illusion? https://jfg-mysql.blogspot.com/2016/01/replication-crash-safety-with-mts.html ● A Metric for Tuning Parallel Replication in MySQL 5.7 https://jfg-mysql.blogspot.com/2017/02/metric-for-tuning-parallel-replication-mysql-5-7.html ● Solving MySQL Replication Lag with LOGICAL_CLOCK and Calibrated Delay https://www.vividcortex.com/blog/solving-mysql-replication-lag-with-logical_clock-and-calibrated-delay ● How to Fix a Lagging MySQL Replication https://thoughts.t37.net/fixing-a-very-lagging-mysql-replication-db6eb5a6e15d ● Binlog Servers: ● http://blog.booking.com/mysql_slave_scaling_and_more.html ● Better Parallel Replication for MySQL: http://blog.booking.com/better_parallel_replication_for_mysql.html ● http://blog.booking.com/abstracting_binlog_servers_and_mysql_master_promotion_wo_reconfiguring_slaves.html 59
  • 60. // Replication: Links’ ● An update on Write Set (parallel replication) bug fix in MySQL 8.0 https://jfg-mysql.blogspot.com/2018/01/an-update-on-write-set-parallel-replication-bug-fix-in-mysql-8-0.html ● Write Set in MySQL 5.7: Group Replication https://jfg-mysql.blogspot.com/2018/01/write-set-in-mysql-5-7-group-replication.html ● More Write Set in MySQL: Group Replication Certification https://jfg-mysql.blogspot.com/2018/01/more-write-set-in-mysql-5-7-group-replication-certification.html 60
  • 61. // Replication: Links’’ ● Bugs/feature requests: ● The doc. of slave-parallel-type=LOGICAL_CLOCK wrongly reference Group Commit: Bug#85977 ● Allow slave_preserve_commit_order without log-slave-updates: Bug#75396 ● MTS with slave_preserve_commit_order not repl. crash safe: Bug#80103 ● Automatic Repl. Recovery Does Not Handle Lost Relay Log Events: Bug#81840 ● Expose, on the master/slave, counters for monitoring // info. quality: Bug#85965 & Bug#85966 ● Expose counters for monitoring Write Set barriers: Bug#86060 ● Deadlock with slave_preserve_commit_order=ON with Bug#86078: Bug#86079 & Bug#89247 ● Fixed bugs: ● Message after MTS crash misleading: Bug#80102 (and Bug#77496) ● Replication position lost after crash on MTS configured slave: Bug#77496 ● Full table scan bug in InnoDB: MDEV-10649, Bug#82968 and Bug#82969 ● The function reset_connection does not reset Write Set in WRITESET_SESSION: Bug#86063 ● Bad Write Set tracking with UNIQUE KEY on a DELETE followed by an INSERT: Bug#86078
  • 62. Thanks Eduardo Ortega (MySQL Database Engineer) eduardo DOT ortega AT booking.com Jean-François Gagné (System Engineer) jeanfrancois DOT gagne AT booking.com