Lessons Learned: Troubleshooting Replication

Slides for the M17 conference in New York, April 11, 2017

Published in: Software

  1. Lessons Learned: Troubleshooting Replication. April 11, 2017. Sveta Smirnova
  2. Sveta Smirnova
      ∙ MySQL Support engineer
      ∙ Author of
          ∙ MySQL Troubleshooting
          ∙ JSON UDF functions
          ∙ FILTER clause for MySQL
      ∙ Speaker
          ∙ Percona Live, OOW, Fosdem, DevConf, HighLoad...
  3. Table of Contents
      ∙ Typical Replication Errors
      ∙ MySQL and MariaDB Replication: Must Know
      ∙ Master
      ∙ Slave IO thread
      ∙ Slave SQL thread
      ∙ Multithreaded slave
      ∙ Multi-master
  4. Typical Replication Errors
  5. Replication Stopped
  6. Slave Lags from the Master
  7. Master Increases Resource Usage
  8. Not a Full List
  9. MySQL and MariaDB Replication: Must Know
  13. Asynchronous
      ∙ Slave: initiates the connection
      ∙ Slave: requests a packet
      ∙ Master: sends the packet
      ∙ Slave: ... ? (the master does not wait to learn what the slave did with the packet)
  18. Semisynchronous Plugin
      ∙ Slave: initiates the connection
      ∙ Slave: requests a packet
      ∙ Master: sends the packet, then waits for an "Ack"
      ∙ Slave: sends the "Ack"
  24. Logical
      ∙ Master: receives a change
      ∙ Master: sends it to the storage engine (SE)
      ∙ Storage engine: writes into the table
      ∙ Storage engine: returns control to the master
      ∙ Master: writes into the binary log
      ∙ Master and storage engine: synchronize
  28. Two Kinds of Slave Threads
      ∙ IO thread: reads from the master, stores events in the relay log
      ∙ SQL thread: reads from the relay log, executes events
  33. Several SQL Threads
      ∙ Multiple SQL threads in MariaDB 10.0.5+ / MySQL 5.6+
      ∙ From the troubleshooting point of view
          ∙ Single IO thread
          ∙ Single relay log
          ∙ Slave lag still possible
          ∙ Error in one thread stops all
  41. Multi-Source (Multi-Channel)
      ∙ Multiple masters in MariaDB 10.0.1+ / MySQL 5.7+
      ∙ From the troubleshooting point of view
          ∙ Multiple sets of relay logs
          ∙ Multiple IO threads
          ∙ Multiple SQL threads
          ∙ MySQL: slave_parallel_workers for each channel
          ∙ Channels/sources are independent
          ∙ Error in one stops only one
          ∙ No automatic conflict resolution
  46. Position-Based
      ∙ You must specify
          ∙ Name of the master's binary log file
          ∙ Position
      ∙ From the troubleshooting point of view
          ∙ An event executes if it is at the current position
          ∙ Easy to skip
          ∙ Easy to move the position backward
          ∙ No conflict resolution
  50. Global Transaction Identifiers (GTID)
      ∙ Each transaction has a unique number: GTID
      ∙ MySQL: AUTO_POSITION=1
      ∙ MariaDB: master_use_gtid = { slave_pos | current_pos }
      ∙ No need to specify binary log and position
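
The difference between position-based and GTID-based setup can be sketched as a small statement builder. This is only an illustration: `change_master_sql` and its parameters are hypothetical names, the password clause is left out on purpose, and the generated text relies on the standard `CHANGE MASTER TO` options (`MASTER_LOG_FILE`, `MASTER_LOG_POS`, `MASTER_AUTO_POSITION`, `MASTER_USE_GTID`).

```python
def change_master_sql(host, port, user, *, log_file=None, log_pos=None, flavor="mysql"):
    """Build a CHANGE MASTER TO statement (hypothetical helper, not a library API).

    With log_file and log_pos the slave is configured position-based.
    Without them, GTID auto-positioning is used and no coordinates are needed.
    """
    parts = [f"MASTER_HOST='{host}'", f"MASTER_PORT={port}", f"MASTER_USER='{user}'"]
    if log_file is not None and log_pos is not None:
        # Position-based: both the binary log name and the offset are mandatory.
        parts += [f"MASTER_LOG_FILE='{log_file}'", f"MASTER_LOG_POS={log_pos}"]
    elif flavor == "mysql":
        parts.append("MASTER_AUTO_POSITION=1")
    elif flavor == "mariadb":
        parts.append("MASTER_USE_GTID=slave_pos")
    else:
        raise ValueError(f"unknown flavor: {flavor}")
    return "CHANGE MASTER TO " + ", ".join(parts) + ";"
```

For example, `change_master_sql("127.0.0.1", 13000, "repl", log_file="master-bin.000001", log_pos=989)` yields a position-based statement, while omitting the last two arguments yields the GTID variant. The user name `repl` is a placeholder.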
  55. Statement-Based Binary Log Format
      ∙ Client sends: INSERT INTO ...
      ∙ Binary log records
          ∙ SET TIMESTAMP...
          ∙ SET sql_mode...
          ∙ INSERT INTO ...
  61. Row-Based Binary Log Format
      ∙ Client sends: UPDATE ...
      ∙ Binary log records
          ∙ SET TIMESTAMP...
          ∙ SET sql_mode...
          ∙ Row before changes
          ∙ Row with changes
  66. Main Instruments
      ∙ Error log file
      ∙ At the slave
          ∙ SHOW SLAVE STATUS
          ∙ MySQL: Tables in Performance Schema
          ∙ System database mysql
      ∙ At the master
          ∙ SHOW MASTER STATUS
          ∙ SHOW BINLOG EVENTS
          ∙ mysqlbinlog
      ∙ Percona Toolkit
      ∙ MySQL Utilities
  67. Replication Must Know: Summary
      ∙ Always available, requires setup
      ∙ Asynchronous
      ∙ Master
          ∙ Keeps all changes in the binary log
            Two formats: ROW and STATEMENT
      ∙ Slave
          ∙ IO thread reads from the master into the relay log
          ∙ SQL thread executes updates
            Multiple SQL threads in 10.0.5+/5.6+
            Multiple channels/sources (masters) in 10.0.1+/5.7+
      ∙ GTID in 10.0.2+/5.6+
  68. Master
  72. Performance
      ∙ More writes
          ∙ binlog_row_image = FULL | MINIMAL | NOBLOB
          ∙ binlog_cache_size
            Watch Binlog_cache_disk_use
          ∙ binlog_stmt_cache_size
            Watch Binlog_stmt_cache_disk_use
      ∙ Synchronization
          ∙ sync_binlog
          ∙ Do not disable!
          ∙ You may set it greater than 1
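
One way to "watch" the cache counters is to turn them into a ratio. The sketch below assumes the counters were already fetched, for example with `SHOW GLOBAL STATUS LIKE 'Binlog_cache%'`, into a dictionary. The function name and the idea of comparing against `Binlog_cache_use` are illustrative assumptions, not something stated on the slide.

```python
def binlog_cache_disk_ratio(status):
    """Share of transactions whose binlog cache spilled to a temporary file.

    `status` maps status variable names to values (strings or ints), as read
    from SHOW GLOBAL STATUS. Returns 0.0 when no transaction used the cache.
    """
    used = int(status.get("Binlog_cache_use", 0))
    spilled = int(status.get("Binlog_cache_disk_use", 0))
    if used == 0:
        return 0.0
    return spilled / used
```

A ratio that keeps growing suggests that binlog_cache_size is too small for the typical transaction. The same calculation can be applied to Binlog_stmt_cache_use and Binlog_stmt_cache_disk_use.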
  75. Behavior
      ∙ Binary log lifetime
          ∙ expire_logs_days
      ∙ Synchronization
          ∙ SBR is not safe with READ COMMITTED and READ UNCOMMITTED
      ∙ Order of records in the binary log
          ∙ Non-deterministic events and SBR
  76. Slave IO thread
  83. Network
      ∙ SHOW SLAVE STATUS
            Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
            ...
            Last_IO_Errno: 1045
            Last_IO_Error: error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 1
            Last_SQL_Errno: 0
            Last_SQL_Error:
            ...
            Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
            Master_Retry_Count: 86400
            Master_Bind:
            Last_IO_Error_Timestamp: 160824 03:18:36
            Last_SQL_Error_Timestamp:
      ∙ P_S.replication_connection_status
            mysql> select * from performance_schema.replication_connection_status\G
            *************************** 1. row ***************************
                         CHANNEL_NAME:
                           GROUP_NAME:
                          SOURCE_UUID:
                            THREAD_ID: NULL
                        SERVICE_STATE: CONNECTING
            COUNT_RECEIVED_HEARTBEATS: 0
             LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
             RECEIVED_TRANSACTION_SET:
                    LAST_ERROR_NUMBER: 1045
                   LAST_ERROR_MESSAGE: error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 4
                 LAST_ERROR_TIMESTAMP: 2016-08-24 03:21:36
            1 row in set (0,01 sec)
      ∙ Error log
            2016-08-24T00:18:36.077384Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 1, Error_code: 1045
            2016-08-24T00:19:36.299011Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 2, Error_code: 1045
            2016-08-24T00:20:36.485315Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 3, Error_code: 1045
            2016-08-24T00:21:36.677915Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 4, Error_code: 1045
            2016-08-24T00:22:36.872066Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 5, Error_code: 1045
      ∙ Access
            $ perror 1045
            MySQL error code 1045 (ER_ACCESS_DENIED_ERROR): Access denied for user '%-.48s'@'%-.64s' (using password: %s)
          ∙ MySQL client with the slave's login and password
                $ mysql -h127.0.0.1 -P13000 -uroot -pbar
                Warning: Using a password on the command line interface can be insecure.
                ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
            SHOW GRANTS
                mysql> SHOW GRANTS;
                +----------------------------------+
                | Grants for foo@%                 |
                +----------------------------------+
                | GRANT SELECT ON *.* TO 'foo'@'%' |
                +----------------------------------+
          ∙ Fix privileges on the master
          ∙ Restart slave
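
The first diagnostic step above, reading the IO thread fields out of `SHOW SLAVE STATUS\G`, can be sketched as a small parser. The helper names and the wording of the returned hints are assumptions made for illustration. Only the field names and error code 1045 come from the slides.

```python
def parse_vertical(text):
    """Parse one row of \\G-style output ("Name: value" per line) into a dict."""
    row = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            row[key.strip()] = value.strip()
    return row


def diagnose_io_thread(row):
    """Return a short, human-readable hint about the IO thread state."""
    if row.get("Slave_IO_Running") == "Yes":
        return "IO thread is running"
    errno = row.get("Last_IO_Errno", "0")
    if errno == "1045":
        return "access denied: check the replication user's password and grants on the master"
    if errno != "0":
        return f"IO thread error {errno}: {row.get('Last_IO_Error', '')}"
    return "IO thread is not running, no error recorded"


SAMPLE = """\
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Last_IO_Errno: 1045
Last_IO_Error: error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 1
Last_SQL_Errno: 0
"""
```

Calling `diagnose_io_thread(parse_vertical(SAMPLE))` on the excerpt from the slide returns the access-denied hint, which points to the same fix: correct the privileges on the master and restart the slave.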
  84. Performance
      ∙ Regular performance troubleshooting
      ∙ Check with command line client
      ∙ Troubleshooting hardware resource usage webinar
  85. Slave SQL thread
  87. SQL thread: typical issues
      ∙ One master, one slave
          ∙ Different data
            Slave cannot execute an event from the relay log
          ∙ Different errors on master and slave
          ∙ Slave lags behind the master
      ∙ Circular replication and other writes in addition to the SQL thread
          ∙ Different data
  90. Different Data
      ∙ Did the table change outside of replication?
          ∙ How?
          ∙ Can it cause a conflict with changes on the master?
      ∙ Are table structures identical?
          ∙ Percona Toolkit: pt-table-checksum, pt-table-sync
          ∙ MySQL Utilities: mysqlrplsync (MySQL only), mysqldbcompare, mysqldiff
      ∙ Are changes in the correct order?
          ∙ mysqlbinlog
          ∙ Application logic on the master
  94. Updates in the wrong order
      ∙ Only with SBR
      ∙ Row-level locks
      ∙ Triggers
          ∙ SET GLOBAL sql_slave_skip_counter (no GTIDs!)
          ∙ Skip transaction (GTIDs)
          ∙ Synchronize tables!
      ∙ Different options: for old versions
          ∙ Start slave with master's options
          ∙ Restart SQL thread
          ∙ Most issues are fixed in recent versions
  98. Slave lags from the master
      ∙ Threads
          ∙ Master executes changes in multiple threads
          ∙ Slave uses one
      ∙ Seconds_behind_master increases
          ∙ You cannot 100% rely on this number!
      ∙ Tune slave performance
      ∙ Multi-threaded slave
        One thread for one database in MySQL 5.6
        There may be conflicts between multiple slave SQL threads
      ∙ Indexes on the slave
        Makes sense for SBR only
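
Because Seconds_behind_master cannot be fully trusted, a complementary check is to compare how far the IO thread has read with how far the SQL thread has executed. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the input is a dictionary built from the `Master_Log_File`, `Read_Master_Log_Pos`, `Relay_Master_Log_File` and `Exec_Master_Log_Pos` fields of SHOW SLAVE STATUS.

```python
def binlog_sequence(name):
    """Numeric suffix of a binary log name, e.g. 'master-bin.000002' -> 2."""
    return int(name.rsplit(".", 1)[1])


def applier_backlog(row):
    """Return (files_behind, bytes_behind) for the SQL thread vs. the IO thread.

    bytes_behind is None when the two threads are in different binary logs,
    because the size of the intermediate files is not known from the status row.
    """
    files = binlog_sequence(row["Master_Log_File"]) - binlog_sequence(row["Relay_Master_Log_File"])
    if files == 0:
        return 0, int(row["Read_Master_Log_Pos"]) - int(row["Exec_Master_Log_Pos"])
    return files, None
```

This measures only the backlog already copied to the relay log. It says nothing about events the IO thread has not yet fetched from the master.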
  99. Multithreaded slave
  106. Performance
      ∙ Single relay log
      ∙ Speed in a highly concurrent environment may be lower than on the master
      ∙ MySQL: slave_parallel_workers
      ∙ MySQL: slave_parallel_type=DATABASE | LOGICAL_CLOCK
      ∙ MariaDB: slave_parallel_threads
      ∙ MariaDB: slave_parallel_max_queued
      ∙ MariaDB: slave_domain_parallel_threads
      ∙ MariaDB: slave_parallel_mode=optimistic | conservative | aggressive | minimal | none
  109. Wrong Behavior
      ∙ Same methods as for single-threaded
      ∙ Error of one thread stops all
            mysql> select WORKER_ID, SERVICE_STATE, LAST_SEEN_TRANSACTION, LAST_ERROR_NUMBER,
                -> LAST_ERROR_MESSAGE from performance_schema.replication_applier_status_by_worker\G
            *************************** 1. row ***************************
                        WORKER_ID: 1
                    SERVICE_STATE: OFF
            LAST_SEEN_TRANSACTION: d318bc17-66dc-11e6-a471-30b5c2208a0f:4988
                LAST_ERROR_NUMBER: 0
               LAST_ERROR_MESSAGE:
            *************************** 2. row ***************************
                        WORKER_ID: 3
                    SERVICE_STATE: OFF
            LAST_SEEN_TRANSACTION: d318bc17-66dc-11e6-a471-30b5c2208a0f:4986
                LAST_ERROR_NUMBER: 1032
               LAST_ERROR_MESSAGE: Worker 2 failed executing transaction...

            MariaDB [test]> select id, command, time, state, progress from information_schema.processlist
                -> where user='system user';
            +----+---------+------+------------------------------------------------------------------+
            | id | command | time | state                                                            |
            +----+---------+------+------------------------------------------------------------------+
            | 25 | Connect | 4738 | Waiting for master to send event                                 |
            | 24 | Connect | 5096 | Slave has read all relay log; waiting for the slave I/O thread t |
            | 23 | Connect |    0 | Waiting for work from SQL thread                                 |
            | 22 | Connect |    0 | Unlocking tables                                                 |
            | 21 | Connect |    0 | Update_rows_log_event::ha_update_row(-1)                         |
            | 20 | Connect |    0 | Waiting for prior transaction to start commit before starting ne |
            | 19 | Connect |    0 | Update_rows_log_event::ha_update_row(-1)                         |
            | 18 | Connect |    0 | Update_rows_log_event::ha_update_row(-1)                         |
            | 17 | Connect |    0 | Update_rows_log_event::find_row(-1)                              |
            ...
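
Since one failing worker stops all of them, the useful row in `replication_applier_status_by_worker` is the one with a non-zero error number. A minimal sketch, assuming the rows were already fetched as dictionaries keyed by column name (the function name is hypothetical):

```python
def failed_workers(rows):
    """Keep only the worker rows that recorded an error.

    Each row is a dict with at least LAST_ERROR_NUMBER. Workers that were
    merely stopped because another worker failed report error number 0.
    """
    return [row for row in rows if int(row["LAST_ERROR_NUMBER"]) != 0]
```

Applied to the two rows shown on the slide, this keeps only the row with error 1032, whose LAST_SEEN_TRANSACTION identifies the transaction to investigate.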
  110. Multi-master
  115. Specifics
      ∙ Replication must be set up for each channel/source
      ∙ You may use masters with and without GTID at the same time
      ∙ Same issues as with regular replication
      ∙ MySQL: Filters work for all channels
      ∙ MariaDB: You may set up filters for each source
  116. Summary
  119. Summary
      ∙ Issues on the master
          ∙ Same as for a standalone server
          ∙ More writes and consistency checks
      ∙ Slave IO thread
          ∙ Common network issues
          ∙ mysql command line client for tests
      ∙ Slave SQL thread
          ∙ Regular query-related issues
          ∙ Regular storage engine issues
          ∙ Fewer execution threads than on the master
  122. More Information
      ∙ Basic Techniques: troubleshooting webinar
      ∙ Troubleshooting hardware webinar
      ∙ Introduction into SE troubleshooting webinar
      ∙ Percona Monitoring and Management
      ∙ Percona Toolkit
      ∙ MySQL Utilities
      ∙ Book MySQL High Availability
      ∙ MySQL Replication Team blog
      ∙ Replication in MariaDB
  123. Time for questions
  124. Thank You!
      http://www.slideshare.net/SvetaSmirnova
      https://twitter.com/svetsmirnova
      https://github.com/svetasmirnova
  125. Appendix
  126. Appendix: Replication in Details
  127. Asynchronous: Slave Q&A
      ∙ Is data up to date?
      ∙ Are these the same?
          ∙ Table structures
          ∙ Storage engine
          ∙ Data
      ∙ Any write can break replication
  137. Semi-synchronous Replication
      ∙ Writes on the master are slower than with asynchronous replication
      ∙ How many "Ack"s does the master wait for?
          ∙ Before 5.7: from a single slave
          ∙ Now in MySQL: rpl_semi_sync_master_wait_for_slave_count
          ∙ Does not wait for the others
      ∙ What does "Ack" mean?
          ∙ Event is written into the relay log
          ∙ No guarantee it is executed
      ∙ What happens in case of a timeout?
          ∙ Replication becomes asynchronous
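
The ack counting and the timeout fallback can be modeled in a few lines. This is a toy model, not the plugin's implementation: the function name and the list of per-slave ack delays are invented for illustration, and the default of 10000 ms mirrors the default of the `rpl_semi_sync_master_timeout` variable.

```python
def wait_for_acks(ack_delays_ms, wait_for_slave_count=1, timeout_ms=10000):
    """Model how long a commit waits and in which mode replication continues.

    ack_delays_ms: time each slave needs to acknowledge the event, in ms.
    Returns ("semisync", waited_ms) when enough acks arrive within the timeout,
    otherwise ("async", timeout_ms): the master stops waiting and falls back.
    """
    in_time = sorted(delay for delay in ack_delays_ms if delay <= timeout_ms)
    if len(in_time) >= wait_for_slave_count:
        # The commit returns as soon as the N-th fastest ack arrives;
        # slower slaves are not waited for.
        return "semisync", in_time[wait_for_slave_count - 1]
    return "async", timeout_ms
```

With delays of 5, 40 and 20000 ms and `wait_for_slave_count=2`, the commit waits 40 ms and the slowest slave is ignored. With `wait_for_slave_count=3` the timeout expires and replication becomes asynchronous.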
  139. Logical Replication
      ∙ Every change is written twice
          ∙ SE files: logs, data, ...
          ∙ Binary log
      ∙ You can write on the slave
  144. Just for Comparison: Physical Replication
      ∙ Does not exist in MySQL/MariaDB!
          ∙ There are two closed-source solutions
      ∙ Master writes only into SE files
      ∙ Which are replicated to the slave
      ∙ From the troubleshooting point of view
          ∙ IO: changes are written only once
          ∙ You cannot write on the slave in parallel
          ∙ Any data inconsistency leads to a replication break
  145. Two Kinds of Threads, Two Kinds of Issues
      ∙ Data transfer
      ∙ Execution
      ∙ Different
          ∙ Diagnostics
          ∙ Fixes
  151. GTID
      ∙ Guaranteed that every transaction will be executed only once
      ∙ Simple failover
      ∙ It is not easy to skip a transaction
          ∙ MySQL: use mysqlslavetrx
          ∙ MariaDB: set global gtid_slave_pos='X-Y-Z';
      ∙ Be careful with expire_logs_days!
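
On MySQL, a commonly used manual way to skip one GTID transaction is to commit an empty transaction under that GTID on the slave. The sketch below only generates the statements. The helper name is hypothetical, and the statements should be reviewed before running them, followed by re-synchronizing the affected tables.

```python
def skip_gtid_statements(gtid):
    """Return the SQL statements that inject an empty transaction for `gtid`.

    `gtid` has the form 'source_uuid:transaction_number'.
    """
    uuid, _, number = gtid.partition(":")
    if not uuid or not number.isdigit():
        raise ValueError(f"expected 'uuid:number', got {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET GTID_NEXT='{gtid}';",
        "BEGIN;",
        "COMMIT;",  # empty transaction: the GTID is now marked as executed
        "SET GTID_NEXT='AUTOMATIC';",
        "START SLAVE;",
    ]
```

For the transaction from the worker status example, `skip_gtid_statements("d318bc17-66dc-11e6-a471-30b5c2208a0f:4986")` produces the six statements in order.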
  156. Binary Log Formats
      ∙ Statement-based (SBR)
          ∙ Queries are written as received
          ∙ There is a risk of data inconsistency (non-safe)
            INSERT IGNORE
            LIMIT without ORDER BY
            Non-deterministic functions
            ...
      ∙ Row-based (RBR)
          ∙ Usually more data are written
            IO
            Transfer speed
            binlog_row_image
          ∙ Performance may be worse if the table does not have a primary (unique) key
            MariaDB may use any index
      ∙ Mixed
          ∙ Advantages of both formats
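
Why a non-deterministic function is unsafe under SBR but harmless under RBR can be shown with a toy model. Plain dictionaries stand in for tables, and `uuid.uuid4()` stands in for the SQL function `UUID()`. None of this is MySQL code, only an illustration of the two replay strategies.

```python
import uuid


def run_on_master(table):
    """Execute the toy statement 'UPDATE t SET token = UUID()' on the master.

    Returns the row images (key, new value) that a row-based log would record.
    """
    for key in table:
        table[key] = str(uuid.uuid4())
    return list(table.items())


def replay_statement(table):
    """SBR: the slave re-executes the statement, so UUID() is evaluated again."""
    for key in table:
        table[key] = str(uuid.uuid4())


def replay_rows(table, row_images):
    """RBR: the slave applies the exact values the master produced."""
    for key, value in row_images:
        table[key] = value
```

Starting from identical copies, the row-based slave ends up equal to the master, while the statement-based slave holds different values in every row.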
  157. Appendix: Main Tools by Example
  160. Error log
      ∙ Slave start
      ∙ Errors
            2016-08-23T12:11:21.867440Z 4 [ERROR] Slave SQL for channel 'master-1': Could not execute Update_rows event on table m2.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1213, Error_code: 1032
            2016-08-23T12:11:21.867471Z 4 [Warning] Slave: Can't find record in 't1' Error_code: 1032
            2016-08-23T12:11:21.867484Z 4 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001' position 989
      ∙ Slave stop
  162. SHOW SLAVE STATUS
      All information about the slave
      ∙ IO thread configuration
      ∙ SQL thread configuration
      ∙ IO thread status
      ∙ SQL thread status
      ∙ Errors
        Only the last one
        All are in the error log
            mysql> show slave status \G
            *************************** 1. row ***************************
            Slave_IO_State: Waiting for master to send event
            Master_Host: 127.0.0.1
            ...
            Master_Log_File: master-bin.000002
            Read_Master_Log_Pos: 63810611
            Relay_Log_File: slave-relay-bin-master@002d1.000004
            Relay_Log_Pos: 1156
            Relay_Master_Log_File: master-bin.000001
            Slave_IO_Running: Yes
            Slave_SQL_Running: No
            ...
            Replicate_Wild_Ignore_Table:
            Last_Errno: 1032
            Last_Error: Could not execute Update_rows event on table m2.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1213
            Skip_Counter: 0
            ...
  167. Tables in Performance Schema
      ∙ No need to parse SHOW output
      ∙ Configuration
          ∙ replication_connection_configuration
          ∙ replication_applier_configuration
                mysql> select * from replication_connection_configuration
                    -> join replication_applier_configuration using(channel_name);
      ∙ IO thread status
          ∙ replication_connection_status
      ∙ SQL thread status
          ∙ replication_applier_status
          ∙ replication_applier_status_by_coordinator (MTS!)
                mysql> select * from replication_applier_status join
                    -> replication_applier_status_by_coordinator
                    -> using(channel_name);
          ∙ replication_applier_status_by_worker
                mysql> select * from replication_applier_status join
                    -> replication_applier_status_by_worker
                    -> using(channel_name);
  170. System database mysql: only on the slave
      ∙ Master info
            mysql> select * from slave_master_info\G
            *************************** 1. row ***************************
            Number_of_lines: 25
            Master_log_name: mysqld-bin.000001
            Master_log_pos: 154
            Host: 127.0.0.1
            User_name: root
            User_password: secret
            Port: 13000
            Connect_retry: 60
            Enabled_ssl: 0
            ...
            Uuid: 31ed7c8f-74ea-11e6-8de8-30b5c2208a0f
            Retry_count: 86400
            ...
            Enabled_auto_position: 1
            ...
      ∙ Relay log info
            mysql> select * from slave_relay_log_info\G
            *************************** 1. row ***************************
            Number_of_lines: 7
            Relay_log_name: ./slave-relay-bin-master@002d1.000004
            Relay_log_pos: 1156
            Master_log_name: master-bin.000001
            Master_log_pos: 989
            Sql_delay: 0
            Number_of_workers: 0
            Id: 1
            Channel_name: master-1
      ∙ Worker info: multi-threaded slave
            mysql> select * from slave_worker_info\G
            *************************** 1. row ***************************
            Id: 1
            ...
            *************************** 8. row ***************************
            Id: 8
            Relay_log_name: ./Thinkie-relay-bin.000004
            Relay_log_pos: 1216
            Master_log_name: mysqld-bin.000001
            Master_log_pos: 1342
            Checkpoint_relay_log_name: ./Thinkie-relay-bin.000004
            Checkpoint_relay_log_pos: 963
            Checkpoint_master_log_name: mysqld-bin.000001
  171. SHOW MASTER STATUS
            mysql> show master status\G
            *************************** 1. row ***************************
            File: master-bin.000005
            Position: 154
            Binlog_Do_DB:
            Binlog_Ignore_DB:
            Executed_Gtid_Set:
            1 row in set (0,00 sec)
  172. SHOW BINLOG EVENTS
            mysql> show binlog events in 'master-bin.000001' from 989;
            +-------------------+------+----------------+-----------+-------------+--------------------------------
            | Log_name          | Pos  | Event_type     | Server_id | End_log_pos | Info
            +-------------------+------+----------------+-----------+-------------+--------------------------------
            | master-bin.000001 |  989 | Anonymous_Gtid |         1 |        1054 | SET @@SESSION.GTID_NEXT= ...
            | master-bin.000001 | 1054 | Query          |         1 |        1124 | BEGIN
            | master-bin.000001 | 1124 | Table_map      |         1 |        1167 | table_id: 109 (m2.t1)
            | master-bin.000001 | 1167 | Update_rows    |         1 |        1213 | table_id: 109 flags: STMT_END_F
            | master-bin.000001 | 1213 | Xid            |         1 |        1244 | COMMIT /* xid=64 */
            +-------------------+------+----------------+-----------+-------------+--------------------------------
            5 rows in set (0,00 sec)
  173. mysqlbinlog
            $ mysqlbinlog var/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213
            ...
            # at 1167
            #160822 14:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b Update_rows: table id 109 flags: STMT_END_F
            BINLOG '
            v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ==
            v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw==
            '/*!*/;
            ROLLBACK /* added by mysqlbinlog */ /*!*/;
            SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
            ...
  174. mysqlbinlog (with -v)
            $ mysqlbinlog -v var/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213
            ...
            # at 1167
            #160822 14:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b Update_rows: table id 109 flags: STMT_END_F
            BINLOG '
            v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ==
            v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw==
            '/*!*/;
            ### UPDATE `m2`.`t1`
            ### WHERE
            ###   @1=5
            ### SET
            ###   @1=6
            ROLLBACK /* added by mysqlbinlog */ /*!*/;
            SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
            ...
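
The `###` pseudo-SQL lines printed by `mysqlbinlog -v` are easy to extract when a failing row event has to be compared with the data on the slave. The parser below is a sketch that assumes the output contains a single UPDATE row, as in the slide above. The function name is hypothetical.

```python
def parse_row_update(verbose_output):
    """Extract (table, before_image, after_image) from `mysqlbinlog -v` text.

    Only lines starting with '###' are considered. Columns keep the positional
    names used by mysqlbinlog (@1, @2, ...), and values are kept as strings.
    """
    table, before, after = None, {}, {}
    section = None
    for raw in verbose_output.splitlines():
        if not raw.startswith("###"):
            continue
        line = raw[3:].strip()
        if line.startswith("UPDATE "):
            table = line[len("UPDATE "):].replace("`", "")
        elif line == "WHERE":
            section = before  # row image before the change
        elif line == "SET":
            section = after   # row image after the change
        elif line.startswith("@") and section is not None:
            column, _, value = line.partition("=")
            section[column] = value
    return table, before, after
```

For the event above this gives table `m2.t1`, a before image of `{"@1": "5"}` and an after image of `{"@1": "6"}`. The slave reported error 1032 because it has no row matching the before image.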
  184. Toolkits
      ∙ Percona Toolkit
          ∙ pt-table-checksum: checks data consistency
          ∙ pt-table-sync: fixes data inconsistencies
          ∙ pt-slave-find: shows topology
      ∙ MySQL Utilities (replication-oriented)
          ∙ mysqlrplcheck: checks if MySQL servers are ready to replicate
          ∙ mysqlrplshow: shows topology
          ∙ mysqlrplsync: checks data consistency
          ∙ mysqlslavetrx: skips 1-N transactions
      ∙ MySQL Utilities (MariaDB-friendly)
          ∙ mysqldbcompare: compares two databases
          ∙ mysqldiff: checks object definitions
          ∙ mysqlserverinfo: shows main options, such as port and datadir
  185. Main Instruments: Summary
      ∙ Error log
      ∙ Slave
          ∙ SHOW SLAVE STATUS
          ∙ Tables in Performance Schema
          ∙ Tables in mysql database
      ∙ Master
          ∙ SHOW MASTER STATUS
          ∙ SHOW BINLOG EVENTS
          ∙ mysqlbinlog
      ∙ mysql command line client
