This presentation describes the reasons for choosing a multi-cloud operating approach and gives an overview of the implementation challenges and how they can be addressed.
There are 6 versions of UUID.
Thus, the probability of finding a duplicate within 103 trillion version-4 UUIDs is one in a billion.
In practice, collisions have been reported, but such incidents are considered software bugs.
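The one-in-a-billion figure can be checked with the standard birthday-bound approximation. This is a minimal sketch, assuming the 122 random bits of a version-4 UUID (6 of the 128 bits are fixed by the version and variant fields); the function name is only illustrative.

```python
import math

RANDOM_BITS = 122  # version-4 UUID: 128 bits minus 6 fixed version/variant bits


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n^2 / (2 * 2^bits))."""
    exponent = (n * n) / (2.0 * 2.0 ** bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


p = collision_probability(103 * 10**12)  # 103 trillion UUIDs
print(f"{p:.2e}")  # on the order of 1e-9, i.e. about one in a billion
```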
Peter Zaitsev (founder of Percona) has done great research on using UUIDs with MySQL.
Why are 16 bytes important? To answer that, we need to understand how indexes work in MySQL/InnoDB.
In InnoDB, the primary key is copied into every secondary index.
If you store a UUID as a string, it takes 36 bytes, versus 16 bytes in binary form.
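To make the 36 vs 16 byte difference concrete, here is a small sketch using Python's standard `uuid` module. The estimate function, row count, and index count are hypothetical, and the estimate ignores page and record overhead; it only illustrates why the key size is multiplied by the number of secondary indexes.

```python
import uuid

u = uuid.uuid4()
as_text = str(u)    # canonical form: 32 hex digits plus 4 hyphens
as_bytes = u.bytes  # the raw 128-bit value

print(len(as_text), len(as_bytes))  # 36 16

# The binary form is lossless: the same UUID can be rebuilt from it.
assert uuid.UUID(bytes=as_bytes) == u


def extra_index_bytes(rows: int, secondary_indexes: int) -> int:
    """Rough extra space when the primary key is a 36-byte string instead of
    16 bytes: the key is stored once in the primary index and once more in
    every secondary index entry."""
    copies_per_row = 1 + secondary_indexes
    return rows * copies_per_row * (len(as_text) - len(as_bytes))


# Hypothetical table: 10 million rows, 3 secondary indexes.
print(extra_index_bytes(10_000_000, 3))  # 800000000 bytes
```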
There can be replication spikes that show more than 1 s of replication delay, but such spikes generally go away within a few seconds.
This holds even if the servers are located far apart.
Light travels through empty space at ~300,000 km per second. Electrical signals in copper propagate at up to 95.1% of the speed of light.
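These numbers give a lower bound on the delay between two data centers. The sketch below uses the figures quoted above (~300,000 km/s and a factor of 0.951); the 6,000 km distance is a made-up example, and real links add routing, switching, and protocol overhead on top of this.

```python
SPEED_OF_LIGHT_KM_S = 300_000  # approximate speed of light in vacuum
COPPER_FACTOR = 0.951          # fraction of light speed quoted in the notes


def one_way_delay_ms(distance_km: float, factor: float = COPPER_FACTOR) -> float:
    """Minimum propagation delay in milliseconds over the given distance."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * factor) * 1000.0


# Hypothetical distance between two clouds: 6,000 km.
delay = one_way_delay_ms(6000)
print(f"one way: {delay:.1f} ms, round trip: {2 * delay:.1f} ms")
```

Even for servers on different continents, the physical floor is tens of milliseconds, well under a second.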
There are plenty of articles about zero-downtime migrations, but briefly:
When you are adding a column, add it with a default value or allow NULL.
Execute the migrations first.
Then deploy the code to the servers.
Be aware that if you change a schema or table that is synced between clouds, your ALTER will be replicated to the other cloud.
When you are removing a column:
First make sure the column is no longer used by anyone.
Only then deploy the migration that removes the column.
When you need to update a column:
If it keeps the same data type, no special action is required.
The most complicated situation is when you need to change a column's type. If you are interested in how to do it, ask me afterwards.
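The add-column ordering above (migration first, code second) can be sketched as follows. To keep the example runnable, it uses Python's built-in `sqlite3` with an in-memory database instead of MySQL; the `customers` table and `region` column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")


def old_code_insert(name):
    # Old application code: it knows nothing about the new column.
    conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))


old_code_insert("before migration")

# Step 1: run the migration. The column allows NULL, so the old INSERT
# statements that do not mention it keep working.
conn.execute("ALTER TABLE customers ADD COLUMN region TEXT DEFAULT NULL")
old_code_insert("after migration, old code")


# Step 2: deploy the new code that actually uses the column.
def new_code_insert(name, region):
    conn.execute(
        "INSERT INTO customers (name, region) VALUES (?, ?)", (name, region)
    )


new_code_insert("after deploy", "eu")

rows = conn.execute("SELECT name, region FROM customers ORDER BY id").fetchall()
print(rows)
```

Removing a column is the mirror image: deploy code that no longer touches the column first, then run the migration that drops it.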
To avoid table locks, we upgraded our system from MySQL 5.5 to 5.7, which supports Online DDL operations.
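With Online DDL you can ask MySQL explicitly for a non-locking change by adding the `ALGORITHM` and `LOCK` clauses to `ALTER TABLE`. The helper below is only a sketch that builds such a statement as a string; the helper name, table, and column are hypothetical, and it does no identifier escaping.

```python
def online_add_column(table: str, column_definition: str) -> str:
    """Build an ALTER TABLE statement that requests an in-place change
    without blocking concurrent reads and writes."""
    return (
        f"ALTER TABLE {table} ADD COLUMN {column_definition}, "
        "ALGORITHM=INPLACE, LOCK=NONE"
    )


statement = online_add_column("customers", "region VARCHAR(8) NULL")
print(statement)
```

If the requested operation cannot be done in place without a lock, MySQL rejects the statement with an error rather than silently locking the table, so the clauses also act as a safety check.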
Make sure you do not hardcode any hostnames in your application.
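One common way to avoid hardcoded hostnames is to read them from the environment, so the same build runs in either cloud. This is a minimal sketch; the variable names `DB_HOST` and `DB_PORT` and the function are assumptions for illustration.

```python
import os


def database_endpoint(environ=os.environ):
    """Read the database host and port from the environment.

    Raises KeyError if DB_HOST is missing, so a misconfigured deployment
    fails at startup instead of silently using a wrong host."""
    host = environ["DB_HOST"]
    port = int(environ.get("DB_PORT", "3306"))  # 3306 is the MySQL default port
    return host, port


# Each cloud supplies its own values; a plain dict stands in for os.environ.
endpoint = database_endpoint({"DB_HOST": "db.cloud-a.internal"})
print(endpoint)  # ('db.cloud-a.internal', 3306)
```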
Google has a good book on Site Reliability Engineering, with a chapter about distributed cron.
Sometimes you need to enable multi-threaded replication so that your slave can catch up with the master and Seconds_Behind_Master drops to 0.
When we execute an ALTER on a customer-related table, it is replicated from the master's binlog into the slave's relay log. That is fine when you run a single replication thread. But when you run multiple threads (workers), how do you make sure that a worker does not execute a ROW/STATEMENT from the relay log that has already been applied? You need to enable GTID, which stands for Global Transaction Identifier. A GTID is associated with each transaction you COMMIT, so when workers read from the relay log what to apply, they verify that this GTID has not already been executed by another worker.
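The idea of "apply each GTID at most once" can be shown with a toy model. This is not how MySQL is implemented; it is only a sketch of the check described above, with a hypothetical `ToyReplica` class, placeholder GTIDs, and made-up statements. It also ignores ordering between transactions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyReplica:
    """Toy model of a replica that tracks which GTIDs it has executed."""

    def __init__(self):
        self._lock = threading.Lock()
        self.executed_gtids = set()
        self.applied = []

    def apply(self, gtid, statement):
        """Apply the statement unless this GTID was already executed."""
        with self._lock:
            if gtid in self.executed_gtids:
                return False  # another worker already applied it
            self.executed_gtids.add(gtid)
            self.applied.append(statement)
            return True


# Placeholder GTIDs in the "source_id:transaction_id" shape.
relay_log = [
    ("source-uuid:1", "ALTER TABLE customers ADD COLUMN region VARCHAR(8) NULL"),
    ("source-uuid:2", "UPDATE customers SET region = 'eu' WHERE id = 1"),
]

replica = ToyReplica()
with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit every event twice to mimic two workers picking up the same event.
    results = list(pool.map(lambda event: replica.apply(*event), relay_log * 2))

print(sum(results), len(replica.applied))  # 2 2
```

However the workers are scheduled, each transaction is applied exactly once, because the second attempt sees the GTID in the executed set and skips it.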
CAP theorem - Consistency / Availability / Partition tolerance (network partition)
Once all of the above is done, we need to implement traffic balancing, and Konstantin will talk about that.