Bill Sickles – I’m a DBA working in the Dev/Ops group of UA Connected Fitness, which started as MapMyFitness in 2006 and was acquired by Under Armour (UA) in 2013.
Currently, we have ~40 million users of the MapMyFitness platform, and with UA’s acquisitions of Endomondo and MyFitnessPal our combined user base totals over 190 million worldwide. As we look to consolidate and align these three platforms, we continue to research and experiment with technologies like Percona XtraDB Cluster (PXC).
Percona Server – a free, fully compatible, open source, drop-in replacement for MySQL. In continuous use at MMF as a primary data store since MMF’s inception in 2006, scaled from a few hundred users to ~40 million.
Codership Galera Library for MySQL – a high availability solution for MySQL: synchronous replication, fault-tolerant failover, multi-master and parallel replication, with relatively easy, "quick" provisioning and scalability (nothing is quick when your database is approaching a terabyte in size).
Percona XtraBackup – a free and open source online backup solution for MySQL. Performs online, non-blocking, compressed backups on transactional (InnoDB) systems.
One package – installed via yum or apt-get on Linux-based systems; you just need to add the Percona repository of choice.
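As a sketch, the repo-plus-package install looks roughly like this (exact package and repo names vary by distro and PXC version; check Percona’s documentation before copying):

```shell
# Debian/Ubuntu (assumes the percona-release bootstrap package)
wget https://repo.percona.com/apt/percona-release_latest.$(lsb_release -sc)_all.deb
sudo dpkg -i percona-release_latest.$(lsb_release -sc)_all.deb
sudo apt-get update
sudo apt-get install percona-xtradb-cluster-57

# RHEL/CentOS
sudo yum install https://repo.percona.com/yum/percona-release-latest.noarch.rpm
sudo yum install Percona-XtraDB-Cluster-57
```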
Synchronous Replication: virtually synchronous; the cluster uses a form of optimistic locking to apply transactions. Much faster than a two-phase commit process, more like a fire-and-forget operation. The main limitation is the round-trip time between cluster nodes.
Multi-master: you can read and write from any node in the cluster. Writing to only one node is suggested: because of the optimistic locking model, conflicting writes from different nodes will be rolled back on the losing node.
Parallel Applying: unlike MySQL async replication, which is currently single-threaded, once transactions are applied on one node, multiple threads can apply those transactions on the other nodes. Whether this works depends on your implementation of referential integrity, namely the use (or misuse) of foreign keys. These rules can be relaxed, but that adds some risk of data inconsistency.
Data Consistency: each node has a full copy of the entire database, kept in sync via synchronous replication. If an inconsistency is detected, the node is removed from the cluster.
Automatic node provisioning: very nice in a cloud environment; we are in AWS and have Auto Scaling groups set up for our databases. When a new node comes online, it joins the cluster and gets a full copy of the existing database from one of the other nodes, using XtraBackup. The cluster manages which nodes are considered in sync.
Highly Available: unlike traditional master-master-with-slaves replication, any node in the cluster can accept writes and be considered the primary master database. So there is no need for the drama of promoting a slave to master, or of performing a failover from one master to another. Either scenario typically requires or causes downtime and risks breaking MySQL replication. Since the cluster is synchronous and multi-master, failover can happen at any time without the inherent risks of traditional MySQL replication topologies.
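Tying the pieces above together, a minimal per-node my.cnf sketch might look like this (cluster name, IPs, and thread count are illustrative, not our production values):

```ini
[mysqld]
wsrep_provider           = /usr/lib64/galera3/libgalera_smm.so
wsrep_cluster_name       = managed-mysql-prod
wsrep_cluster_address    = gcomm://10.0.1.11,10.0.2.12,10.0.3.13
wsrep_node_address       = 10.0.1.11
wsrep_sst_method         = xtrabackup-v2      # full state transfer via XtraBackup
wsrep_slave_threads      = 8                  # parallel applying on this node
binlog_format            = ROW                # required by Galera
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2                  # required for parallel applying
```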
We’ve decided to test this out.
Our first project was a Managed MySQL Service using PXC as the database back-end.
Fully Managed: in-house, by engineers who care about this single service and its individual clients. We want to remove the burden on developers of having to set requirements for instance size, expected storage, provisioned IOPS, Availability Zones/subnets/VPCs, etc.
Simple to Deploy: request a new database and Ops creates it (or perhaps an app will in the future). We give the developer an endpoint and a set of users for access to a MySQL database, plus a Kibana dashboard so they can monitor how things are working: slow queries, too many queries, etc. DevOps has instrumentation to determine the total load across all of the nodes for scaling purposes.
Easy to Scale: add more nodes or split the cluster up, all behind ELB and HAProxy.
Replication: multi-master, synchronous, with Global Transaction IDs. We can also use asynchronous MySQL replication to build out additional slave nodes for increased read traffic, or replicate from external clusters/servers into this cluster (or out of it) for regional capabilities.
Reliable: no single points of failure; reliable crossover/switching between servers, Availability Zones, and (in the future) regions, with good failure detection to reduce the impact of switching resources.
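Attaching an async read slave to the cluster is standard MySQL replication; a sketch with GTID auto-positioning might look like the following (hostnames and credentials are placeholders; assumes gtid_mode=ON and log_slave_updates on the donor node):

```sql
-- Run on the new async slave
CHANGE MASTER TO
  MASTER_HOST = 'pxc-node2.internal',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = '********',
  MASTER_AUTO_POSITION = 1;
START SLAVE;
SHOW SLAVE STATUS\G
```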
ELB -> HAProxy -> Percona Cluster, spread out over multiple Availability Zones in the AWS cloud.
When databases are created on the cluster, we make HAProxy config changes to route writes to one node in the cluster and reads to the remaining nodes.
Writes for database1 go to node1, reads for database1 go to nodes 2 and 3
Writes for database2 go to node2, reads for database2 go to nodes 1 and 3
Writes for database3 go to node3, reads for database3 go to nodes 1 and 2
If we had a database4:
Writes for database4 go to node1, reads for database4 go to nodes 2 and 3
…
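The spread above is a simple round-robin assignment; this illustrative Python sketch (node names are placeholders, not our routing code) reproduces the mapping:

```python
# Each database's writes go to one node, chosen round-robin;
# reads go to the remaining nodes in the cluster.

def route(db_index: int, nodes: list) -> tuple:
    """Return (write_node, read_nodes) for the db_index-th database (1-based)."""
    writer = nodes[(db_index - 1) % len(nodes)]
    readers = [n for n in nodes if n != writer]
    return writer, readers

nodes = ["node1", "node2", "node3"]
for i in range(1, 5):
    writer, readers = route(i, nodes)
    print(f"database{i}: writes -> {writer}, reads -> {readers}")
```

With three nodes, database4 wraps back around to node1, matching the table above.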
If cluster members start dropping out because of read or write load, we can add more cluster nodes and continue to spread those loads over more nodes, or add dedicated MySQL async read slaves. At some point the size of the cluster becomes a problem for Galera replication (research indicates ~9 nodes), and we will need to split the cluster into several clusters. The system and database configurations are in Salt, so creating another cluster is a fairly easy exercise. Query routing is handled by the ELB and HAProxy servers, so adding and removing cluster nodes, async slaves, or whole clusters will not be a problem: the application still uses the same endpoint, but that endpoint would be pointed to another cluster.
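The HAProxy side of this per-database routing might look like the following sketch (ports, IPs, and the health-check port are illustrative; the check assumes a clustercheck-style HTTP probe served by xinetd on port 9200):

```
listen database1-write
    bind *:3311
    mode tcp
    option httpchk
    server node1 10.0.1.11:3306 check port 9200
    server node2 10.0.2.12:3306 check port 9200 backup
    server node3 10.0.3.13:3306 check port 9200 backup

listen database1-read
    bind *:3312
    mode tcp
    balance leastconn
    option httpchk
    server node2 10.0.2.12:3306 check port 9200
    server node3 10.0.3.13:3306 check port 9200
```

Marking the non-writer nodes as `backup` in the write listener keeps writes on a single node while still giving failover targets if that node drops out.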
Synchronous Replication: virtually synchronous; the cluster uses a form of optimistic locking to apply transactions. Much faster than a two-phase commit process, more like a fire-and-forget operation.
-- helps scale out reads on the cluster
Multi-master: you can read and write from any node in the cluster. Writing to only one node is suggested: because of the optimistic locking model, conflicting writes from different nodes will be rolled back on the losing node.
-- Writing to a single node for each database avoids potential rollbacks and data conflicts, while writing to all nodes makes better use of the infrastructure.
Parallel Applying: unlike MySQL async replication, which is single-threaded, once transactions are applied on one node, multiple threads can apply those transactions on the other nodes. Whether this works depends on your implementation of referential integrity, namely the use (or misuse) of foreign keys. These rules can be relaxed, but that adds some risk of data inconsistency. -- increases the level of concurrency between writes and reads on individual nodes.
Data Consistency: each node has a full copy of the entire database, kept in sync via synchronous replication. If an inconsistency is detected, the node is removed from the cluster.
-- HAProxy, in concert with local xinetd scripts, can tell when a node is inconsistent or out of sync using Galera’s internal status variables.
Automatic node provisioning: very nice in a cloud environment; we are in AWS and have Auto Scaling groups set up for our databases. When a new node comes online, it joins the cluster and gets a full copy of the existing database from one of the other nodes, using XtraBackup. The cluster manages which nodes are considered in sync.
-- allows scaling to happen behind the scenes
Highly Available: unlike traditional master-master-with-slaves replication, any node in the cluster can accept writes and be considered the primary master database. So there is no failover-and-promotion of a slave to be the new master, and no need to perform a failover from one master to another. Either scenario requires downtime and involves the risk of breaking MySQL replication. Since the cluster is synchronous and multi-master, failover can happen at any time without the inherent risks of traditional MySQL replication topologies.
-- reliable crossover/switches between servers
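The xinetd health check mentioned above typically boils down to inspecting a few Galera status variables on each node, along these lines:

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_local_state';    -- 4 means Synced
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'; -- should be 'Primary'
SHOW GLOBAL STATUS LIKE 'wsrep_ready';          -- should be 'ON'
```

A node failing any of these is reported as down to HAProxy, which stops routing traffic to it.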