2. What is Replication for?
- High availability
  - If a node fails, another node can step in
  - Extra copies of data for recovery
- Scaling reads
  - Applications with high read rates can read from replicas
3. What Does Replication Look Like?
- Replica Set
  - A set of mongod servers
  - Minimum of 3
  - Can use “arbiters”
  - Consensus election of a “primary”
  - All writes go to primary
  - “Secondaries” replicate from primary
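As a minimal sketch, a three-member set with one arbiter could be described by a configuration document like the one below. The set name and host names are hypothetical. In the mongo shell the document would be passed to rs.initiate(); here it is built as a plain object so the example runs without a server.

```javascript
// Hypothetical set name and host names, for illustration only.
const rsConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" },
    // An arbiter votes in elections but stores no data.
    { _id: 2, host: "arb1.example.net:27017", arbiterOnly: true }
  ]
};

// In the mongo shell, this document would be passed to rs.initiate(rsConfig).
const dataBearing = rsConfig.members.filter((m) => !m.arbiterOnly);
const arbiters = rsConfig.members.filter((m) => m.arbiterOnly);
console.log(dataBearing.length + " data-bearing, " + arbiters.length + " arbiter");
```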
5. Managing a Replica Set
- --oplogSize <MB>
  - mongod option: choose the size of the replication log
- --fastsync
  - mongod option: skip the full resync when a new secondary was created from a recent copy of the data
- rs.add("hostname:<port>")
  - Shell helper: add a new member
- rs.remove("hostname:<port>")
  - Shell helper: remove a member
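The sketch below models what adding or removing a member means for the configuration document: the member list changes and the configuration version goes up. It is an illustration under that assumption, not the shell's actual source. The function names addMember and removeMember and the host names are hypothetical.

```javascript
// Illustrative model of the effect of rs.add() and rs.remove() on the
// configuration document. Not the mongo shell's actual implementation.
function addMember(config, host) {
  const ids = config.members.map((m) => m._id);
  const nextId = ids.length ? Math.max(...ids) + 1 : 0;
  return {
    ...config,
    version: config.version + 1, // a reconfiguration carries a higher version
    members: [...config.members, { _id: nextId, host: host }]
  };
}

function removeMember(config, host) {
  return {
    ...config,
    version: config.version + 1,
    members: config.members.filter((m) => m.host !== host)
  };
}

// Hypothetical starting configuration.
const before = {
  _id: "rs0",
  version: 1,
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" }
  ]
};
const grown = addMember(before, "db3.example.net:27017");
const shrunk = removeMember(grown, "db2.example.net:27017");
console.log(JSON.stringify(shrunk.members));
```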
6. Some Administrative Commands
- replSetGetStatus
  - Reports the status of the replica set from one node’s point of view
- replSetStepDown
  - Asks the primary to step down
- replSetFreeze
  - Prevents the member it is sent to from seeking election as primary for a given number of seconds, so primary/secondary status stays as it is
  - Use during backups
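For reference, here is a sketch of the command documents behind these three commands. In the mongo shell each would be run against the admin database, for example with db.adminCommand(). The durations are arbitrary example values.

```javascript
// Command documents for the three administrative commands.
// The numeric durations are arbitrary example values.
const adminCommands = {
  status: { replSetGetStatus: 1 },
  stepDown: { replSetStepDown: 60 }, // seconds before this member may seek election again
  freeze: { replSetFreeze: 120 }     // seconds during which this member will not seek election
};

for (const name of Object.keys(adminCommands)) {
  console.log(name + ": " + JSON.stringify(adminCommands[name]));
}
```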
7. How Does it Work?
- Change operations are written to the oplog
- The oplog is a capped collection
  - Must have enough space to allow new secondaries to catch up after copying from a primary
  - Must have enough space to cope with any applicable slaveDelay
- Secondaries query the primary’s oplog and apply what they find
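The two sizing constraints can be turned into back-of-the-envelope arithmetic. The sketch below is a rule of thumb under stated assumptions, not an official formula: it assumes the oplog grows at a roughly constant, measured rate. The function names and example figures are hypothetical.

```javascript
// Rule-of-thumb oplog sizing. Assumes a steady oplog growth rate,
// which real workloads only approximate.

// How many hours of operations an oplog of the given size can hold.
function oplogWindowHours(oplogSizeMB, oplogMBPerHour) {
  return oplogSizeMB / oplogMBPerHour;
}

// Smallest oplog that still covers the time a new secondary spends
// copying data plus any configured slaveDelay.
function minOplogSizeMB(oplogMBPerHour, initialCopyHours, slaveDelayHours) {
  return oplogMBPerHour * (initialCopyHours + slaveDelayHours);
}

// Example figures: 50 MB of oplog per hour, 4 hours to copy, 1 hour of delay.
const neededMB = minOplogSizeMB(50, 4, 1);
const windowHours = oplogWindowHours(1000, 50);
console.log("need at least " + neededMB + " MB; 1000 MB covers " + windowHours + " hours");
```

In practice a safety margin on top of the minimum is sensible, since write rates vary.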
8. Failover
- Replica set members monitor each other via heartbeats
- If the primary can’t be reached, a new one is elected
  - The secondary with the most up-to-date oplog is chosen
- If, after the election, a secondary has changes that are not on the new primary, those changes are undone and moved aside (saved to a BSON file)
- If you require a guarantee that a write survives failover, ensure the data is written to a majority of the replica set
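A simplified model of the two ideas on this slide: what counts as a majority, and picking the reachable member with the most recent oplog entry. The field names reachable and optime and the function names are hypothetical, and the real election protocol considers more than this.

```javascript
// Number of members that constitutes a majority of the set.
function majority(memberCount) {
  return Math.floor(memberCount / 2) + 1;
}

// True when a write has been acknowledged by a majority of members.
function isMajorityWrite(ackCount, memberCount) {
  return ackCount >= majority(memberCount);
}

// Simplified candidate choice: among reachable members, take the one
// whose last applied operation is the most recent.
function mostUpToDate(members) {
  const reachable = members.filter((m) => m.reachable);
  if (reachable.length === 0) return null;
  return reachable.reduce((best, m) => (m.optime > best.optime ? m : best));
}

// Hypothetical state after the primary (db1) becomes unreachable.
const members = [
  { host: "db1.example.net:27017", reachable: false, optime: 105 },
  { host: "db2.example.net:27017", reachable: true, optime: 103 },
  { host: "db3.example.net:27017", reachable: true, optime: 101 }
];
const candidate = mostUpToDate(members);
console.log("candidate: " + candidate.host);
```

In this example, operations 104 and 105 exist only on the old primary. When it rejoins, they are the changes that get undone and moved aside, unless they had first been written to a majority.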
9. Priority
- Optional parameter to replica set member configuration
- All other things being equal, the highest priority member wins the election for primary
  - Changes in secondaries’ relative lag, i.e., catching up to primary, can trigger an election
- Zero priority: can never become primary
  - Use for remote DR, delayed slaves, backups, analytics sources
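A sketch of member configuration using priority, including a zero-priority delayed member for disaster recovery. Host names and the specific values are hypothetical. The helper electableHosts is illustrative and assumes that a member with no explicit priority defaults to 1.

```javascript
// Hypothetical configuration: a preferred primary, a normal member,
// and a delayed DR member that can never become primary.
const prioritized = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.net:27017", priority: 2 },
    { _id: 1, host: "db2.example.net:27017" }, // priority assumed to default to 1
    { _id: 2, host: "dr1.example.net:27017", priority: 0, slaveDelay: 3600 }
  ]
};

// Members that are allowed to become primary (priority greater than zero).
function electableHosts(config) {
  return config.members
    .filter((m) => (m.priority === undefined ? 1 : m.priority) > 0)
    .map((m) => m.host);
}

console.log(electableHosts(prioritized).join(", "));
```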
10. For Applications
- getLastError( { w : … } )
  - Application waits until changes are written to the specified number of servers
  - Defaults can be set in the replica set’s configuration
  - “Safe mode” for critical writes
- setWriteConcern()
  - Another way to force writes to a number of servers
- Drivers support “slaveOk” for sending queries to a secondary
  - This is for scaling reads
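A sketch of the getLastError command document and of set-wide defaults. The helper name getLastErrorCommand and the values (2 servers, 5000 ms timeout) are example choices, not recommendations. In the mongo shell the command document would be passed to db.runCommand() after a write.

```javascript
// Build a getLastError command that waits for `w` servers, giving up
// after `wtimeoutMs` milliseconds.
function getLastErrorCommand(w, wtimeoutMs) {
  return { getLastError: 1, w: w, wtimeout: wtimeoutMs };
}

const gle = getLastErrorCommand(2, 5000);
console.log(JSON.stringify(gle));

// Defaults for the whole set go under settings.getLastErrorDefaults
// in the replica set configuration document.
const settings = {
  getLastErrorDefaults: { w: 2, wtimeout: 5000 }
};
console.log(JSON.stringify(settings));
```

Setting a timeout avoids blocking indefinitely when fewer than the requested number of servers are available.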
11. Replication and Sharding
- Each shard needs its own replica set
- Drivers use a mongos process to route queries to the appropriate shard(s)
- Configuration servers maintain the shard key range metadata
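A toy model of range-based routing, to illustrate how shard key range metadata lets a router pick the shard or shards for a query. The chunk table, shard names, and function names are all hypothetical, and the real mongos and config server logic is considerably more involved.

```javascript
// Hypothetical chunk table: each half-open range [min, max) of the
// shard key maps to a shard, and each shard is its own replica set.
const chunks = [
  { min: -Infinity, max: 100, shard: "rsA" },
  { min: 100, max: 200, shard: "rsB" },
  { min: 200, max: Infinity, shard: "rsC" }
];

// Shard that owns a single shard key value.
function shardFor(key) {
  return chunks.find((c) => key >= c.min && key < c.max).shard;
}

// Shards whose chunks overlap the key range [lo, hi).
function shardsForRange(lo, hi) {
  const hits = chunks.filter((c) => c.min < hi && lo < c.max).map((c) => c.shard);
  return Array.from(new Set(hits));
}

console.log(shardFor(150));
console.log(shardsForRange(50, 150).join(", "));
```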
13. Data Center Awareness
- Tag nodes in replica set configuration
  - Apply hierarchical labels to replica set members
- Define getLastError modes
  - Require a number of nodes that writes must go to
  - Require locations of nodes that writes must go to
  - Combinations
- Available in 1.9.1
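A sketch of tagged members and a custom getLastError mode. The tag name dc, the mode name multiDC, and the host names are hypothetical. The mode { dc: 2 } is read here as "acknowledged by members covering at least two distinct values of the dc tag". The helper satisfiesMode is an illustration of that rule, not server code.

```javascript
// Hypothetical tagged configuration with one custom write mode.
const taggedConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "ny1.example.net:27017", tags: { dc: "ny" } },
    { _id: 1, host: "ny2.example.net:27017", tags: { dc: "ny" } },
    { _id: 2, host: "sf1.example.net:27017", tags: { dc: "sf" } }
  ],
  settings: {
    getLastErrorModes: { multiDC: { dc: 2 } }
  }
};

// True when the acknowledging hosts cover enough distinct values of
// every tag the mode names.
function satisfiesMode(config, modeName, ackedHosts) {
  const mode = config.settings.getLastErrorModes[modeName];
  return Object.keys(mode).every((tag) => {
    const values = new Set(
      config.members
        .filter((m) => ackedHosts.includes(m.host) && m.tags && tag in m.tags)
        .map((m) => m.tags[tag])
    );
    return values.size >= mode[tag];
  });
}

const sameDC = satisfiesMode(taggedConfig, "multiDC",
  ["ny1.example.net:27017", "ny2.example.net:27017"]);
const twoDCs = satisfiesMode(taggedConfig, "multiDC",
  ["ny1.example.net:27017", "sf1.example.net:27017"]);
console.log("same DC: " + sameDC + ", two DCs: " + twoDCs);
```

An application would request the mode by name, passing w: "multiDC" to getLastError.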