2. Notes to the presenter
Themes for this presentation:
• Balance between cost and redundancy.
• Cover the many scenarios that replication solves, and why.
• Secret sauce of avoiding downtime & data loss.
• If time is short, skip the 'Behind the curtain' section.
• If time is very short, skip the Operational Considerations section as well.
4. Why Replication?
• How many have faced node failures?
• How many have been woken up to perform a failover?
• How many have experienced issues due to network latency?
• Different uses for data
  – Normal processing
  – Simple analytics
21. Delayed Consistency (diagram): the client application's driver sends writes to the primary and can read from the primary or the secondaries, which replicate with a slight delay.
22. Write Concern
• Network acknowledgement
• Wait for error
• Wait for journal sync
• Wait for replication (shell examples below)
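As a rough sketch of how the last three levels look from the mongo shell of this era (collection name and values are illustrative), the write concern is chosen through the options passed to getLastError:

  // illustrative collection and document
  db.orders.insert({ item: "abc", qty: 1 })

  // acknowledged: just wait for an error report from the primary
  db.runCommand({ getLastError: 1 })

  // wait for the write to reach the on-disk journal
  db.runCommand({ getLastError: 1, j: true })

  // wait until the write has replicated to at least one secondary
  db.runCommand({ getLastError: 1, w: 2, wtimeout: 5000 })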
23. Unacknowledged (diagram): the driver sends the write to the primary, which applies it in memory; no acknowledgement is requested.
24. MongoDB Acknowledged / wait for error (diagram): the driver sends the write followed by getLastError; the primary applies the write in memory and reports back.
25. Wait for Journal Sync (diagram): the driver sends the write followed by getLastError with j:true; the primary applies the write in memory and writes it to the journal before acknowledging.
26. Wait for Replication (diagram): the driver sends the write followed by getLastError with w:2; the primary applies the write in memory and replicates it to a secondary before acknowledging.
27. Tagging
• Control where data is written to, and read from
• Each member can have one or more tags
  – tags: {dc: "ny"}
  – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}
• Replica set config defines rules for write concerns (sketched below)
• Rules can change without changing app code
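A minimal sketch of how this could look in the mongo shell; the member indexes, tag values, and the allDCs mode name are illustrative:

  cfg = rs.conf()

  // tag each member with its location
  cfg.members[0].tags = { dc: "sf", rack: "row3rk7" }
  cfg.members[1].tags = { dc: "ny" }
  cfg.members[2].tags = { dc: "cloud" }

  // custom write-concern mode: acknowledge only after members in
  // three distinct "dc" tag values have the write
  cfg.settings = { getLastErrorModes: { allDCs: { dc: 3 } } }

  rs.reconfig(cfg)

  // later, from the application or the shell:
  db.runCommand({ getLastError: 1, w: "allDCs" })

Because the rule lives in the replica set configuration, the application keeps using the mode name while operations can re-map it to different members at any time.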
29. Wait for Replication with Tagging (diagram): the driver sends the write followed by getLastError with w:allDCs; the primary (SF) applies the write in memory and replicates it to the secondaries in NY and in the cloud before acknowledging.
30. Read Preference Modes
• 5 modes
  – primary (only) - default
  – primaryPreferred
  – secondary
  – secondaryPreferred
  – nearest
• In every mode except primary, when more than one node is eligible the closest node is used for reads (example below)
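A quick sketch of selecting a read preference from the mongo shell (the collection name is illustrative); drivers expose equivalent options:

  // set the default read preference for this connection
  db.getMongo().setReadPref("secondaryPreferred")

  // or choose a mode for a single query
  db.users.find({ active: true }).readPref("nearest")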
31. Tagged Read Preference
• Custom read preferences
• Control where you read from by (node) tags
  – E.g. { "disk": "ssd", "use": "reporting" }
• Use in conjunction with standard read preferences (example below)
  – Except primary
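A sketch of a tag-aware read from the shell, assuming some secondaries carry the tags shown above (the collection name is illustrative):

  // prefer a reporting node on SSDs; the empty tag set falls back to any secondary
  db.events.find().readPref("secondary", [
    { disk: "ssd", use: "reporting" },
    {}
  ])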
33. Maintenance and Upgrade
• No downtime
• Rolling upgrade/maintenance (outline below)
  – Start with the secondaries
  – Primary last
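A rough outline of one rolling cycle from the mongo shell; hostnames, order, and timings are up to you:

  // for each secondary in turn: take it down, upgrade or maintain it,
  // restart it, then wait until rs.status() shows it back in SECONDARY state
  rs.status()

  // once all secondaries are done, ask the primary to step down
  // (it will not seek re-election for 60 seconds) and upgrade it last
  rs.stepDown(60)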
34. Replica Set – 1 Data Center
• Single data center, single switch & power
• Diagram: Members 1, 2 and 3 in one datacenter
• Points of failure:
  – Power
  – Network
  – Data center
  – Two-node failure
• Automatic recovery of single-node crash
35. Replica Set – 2 Data Centers
• Multi data center
• DR node for safety
• Can't do a multi-data-center durable write safely, since only 1 node is in the distant DC
• Diagram: Members 1 and 2 in Datacenter 1; Member 3 in Datacenter 2
36. Replica Set – 3 Data Centers
• Three data centers
• Can survive full data center loss
• Can do w = { dc : 2 } (with tags) to guarantee a write reaches 2 data centers (see the sketch below)
• Diagram: Members 1 and 2 in Datacenter 1; Members 3 and 4 in Datacenter 2; Member 5 in Datacenter 3
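A short sketch of what that guarantee looks like in practice, building on the tag configuration shown earlier; the twoDCs mode name is illustrative:

  // custom mode: the write must be acknowledged by members in 2 distinct "dc" tag values
  cfg.settings = { getLastErrorModes: { twoDCs: { dc: 2 } } }
  rs.reconfig(cfg)

  // use the mode as the write concern
  db.runCommand({ getLastError: 1, w: "twoDCs" })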
38. Implementation details
• Heartbeat every 2 seconds
  – Times out in 10 seconds
• Local DB (not replicated)
  – system.replset
  – oplog.rs
• The oplog is a capped collection
• An idempotent version of each operation is stored (shell commands below)
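To poke at these pieces from the mongo shell (read-only; output shapes vary by version):

  use local
  db.system.replset.findOne()                          // the replica set configuration
  db.oplog.rs.find().sort({ $natural: -1 }).limit(1)   // most recent oplog entry
  db.printReplicationInfo()                            // oplog size and time window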
41. Recent improvements
• Read preference support with sharding
  – Drivers too
• Improved replication over WAN/high-latency networks
• rs.syncFrom command
• buildIndexes setting
• replIndexPrefetch setting (examples below)
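Two hedged shell examples of the new knobs; the hostnames are illustrative, and buildIndexes can only be set when a member is added (it requires priority 0):

  // hint which member this node should replicate from
  rs.syncFrom("mongodb3.example.net:27017")

  // add a hidden backup member that never builds secondary indexes
  rs.add({ _id: 5, host: "backup.example.net:27017",
           priority: 0, hidden: true, buildIndexes: false })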
42. Just Use It
• Use replica sets
• Easy to set up
  – Try on a single machine (see the sketch below)
• Check the docs page for replica set tutorials
  – http://docs.mongodb.org/manual/replication/#tutorials
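A minimal single-machine sketch: assuming three mongod processes are already running locally with --replSet demo on ports 27017-27019 (the set name and ports are illustrative), initialize the set from the shell:

  rs.initiate({
    _id: "demo",
    members: [
      { _id: 0, host: "localhost:27017" },
      { _id: 1, host: "localhost:27018" },
      { _id: 2, host: "localhost:27019" }
    ]
  })
  rs.status()   // confirm one PRIMARY and two SECONDARY members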