One of the strongest arguments for using a NoSQL database is its focus on distribution, both for replication and sharding. This talk takes a short look at what replication is, why you should use it, and what is so difficult about it. We then take a look at MongoDB’s implementation in general and finally focus on what can go wrong. In a practical demo you will see how to find the right balance between performance and data safety and how to use replication in your Java application.
16. Configure replication
Start on the old instance, otherwise its data is lost
rs.initiate()
rs.status()
rs.add("PK-MBP:27002")
rs.add("PK-MBP:27003")
rs.status()
db.isMaster()
db.test.find()
db.test.insert({ name: "Peter", city: "Steyr" })
db.test.find()
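The rs.add() calls above build up the replica set configuration one member at a time. As a rough sketch, the same configuration document can be assembled in plain JavaScript and handed to rs.initiate() in a single step. The set name "demo" and the address PK-MBP:27001 for the old instance are assumptions; only PK-MBP:27002 and PK-MBP:27003 appear in the demo, and buildReplicaSetConfig is a hypothetical helper.

```javascript
// Build a replica set configuration document of the shape rs.initiate() accepts.
// The set name and the first host are assumptions made for illustration.
function buildReplicaSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map(function (host, index) {
      return { _id: index, host: host };
    })
  };
}

var config = buildReplicaSetConfig("demo", [
  "PK-MBP:27001", // assumed address of the old instance
  "PK-MBP:27002",
  "PK-MBP:27003"
]);

// In the mongo shell, connected to the old instance: rs.initiate(config)
```

The _id of the document has to match the replica set name the mongod processes were started with.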
17. Read from secondaries
$ mongo --port 27002
> db.test.find()
> rs.slaveOk()
> db.test.find()
> db.test.insert({ name: "Dieter", city: "Graz" })
slaveOk is only valid for the current connection
18. Failover
Kill primary with [Ctrl]+[C]
Write to new primary
> rs.status()
> db.test.insert({ name: "Dieter", city: "Graz" })
> db.test.find()
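Failover works in the three-member demo because the two surviving members still form a majority of the set and can elect a new primary. A small sketch of that arithmetic, with hypothetical function names:

```javascript
// Smallest number of voting members that counts as a majority.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// A primary can only be elected while a majority of the voting members
// can reach each other.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

console.log(canElectPrimary(3, 2)); // true: one of three members was killed
console.log(canElectPrimary(3, 1)); // false: a lone survivor stays secondary
```

This is also why an even number of members adds no extra fault tolerance: four members need three votes, so they survive only one failure, just like three members.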
32. CAP
Select Availability or Consistency
Partition-tolerance is a prerequisite
for distributed systems
"The network is reliable":
http://aphyr.com/posts/288-the-network-is-reliable
33. Rollback
Old primary rolls back unreplicated
changes once it rejoins the replica set
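As a simplified model of this rollback, each member's oplog can be treated as an ordered list of operation ids. This is only a sketch and not MongoDB's actual algorithm: the rejoining member keeps the prefix it shares with the new primary and sets aside everything after it. The function and field names are hypothetical.

```javascript
// Simplified model: an oplog is an array of operation ids in apply order.
// Returns the entries the old primary keeps and the ones it must roll back.
function rollback(oldPrimaryOplog, newPrimaryOplog) {
  var common = 0;
  while (
    common < oldPrimaryOplog.length &&
    common < newPrimaryOplog.length &&
    oldPrimaryOplog[common] === newPrimaryOplog[common]
  ) {
    common++;
  }
  return {
    kept: oldPrimaryOplog.slice(0, common),
    rolledBack: oldPrimaryOplog.slice(common)
  };
}

// Ops 1 and 2 were replicated; op 3 existed only on the old primary when it
// went down. The new primary accepted op 4 in the meantime.
var result = rollback([1, 2, 3], [1, 2, 4]);
console.log(result.kept);       // [ 1, 2 ]
console.log(result.rolledBack); // [ 3 ]
```

In a real deployment the rolled-back documents are not silently discarded: MongoDB writes them to files in a rollback directory under the data path so they can be inspected and reapplied by hand.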
38. Multiple primaries
Unlikely but possible
Bugs: https://jira.mongodb.org/browse/SERVER-9765
Test script with no replies: https://groups.google.com/forum/#!topic/mongodb-dev/-mH6BOYyzeI
50. Write concern performance
https://blog.serverdensity.com/mongodb-on-google-compute-engine-tips-and-benchmarks/
3 x 1,000 inserts on GCE
Local 10GB system disk
Dedicated 200GB disk
Dedicated 200GB for data and journal
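The trade-off behind such benchmarks can be sketched with a toy model: a write with w: N is acknowledged once N members, the primary included, have applied it, so its latency is that of the N-th fastest member. The latencies below are invented illustration values, not results from the linked benchmark, and ackLatencyMs is a hypothetical helper.

```javascript
// Toy model: time until a write is acknowledged under write concern w.
// applyTimesMs holds, per member, the time at which the write has been
// applied there (primary included). Unacknowledged writes (w: 0) are not modeled.
function ackLatencyMs(applyTimesMs, w) {
  if (w < 1 || w > applyTimesMs.length) {
    throw new RangeError("w must be between 1 and the number of members");
  }
  var sorted = applyTimesMs.slice().sort(function (a, b) { return a - b; });
  return sorted[w - 1];
}

// Invented values: primary, fast secondary, slow secondary.
var applyTimesMs = [1, 8, 30];

console.log(ackLatencyMs(applyTimesMs, 1)); // 1: primary only
console.log(ackLatencyMs(applyTimesMs, 2)); // 8: majority of three
console.log(ackLatencyMs(applyTimesMs, 3)); // 30: all members
```

Higher values of w wait for slower members, which buys data safety at the cost of write latency.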