How to Keep Your Data Safe in MongoDB

CTO, 10gen
Eliot Horowitz
#MongoDBDays

What can go wrong?
• Network breaks in transit
• Server crashes while processing
• Server blows up after processing a write but before replication
• Server processes a write, crashes, and then a conflicting write happens elsewhere
• All copies burn in a fire
• 20 years later, no one remembers how to read it

Version 1
Probability that a given write or piece of data is accessible given human intervention and infinite time.

Version 2
Probability that a given write or piece of data is visible in a query.

Single Server – How a Write Works
• Client sends a write operation to server
• Received by server’s TCP stack
• MongoDB process queues write
• Write happens in memory
• Depending on what Write Concern asks for
– Respond immediately
– Wait for data to be journaled, then respond
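The write path above can be illustrated with a toy model. This is a sketch only, not MongoDB's implementation: `ToyServer`, its `write` method, and the `j` flag are hypothetical names chosen for illustration. A write is applied in memory first; the server either responds immediately or flushes the journal before responding.

```python
class ToyServer:
    """Toy model of one server: in-memory data plus an on-disk journal."""

    def __init__(self):
        self.memory = {}    # lost on crash
        self.pending = []   # applied in memory, not yet journaled
        self.journal = []   # survives a crash

    def write(self, key, value, j=False):
        # The write always happens in memory first.
        self.memory[key] = value
        self.pending.append((key, value))
        if j:
            # Assumed "journaled" write concern: flush before responding.
            self.flush_journal()
        return {"ok": 1, "journaled": j}

    def flush_journal(self):
        self.journal.extend(self.pending)
        self.pending.clear()

    def crash_and_recover(self):
        # Memory and unjournaled writes are gone; replay the journal.
        self.memory = {}
        self.pending = []
        for key, value in self.journal:
            self.memory[key] = value


server = ToyServer()
server.write("a", 1, j=True)    # acknowledged after journaling
server.write("b", 2, j=False)   # acknowledged immediately
server.crash_and_recover()
print(sorted(server.memory))    # ['a']
```

In this model the write acknowledged without journaling is the one that disappears after the crash.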

Single Server – What can go wrong
• Network can go down once message hits other side
– Client doesn’t know what happened without going back and checking
• Write could fail for a logical reason (unique key exception)
• Server could crash before the write is journaled
– Write is lost
• Server could crash after the write is journaled
– When server is recovered, write is replayed and safe
• Hard drive can crash irrecoverably
• Data center could lose power for a long period of time
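One way a client can handle the "network went down after the message arrived" case is to choose the document id itself, then go back and check before retrying. The sketch below is a stdlib-only toy: `FlakyServer`, `LostReply`, and `safe_insert` are hypothetical names, not part of any MongoDB driver.

```python
import uuid


class LostReply(Exception):
    """Raised when the connection drops before the reply arrives."""


class FlakyServer:
    """Toy server that can apply a write and then lose the reply."""

    def __init__(self, drop_replies=0):
        self.docs = {}
        self.drop_replies = drop_replies

    def insert(self, doc):
        if doc["_id"] in self.docs:
            # Logical failure: unique key violation.
            raise KeyError("duplicate key: %s" % doc["_id"])
        self.docs[doc["_id"]] = doc
        if self.drop_replies > 0:
            self.drop_replies -= 1
            raise LostReply()   # write was applied, client never hears back
        return {"ok": 1}

    def find_one(self, _id):
        return self.docs.get(_id)


def safe_insert(server, doc):
    """Insert once, checking the server state if the reply was lost."""
    doc = dict(doc)
    doc.setdefault("_id", uuid.uuid4().hex)   # client-chosen id
    try:
        server.insert(doc)
    except LostReply:
        # Go back and check; retry only if the write never landed.
        if server.find_one(doc["_id"]) is None:
            server.insert(doc)
    return doc["_id"]


server = FlakyServer(drop_replies=1)
doc_id = safe_insert(server, {"title": "hello"})
print(len(server.docs))   # 1
```

This sketch does not handle a second lost reply on the retry; a real client would loop with a bounded number of attempts.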

Any single server will fail
Replica Sets

Replica Set - Reminders
• N nodes
• Each node has a full copy of the data
• Replication is asynchronous

Replica Set - Acknowledgements
• “w” : how many servers must apply a write before it is acknowledged
• w=2 : do not acknowledge until write is on two servers
– If primary fails, election guarantees new primary has all writes acknowledged with w=2
• w=majority : do not acknowledge until write is on a majority of nodes in the replica set
– If any primary is elected automatically, all writes acknowledged with w=majority will be on that primary
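The acknowledgement rules above can be sketched with a toy replica set. This is a simplified model under stated assumptions, not MongoDB's election protocol: `ToyReplicaSet` is a hypothetical class, replication is modeled by the caller naming which secondaries receive the write, and the "election" simply picks the surviving node with the longest log when a majority of nodes is still up.

```python
class ToyReplicaSet:
    """Toy replica set: one log per node, node 0 starts as primary."""

    def __init__(self, n):
        self.logs = [[] for _ in range(n)]
        self.alive = [True] * n
        self.primary = 0

    def majority(self):
        return len(self.logs) // 2 + 1

    def write(self, entry, w=1, replicate_to=()):
        """Apply on the primary, copy to the given secondaries,
        and report whether at least w nodes hold the write."""
        needed = self.majority() if w == "majority" else w
        self.logs[self.primary].append(entry)
        copies = 1
        for node in replicate_to:
            if self.alive[node] and node != self.primary:
                self.logs[node].append(entry)
                copies += 1
        return copies >= needed

    def fail_primary(self):
        """Kill the primary and elect the most up-to-date survivor."""
        self.alive[self.primary] = False
        survivors = [i for i, up in enumerate(self.alive) if up]
        if len(survivors) < self.majority():
            self.primary = None   # no majority, no primary
            return None
        self.primary = max(survivors, key=lambda i: len(self.logs[i]))
        return self.primary


rs = ToyReplicaSet(3)
acked_majority = rs.write("password-change", w="majority", replicate_to=[1])
acked_w1 = rs.write("login-count", w=1)   # primary only
new_primary = rs.fail_primary()
print(new_primary)                                  # 1
print("password-change" in rs.logs[new_primary])    # True
print("login-count" in rs.logs[new_primary])        # False
```

Both writes were acknowledged, but only the one acknowledged at majority is present on the new primary.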

Good, but not enough…
What if I lose an entire
data center?

Replica Set - tags
• A node can have a set of tags
– region=us-east
– color=blue
• Operator configures write level
– Critical – has to be in 3 regions
– Important – has to be in 2 regions
• w=critical
– Do not acknowledge write until it’s in 3 data centers
– Losing an entire data center causes no data loss
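A tag-based write level can be read as "the write must be on nodes covering N distinct values of a tag". In MongoDB these operator-defined levels live under `settings.getLastErrorModes` in the replica set configuration. The check below is a minimal stdlib sketch of that rule; the function `satisfies`, the node names, and the regions other than us-east are assumptions for illustration.

```python
# Hypothetical node-to-tags assignment.
NODE_TAGS = {
    "a": {"region": "us-east"},
    "b": {"region": "us-east"},
    "c": {"region": "us-west"},
    "d": {"region": "eu-west"},
}

# Operator-defined write levels: tag name -> distinct values required.
MODES = {
    "critical": {"region": 3},
    "important": {"region": 2},
}


def satisfies(mode, acked_nodes, node_tags=NODE_TAGS):
    """True if the acknowledging nodes cover enough distinct tag values."""
    for tag, required in mode.items():
        values = {node_tags[n][tag] for n in acked_nodes if tag in node_tags[n]}
        if len(values) < required:
            return False
    return True


print(satisfies(MODES["important"], ["a", "c"]))      # True
print(satisfies(MODES["critical"], ["a", "b", "c"]))  # False
print(satisfies(MODES["critical"], ["a", "c", "d"]))  # True
```

Three acknowledging nodes are not enough for the critical level when two of them share a region, which is the point of counting distinct tag values instead of nodes.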

What about sharding?
• Same rules apply
• Given a series of writes, they may go to different shards
– A w=majority at the end means all writes on that socket are acknowledged by a majority of the relevant replica set
• Config servers have no impact on fault tolerance/durability, only on admin uptime (or real uptime in a disaster)

Personal Blog
• Single server
• No replication
• Hourly backups
• If server crashes
– Down until back up
– All acknowledged writes are safe
• If server is destroyed
– Have to recover from backup
– Lose up to 1 hour of writes

Departmental App
• Single replica set
• 3 nodes in a single server
• If any single node goes down
– System is still readable/writeable
– Writes done with w=2 are safe
• If 2 nodes go down at the same time
– Only writes with w=3 are safe (bad idea)
– No primary, last node is read-only

Core User Database
• Single replica set
• 3 data centers
– Primary data center: 3 nodes (p=2)
– 2 alternates with 2 nodes each (p=1)
• Different types of operations
– Password change (w=majority)
– Adds a “like” (w=2)
– Login count (w=1)
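The per-operation choice above could be kept in one place in application code. A minimal sketch follows, assuming hypothetical operation names and a hypothetical application default of w=2; none of these identifiers come from a MongoDB driver.

```python
# Assumed application-wide default, used when an operation is not listed.
DEFAULT_W = 2

# Write level per operation type, taken from the list above.
WRITE_LEVELS = {
    "password_change": "majority",
    "like": 2,
    "login_count": 1,
}


def write_level(operation):
    """Return the w value to request for a given operation type."""
    return WRITE_LEVELS.get(operation, DEFAULT_W)


print(write_level("password_change"))  # majority
print(write_level("login_count"))      # 1
print(write_level("profile_update"))   # 2
```

With PyMongo, the returned value could be passed as `WriteConcern(w=...)` through `collection.with_options(write_concern=...)`; that wiring is left out here so the sketch runs without a server.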

Core User Database – cont’d
• Lose any single server
– Can only lose a login count
• Lose any 2 servers
– Could lose a “like” if you are unlucky
• Lose a data center
– Still have a majority
– All password changes are safe
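The failure cases above follow from simple counting: a write acknowledged on k nodes still has at least one copy after any loss of fewer than k nodes. The sketch below works through the 3 + 2 + 2 layout; the data center names and helper functions are hypothetical, and "has a copy left" is a simplification of "safe".

```python
# Node counts per data center, from the layout described above.
DATA_CENTERS = {"primary": 3, "alt1": 2, "alt2": 2}

TOTAL = sum(DATA_CENTERS.values())   # 7 nodes
MAJORITY = TOTAL // 2 + 1            # 4 nodes


def copies_required(w):
    """Number of nodes that must hold a write before it is acknowledged."""
    return MAJORITY if w == "majority" else w


def copy_survives(w, lost_nodes):
    """True if any loss of `lost_nodes` nodes leaves at least one copy."""
    return copies_required(w) > lost_nodes


def has_majority_after_losing(data_center):
    """True if the remaining nodes can still form a majority."""
    return TOTAL - DATA_CENTERS[data_center] >= MAJORITY


print(copy_survives(1, 1))                    # False: a login count can be lost
print(copy_survives(2, 1))                    # True: a "like" survives one loss
print(copy_survives(2, 2))                    # False: two losses can take a "like"
print(copy_survives("majority", 3))           # True: password changes survive
print(has_majority_after_losing("primary"))   # True: 4 of 7 nodes remain
```

Losing the largest data center removes 3 nodes, which is still fewer than the 4 copies a majority write requires.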

Choice is a double-edged sword

When to give a choice?
• Give choice over semantics
– Developers and Operators know their needs
• Tuning parameters are dangerous
– System should be smart enough to avoid thousands of
knobs
• Defaults should be
– Intuitive and sensible
– Changing is hard
– Always changing a little

Already have them in different architecture components
• Caching
• Worker queues
• Asynchronous replication
• Synchronous replication
• Two-phase commit

MongoDB gives you the choice of durability semantics from many systems in one.
• Control per write
• One source of truth in architecture

What should you do?
• Pick a default write level for your app
• Only deviate with good reason
• Test disaster scenarios so you know what’s going to happen

Thank You
