2. • Real-time retail intelligence
• Gather products and prices from web
• MongoDB in production
• Millions of updates per day, 3K/s peak
• Data in SQL, Mongo, ElasticSearch
3. Concurrency Patterns: Why?
• MongoDB: atomic updates, no transactions
• Need to ensure consistency & correctness
• What are my options with Mongo?
• Shortcuts
• Different approaches
4. Concurrency Control Strategies
• Pessimistic
• Suited for frequent conflicts
• http://bit.ly/two-phase-commits
• Optimistic
• Efficient when conflicts are rare
• http://bit.ly/isolate-sequence
• Multi-version
• All versions stored, client resolves conflict
• e.g. CouchDB
5. Optimistic Concurrency Control (OCC)
• No locks
• Prevent dirty writes
• Uses timestamp or a revision number
• Client checks & replays transaction
6. Example
Original
{ _id: 23, comment: "The quick brown fox…" }
Edit 1
{ _id: 23,
comment: "The quick brown fox prefers SQL" }
Edit 2
{ _id: 23,
comment: "The quick brown fox prefers MongoDB" }
7. Example
Edit 1
db.comments.update(
{ _id: 23 },
{ _id: 23,
comment: "The quick brown fox prefers SQL" })
Edit 2
db.comments.update(
{ _id: 23 },
{ _id: 23,
comment: "The quick brown fox prefers MongoDB" })
Outcome: one update is lost, and the other might be wrong
8. OCC Example
Original
{ _id: 23, rev: 1,
comment: "The quick brown fox…" }
Update a specific revision (edit 1)
db.comments.update(
{ _id: 23, rev: 1 },
{ _id: 23, rev: 2,
comment: "The quick brown fox prefers SQL" })
9. OCC Example
Edit 2
db.comments.update(
{ _id: 23, rev: 1 },
{ _id: 23, rev: 2,
comment: "The quick brown fox prefers MongoDB" })
...fails
{ updatedExisting: false, n: 0,
err: null, ok: 1 }
• Caveat: Only works if all clients follow convention
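The check-and-replay cycle can be sketched without a database. This is a minimal in-memory simulation, not the talk's implementation: `store`, `updateIfRev`, and `editWithRetry` are hypothetical names, and the competing SQL edit is injected by hand to force one conflict. Against MongoDB, the conditional write would be an update whose query includes the `rev` field, as on the slides above.

```javascript
// In-memory stand-in for the comments collection (illustration only).
const store = new Map([[23, { _id: 23, rev: 1, comment: "The quick brown fox" }]]);

// Conditional write: succeeds only if the stored revision still matches.
function updateIfRev(id, expectedRev, changes) {
  const current = store.get(id);
  if (!current || current.rev !== expectedRev) return false; // conflict
  store.set(id, { ...current, ...changes, rev: expectedRev + 1 });
  return true;
}

// Client side: read, compute the change, write; replay on conflict.
function editWithRetry(id, edit, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = store.get(id); // fresh read on every attempt
    if (!doc) throw new Error("document not found");
    if (updateIfRev(id, doc.rev, edit(doc))) return attempt;
  }
  throw new Error("too many conflicts");
}

// Simulate another client winning the race once, between our read and write.
let interfered = false;
const attempts = editWithRetry(23, (doc) => {
  if (!interfered) {
    interfered = true;
    updateIfRev(23, doc.rev, { comment: "The quick brown fox prefers SQL" });
  }
  return { comment: doc.comment + " (edited)" };
});

console.log(attempts, store.get(23));
```

The first attempt is rejected because the revision moved from 1 to 2 underneath it; the replay re-reads the document and builds its edit on top of the competing change instead of overwriting it.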
10. Update Operators in Mongo
• Avoid full document replacement by using operators
• Powerful operators such as $inc, $set, $push
• Many operators can be grouped into single atomic update
• More efficient (data over wire, parsing, etc.)
• Use as much as possible
• http://bit.ly/update-operators
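As an illustration of grouping operators, the sketch below combines `$set`, `$inc`, and `$push` in one update document, which MongoDB applies to the matched document as a single atomic write. The field names `editedBy`, `edits`, and `tags` are hypothetical and do not come from the talk. The `db` call is guarded so the snippet also runs outside the mongo shell.

```javascript
// One update document combining several operators.
// Each operator must target a different field.
const commentQuery = { _id: 23 };
const commentUpdate = {
  $set:  { editedBy: "yann" },   // overwrite a single field
  $inc:  { edits: 1 },           // bump a counter
  $push: { tags: "reviewed" }    // append to an array
};

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.comments.update(commentQuery, commentUpdate);
}
```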
11. Still Need OCC?
A hit counter
{ _id: 1, hits: 5040 }
Edit 1
db.stats.update({ _id: 1 },
{ $set: { hits: 5045 } })
Edit 2
db.stats.update({ _id: 1 },
{ $set: { hits: 5055 } })
12. Still Need OCC?
Edit 1
db.stats.update({ _id: 1 },
{ $inc: { hits: 5 } })
Edit 2
db.stats.update({ _id: 1 },
{ $inc: { hits: 10 } })
• Sequence of updates might vary
• Outcome always the same
• But what if sequence is important?
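The difference between the two hit-counter versions can be checked with a tiny model. This is a sketch only: `applyUpdate` is a hypothetical in-memory imitation of `$set` and `$inc`, not MongoDB itself.

```javascript
// Tiny in-memory model of two operators, for illustration only.
function applyUpdate(doc, update) {
  const next = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) next[field] = value;
  for (const [field, delta] of Object.entries(update.$inc || {})) next[field] = (next[field] || 0) + delta;
  return next;
}

const start = { _id: 1, hits: 5040 };
// Apply a list of updates in order, starting from the same document.
const run = (updates) => updates.reduce((doc, update) => applyUpdate(doc, update), start);

const setA = { $set: { hits: 5045 } };
const setB = { $set: { hits: 5055 } };
const incA = { $inc: { hits: 5 } };
const incB = { $inc: { hits: 10 } };

console.log(run([setA, setB]).hits, run([setB, setA]).hits); // 5055 5045: last writer wins
console.log(run([incA, incB]).hits, run([incB, incA]).hits); // 5055 5055: order does not matter
```

With `$set`, the result depends on which client writes last, and one client's count is silently dropped; with `$inc`, both contributions survive in either order.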
13. Still Need OCC?
• Operators can offset need for concurrency control
• Support for complex atomic manipulation
• Depends on use case
• You’ll need it for
• Opaque changes (e.g. text)
• Complex update logic in app domain
(e.g. changing a value affects some calculated fields)
• Sequence is important and can’t be inferred
14. Update Commands
• Update
• Specify query to match one or more documents
• Use { multi: true } to update multiple documents
• Must call Find() separately if you want a copy of the doc
• FindAndModify
• Update single document only
• Find + Update in single hit (atomic)
• Returns the doc before or after update
• Whole doc or subset
• Upsert (update or insert)
• Important feature. Works with OCC..?
15. Consistent Update Example
• Have a customer document
• Want to set the LastOrderValue and return the previous value
db.customers.findAndModify({
query: { _id: 16, rev: 45 },
update: {
$set: { lastOrderValue: 699 },
$inc: { rev: 1 }
},
new: false
})
16. Consistent Update Example
• Customer has since been updated, or doesn’t exist
• Client should replay
null
• Intended version of customer successfully updated
• Original version is returned
{ _id: 16, rev: 45, lastOrderValue: 145 }
• Useful if client has got partial information and needs the full document
• A separate Find() could introduce inconsistency
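The two outcomes above might be handled on the client roughly as follows. This is a sketch under assumptions: `setLastOrderValue` is a hypothetical helper, and `fakeCustomers` is a hand-written stand-in that imitates only the part of `findAndModify` used here, so the snippet runs without a server. In the mongo shell, `db.customers` would be passed instead.

```javascript
// `collection` is expected to behave like a mongo shell collection:
// findAndModify returns the matched document, or null when nothing matched.
function setLastOrderValue(collection, id, rev, value) {
  const previous = collection.findAndModify({
    query: { _id: id, rev: rev },
    update: { $set: { lastOrderValue: value }, $inc: { rev: 1 } },
    new: false // return the document as it was before the update
  });
  if (previous === null) {
    return { ok: false }; // stale revision or missing customer: re-read and replay
  }
  return { ok: true, previousValue: previous.lastOrderValue };
}

// Minimal fake collection holding one customer (illustration only).
const fakeCustomers = {
  doc: { _id: 16, rev: 45, lastOrderValue: 145 },
  findAndModify({ query, update }) {
    if (this.doc._id !== query._id || this.doc.rev !== query.rev) return null;
    const before = { ...this.doc };
    Object.assign(this.doc, update.$set);
    this.doc.rev += update.$inc.rev;
    return before;
  }
};

const first = setLastOrderValue(fakeCustomers, 16, 45, 699);  // matches rev 45
const second = setLastOrderValue(fakeCustomers, 16, 45, 700); // rev is now 46, so no match
console.log(first, second);
```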
17. Independent Update with Upsert
• Keep stats about customers
• Want to increment NumOrders and return new total
• Customer document might not be there
• Independent operation still needs protection
db.customerStats.findAndModify({
query: { _id: 16 },
update: {
$inc: { numOrders: 1, rev: 1 },
$setOnInsert: { name: "Yann" }
},
new: true,
upsert: true
})
18. Independent Update with Upsert
• First run, document is created
{ _id: 16, numOrders: 1, rev: 1, name: "Yann" }
• Second run, document is updated
{ _id: 16, numOrders: 2, rev: 2, name: "Yann" }
19. Subdocuments
• Common scenario
• e.g. Customer and Orders in single document
• Clients like having everything
• Powerful operators for matching and updating subdocuments
• $elemMatch, $, $addToSet, $push
• Alternatives to "fat" documents:
• Client-side joins
• Aggregation
• MapReduce
20. Concurrency Control and Subdocuments
• Patterns described here still work, but might be impractical
• Docs are large
• More collisions
• Solve with scale?
21. Subdocument Example
• Customer document contains orders
• Want to independently update orders
• Correct order #471 value to £260
{
_id: 16,
rev: 20,
name: "Yann",
orders: {
"471": { id: 471, value: 250, rev: 4 }
}
}
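One way the correction could be written is to match on the order's own `rev` rather than the customer's, using dot notation to reach into the subdocument. This is a sketch under that assumption and may not be the approach the talk goes on to show. `orderQuery` and `orderUpdate` are hypothetical names, and the `db` call is guarded so the snippet also runs outside the mongo shell.

```javascript
// Match the customer only while order 471 is still at revision 4.
const orderQuery = { _id: 16, "orders.471.rev": 4 };

// Change the order's value and bump only the order's revision,
// leaving the customer-level rev and the other orders untouched.
const orderUpdate = {
  $set: { "orders.471.value": 260 },
  $inc: { "orders.471.rev": 1 }
};

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.customers.update(orderQuery, orderUpdate);
}
```

If nothing matches, the order has changed since it was read (or does not exist), and the client should re-read and replay, as in the earlier examples.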
No transactions. Different databases have different features. Cheating is fun. How can we avoid problems entirely? Efficiency is key. Understand what’s achievable in a single update.
All doable in MongoDB. Optimistic usually the best option for a typical MongoDB project. How to roll yo