#MongoSV 2012




Schema Design
-- Inboxes!
Jared Rosoff
Technical Director, 10gen
@forjared
Agenda
• Problem overview
• Design Options
  – Fan out on Read
  – Fan out on Write
  – Fan out on Write with Bucketing

• Conclusions




Problem Overview
Let's get Social
Sending Messages
[Diagram]
Reading my Inbox
[Diagram]
Design Options
3 Approaches (there are more)
• Fan out on Read
• Fan out on Write
• Fan out on Write with Bucketing
Fan out on read
• Generally, not the right approach
• 1 document per message sent
• Multiple recipients in an array key
• Reading an inbox is finding all messages with my own name in the recipient field
• Requires scatter-gather on sharded cluster
• Then a lot of random IO on a shard to find everything
Fan out on Read
// Shard on "from"
sh.shardCollection("myapp.messages", { "from": 1 })

// Make sure we have an index to handle inbox reads
db.messages.createIndex({ "to": 1, "sent": 1 })

msg = {
   from: "Joe",
   to: [ "Bob", "Jane" ],
   sent: new Date(),
   message: "Hi!",
}

// Send a message
db.messages.insertOne(msg)

// Read my inbox
db.messages.find({ to: "Joe" }).sort({ sent: -1 })
Fan out on read – Send Message
[Diagram: Send Message; Shard 1, Shard 2, Shard 3]
Fan out on read – Inbox Read
[Diagram: Read Inbox; Shard 1, Shard 2, Shard 3]
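To make the scatter-gather cost concrete, here is a minimal in-memory sketch of fan out on read. It does not talk to MongoDB: the three arrays standing in for shards and the toy `shardFor` routing function are assumptions made purely for illustration.

```javascript
// In-memory sketch of fan out on read; no MongoDB involved.
// The "cluster" of three arrays and shardFor() are hypothetical stand-ins.
const SHARD_COUNT = 3;
const shards = Array.from({ length: SHARD_COUNT }, () => []);

// Toy shard-key routing: sum of character codes modulo the shard count.
function shardFor(key) {
  let sum = 0;
  for (const ch of key) sum += ch.charCodeAt(0);
  return sum % SHARD_COUNT;
}

// Sending writes exactly one document, on the sender's shard.
function send(msg) {
  shards[shardFor(msg.from)].push(msg);
  return 1; // documents written
}

// Reading has to ask every shard, because the recipient is not the shard key.
function readInbox(user) {
  let shardsQueried = 0;
  const found = [];
  for (const shard of shards) {
    shardsQueried += 1;
    for (const m of shard) {
      if (m.to.includes(user)) found.push(m);
    }
  }
  found.sort((a, b) => b.sent - a.sent); // newest first
  return { messages: found, shardsQueried };
}

send({ from: "Joe", to: ["Bob", "Jane"], sent: new Date(2012, 0, 1), message: "Hi!" });
send({ from: "Jane", to: ["Bob"], sent: new Date(2012, 0, 2), message: "Hello" });

const result = readInbox("Bob");
console.log(result.messages.length, result.shardsQueried);
```

Each send costs one write, but every inbox read touches all three shards no matter where the matching messages live.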
Fan out on write
• Tends to scale better than fan out on read
• 1 document per recipient
• Reading my inbox is just finding all of the messages with me as the recipient
• Can shard on recipient, so inbox reads hit one shard
• But still lots of random IO on the shard
Fan out on Write
// Shard on "recipient" and "sent"
sh.shardCollection("myapp.messages", { "recipient": 1, "sent": 1 })

msg = {
   from: "Joe",
   to: [ "Bob", "Jane" ],
   sent: new Date(),
   message: "Hi!",
}

// Send a message: insert a separate copy for each recipient
// (for...of yields the names; for...in would yield array indexes)
for (const recipient of msg.to) {
     db.messages.insertOne(Object.assign({}, msg, { recipient: recipient }))
}

// Read my inbox
db.messages.find({ recipient: "Joe" }).sort({ sent: -1 })
Fan out on write – Send Message
[Diagram: Send Message; Shard 1, Shard 2, Shard 3]
Fan out on write – Read Inbox
[Diagram: Read Inbox; Shard 1, Shard 2, Shard 3]
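The same idea can be sketched in memory for fan out on write. Again this is only an illustration without MongoDB: the array "shards" and the `shardFor` routing function are hypothetical, and routing uses the recipient alone rather than the compound key.

```javascript
// In-memory sketch of fan out on write; no MongoDB involved.
// The "cluster" of three arrays and shardFor() are hypothetical stand-ins.
const SHARD_COUNT = 3;
const shards = Array.from({ length: SHARD_COUNT }, () => []);

// Toy shard-key routing: sum of character codes modulo the shard count.
function shardFor(key) {
  let sum = 0;
  for (const ch of key) sum += ch.charCodeAt(0);
  return sum % SHARD_COUNT;
}

// Sending writes one copy per recipient, each on that recipient's shard.
function send(msg) {
  let written = 0;
  for (const recipient of msg.to) {
    shards[shardFor(recipient)].push({ ...msg, recipient });
    written += 1;
  }
  return written;
}

// Reading goes to exactly one shard: the one that owns the recipient.
function readInbox(user) {
  return shards[shardFor(user)]
    .filter((m) => m.recipient === user)
    .sort((a, b) => b.sent - a.sent); // newest first
}

const written = send({ from: "Joe", to: ["Bob", "Jane"], sent: new Date(2012, 0, 1), message: "Hi!" });
console.log(written, readInbox("Bob").length);
```

The write cost now grows with the number of recipients, while a read is confined to a single shard.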
Fan out on write with bucketing
• Generally the best approach
• Each “inbox” document is an array of messages
• Append a message onto “inbox” of recipient
• Bucket inbox documents so there are not too many messages per document
• Can shard on recipient, so inbox reads hit one shard
• 1 or 2 documents to read the whole inbox
Fan out on Write with Bucketing
// Shard on "owner / sequence"
sh.shardCollection("myapp.inbox", { "owner": 1, "sequence": 1 })
sh.shardCollection("myapp.users", { "user_name": 1 })
msg = {
     from: "Joe",
     to: [ "Bob", "Jane" ],
     sent: new Date(),
     message: "Hi!",
}
// Send a message
for (const recipient of msg.to) {
     // Atomically bump the recipient's message count, then derive the
     // bucket number (50 per bucket; Math.floor keeps it an integer)
     count = db.users.findAndModify({
           query: { user_name: recipient },
           update: { $inc: { msg_count: 1 } },
           upsert: true,
           new: true }).msg_count
     sequence = Math.floor(count / 50)
     db.inbox.updateOne({ owner: recipient, sequence: sequence },
                        { $push: { messages: msg } },
                        { upsert: true })
}
// Read my inbox
db.inbox.find({ owner: "Joe" }).sort({ sequence: -1 }).limit(2)
Bucketed fan out on write - Send
[Diagram: Send Message; Shard 1, Shard 2, Shard 3]
Bucketed fan out on write - Read
[Diagram: Read Inbox; Shard 1, Shard 2, Shard 3]
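The bucket arithmetic is easier to see with numbers. The following in-memory sketch mirrors the counter-then-append flow, assuming the bucket size of 50 from the slide and an integer bucket number; the `deliver` and `readInbox` helpers and the two Maps are hypothetical stand-ins for the users and inbox collections.

```javascript
// In-memory sketch of bucketed fan out on write; no MongoDB involved.
// counters stands in for the users collection, buckets for the inbox collection.
const BUCKET_SIZE = 50;
const counters = new Map(); // user -> number of messages received so far
const buckets = new Map();  // "owner/sequence" -> { owner, sequence, messages }

// Increment the recipient's counter, then append to the bucket it selects.
function deliver(recipient, msg) {
  const count = (counters.get(recipient) || 0) + 1;
  counters.set(recipient, count);
  // Counts 1..49 map to bucket 0, 50..99 to bucket 1, and so on.
  const sequence = Math.floor(count / BUCKET_SIZE);
  const key = `${recipient}/${sequence}`;
  if (!buckets.has(key)) {
    buckets.set(key, { owner: recipient, sequence, messages: [] });
  }
  buckets.get(key).messages.push(msg);
  return sequence;
}

// Read the newest buckets first; two buckets cover the recent inbox.
function readInbox(owner, limit = 2) {
  return [...buckets.values()]
    .filter((b) => b.owner === owner)
    .sort((a, b) => b.sequence - a.sequence)
    .slice(0, limit);
}

for (let i = 1; i <= 120; i++) {
  deliver("Bob", { from: "Joe", message: `msg ${i}` });
}
const recent = readInbox("Bob");
console.log(recent.map((b) => [b.sequence, b.messages.length]));
// prints [ [ 2, 21 ], [ 1, 50 ] ]
```

Because the counter is incremented before dividing, the very first bucket ends up with 49 messages rather than 50. Reading 71 recent messages here takes two document fetches instead of 71.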
Discussion
Tradeoffs
                 Fan out on Read        Fan out on Write      Bucketed Fan out on Write
Send Message     Best                   Good                  Worst
Performance      Single shard           Shard per recipient   Shard per recipient
                 Single write           Multiple writes       Appends (grows)
Read Inbox       Worst                  Good                  Best
Performance      Broadcast all shards   Single shard          Single shard
                 Random reads           Random reads          Single read
Data Size        Best                   Worst                 Worst
                 Message stored once    Copy per recipient    Copy per recipient
Things to consider
•   Lots of recipients
     •   Fan out on write might become prohibitive
     •   Consider introducing a “Group”

•   Very large message size
     •   Multiple copies of messages can be a burden
     •   Consider single copy of message with a “pointer” per inbox

•   More writes than reads
     •   Fan out on read might be okay
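One way to picture the "pointer" idea for very large messages is the in-memory sketch below. It is an assumption about how such a layout might look, not a design taken from the talk: the `messages` and `inboxes` Maps and the `messageId` field are hypothetical names.

```javascript
// In-memory sketch: store each message body once, and keep only a small
// pointer per recipient inbox. All names here are illustrative assumptions.
const messages = new Map(); // messageId -> full message (stored once)
const inboxes = new Map();  // user -> array of { messageId, sent }
let nextId = 1;

function send(msg) {
  const messageId = nextId++;
  messages.set(messageId, msg);
  for (const recipient of msg.to) {
    if (!inboxes.has(recipient)) inboxes.set(recipient, []);
    // Only the id and the sort field are copied per recipient.
    inboxes.get(recipient).push({ messageId, sent: msg.sent });
  }
  return messageId;
}

// Reading resolves each pointer with an extra lookup per message.
function readInbox(user) {
  const pointers = inboxes.get(user) || [];
  return pointers
    .slice()
    .sort((a, b) => b.sent - a.sent) // newest first
    .map((p) => messages.get(p.messageId));
}

send({ from: "Joe", to: ["Bob", "Jane"], sent: new Date(2012, 0, 1), message: "x".repeat(100000) });
console.log(messages.size, readInbox("Bob").length, readInbox("Jane").length);
```

The large body is stored once regardless of the recipient count, at the price of one extra lookup per message when an inbox is read.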
Comments – where do they live?
Conclusion
Summary
• Multiple ways to model inboxes
• Bucketed fan out on write is typically the better approach
• Think about how your model distributes across shards
• Think about how much random IO needs to happen on a shard




Thank You
Jared Rosoff
Technical Director, 10gen
