SlideShare ist ein Scribd-Unternehmen logo
1 von 27
#MongoSV 2012




Schema Design
-- Inboxes!
Jared Rosoff
Technical Director, 10gen
@forjared
Agenda
• Problem overview
• Design Options
  – Fan out on Read
  – Fan out on Write
  – Fan out on Write with Bucketing

• Conclusions




                         Single Table En
Problem Overview
Let’s get
Social
Sending Messages



               ?
Reading my Inbox



                   ?
Design Options
3 Approaches (there are
more)
• Fan out on Read
• Fan out on Write
• Fan out on Write with Bucketing
Fan out on read
• Generally, not the right approach
• 1 document per message sent
• Multiple recipients in an array key
• Reading an inbox is finding all messages with
 my own name in the recipient field
• Requires scatter-gather on sharded cluster
• Then a lot of random IO on a shard to find
 everything
Fan out on Read
// Shard on “from”
db.shardCollection(”myapp.messages”, { ”from”: 1} )

// Make sure we have an index to handle inbox reads
db.messages.ensureIndex( { ”to”: 1, ”sent”: 1 } )

msg = {
   from: "Joe”,
   to: [ ”Bob”, “Jane” ],
   sent: new Date(),
   message: ”Hi!”,
}

// Send a message
db.messages.save(msg)

// Read my inbox
db.messages.find({ to: ”Joe” }).sort({ sent: -1 })
Fan out on read – Send
Message
             Send
            Message




  Shard 1             Shard 2   Shard 3
Fan out on read – Inbox Read
            Read
            Inbox




  Shard 1           Shard 2   Shard 3
Fan out on write
• Tends to scale better than fan out on read
• 1 document per recipient
• Reading my inbox is just finding all of the
 messages with me as the recipient
• Can shard on recipient, so inbox reads hit one
 shard
• But still lots of random IO on the shard
Fan out on Write
// Shard on “recipient” and “sent”
db.shardCollection(”myapp.messages”, { ”recipient”: 1, ”sent”: 1 } )

msg = {
   from: "Joe”,
   to: [ ”Bob”, “Jane” ],
   sent: new Date(),
   message: ”Hi!”,
}

// Send a message
for( recipient in msg.to ) {
     msg.recipient = recipient
     db.messages.save(msg);
}

// Read my inbox
db.messages.find({ recipient: ”Joe” }).sort({ sent: -1 })
Fan out on write – Send
Message
             Send
            Message




  Shard 1             Shard 2   Shard 3
Fan out on write– Read Inbox
            Read
            Inbox




  Shard 1           Shard 2   Shard 3
Fan out on write with
bucketing
• Generally the best approach
• Each “inbox” document is an array of messages
• Append a message onto “inbox” of recipient
• Bucket inbox documents so there’s not too many
 per document
• Can shard on recipient, so inbox reads hit one
 shard
• 1 or 2 documents to read the whole inbox
Fan out on Write
// Shard on “owner / sequence”
db.shardCollection(”myapp.inbox”, { ”owner”: 1, ”sequence”: 1 } )
db.shardCollection(”myapp.users”, { ”user_name”: 1 } )
msg = {
     from: "Joe”,
     to: [ ”Bob”, “Jane” ],
     sent: new Date(),
     message: ”Hi!”,
}
// Send a message
for( recipient in msg.to) {
     sequence = db.users.findAndModify({
           query: { user_name: recipient},
           update: { '$inc': { ‟msg_count': 1 }},
           upsert: true,
           new: true }).msg_count / 50
     db.inbox.update({ owner: recipient, sequence: sequence},
                        { $push: { „messages‟: msg } },
                        { upsert: true });
}
// Read my inbox
db.inbox.find({ owner: ”Joe” }).sort({ sequence: -1 }).limit(2)
Bucketed fan out on write -
Send
             Send
            Message




  Shard 1             Shard 2   Shard 3
Bucketed fan out on write -
Read
            Read
            Inbox




  Shard 1           Shard 2   Shard 3
Discussion
Tradeoffs
                 Fan out on              Fan out on          Bucketed Fan out
                   Read                    Write                 on Write
Send Message   Best                   Good                  Worst
Performance    Single shard           Shard per recipient   Shard per recipient
               Single write           Multiple writes       Appends (grows)
Read Inbox     Worst                  Good                  Best
Performance    Broadcast all shards   Single shard          Single shard
               Random reads           Random reads          Single read
Data Size      Best                   Worst                 Worst
               Message stored         Copy per recipient    Copy per recipient
               once
Things to consider
•   Lots of recipients
     •   Fan out on write might become prohibitive
     •   Consider introducing a “Group”

•   Very large message size
     •   Multiple copies of messages can be a burden
     •   Consider single copy of message with a “pointer” per inbox

•   More writes than reads
     •   Fan out on read might be okay
Comments – where do they
live?
Conclusion
Summary
• Multiple ways to model status updates
• Bucketed fan out on write is typically the better
 approach
• Think about how your model distributes across
 shards
• Think about how much random IO needs to
 happen on a shard
#MongoSV




Thank You
Jared Rosoff
Technical Director, 10gen

Weitere ähnliche Inhalte

Was ist angesagt?

Managed Feature Store for Machine Learning
Managed Feature Store for Machine LearningManaged Feature Store for Machine Learning
Managed Feature Store for Machine LearningLogical Clocks
 
Microservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and SagaMicroservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and SagaAraf Karsh Hamid
 
CI-Jenkins.pptx
CI-Jenkins.pptxCI-Jenkins.pptx
CI-Jenkins.pptxMEDOBEST1
 
Demystifying observability
Demystifying observability Demystifying observability
Demystifying observability Abigail Bangser
 
Domain Driven Design
Domain Driven DesignDomain Driven Design
Domain Driven DesignYoung-Ho Cho
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking VN
 
Service Mesh - Observability
Service Mesh - ObservabilityService Mesh - Observability
Service Mesh - ObservabilityAraf Karsh Hamid
 
Inception workshop - Kickstarting an Agile project in style
Inception workshop - Kickstarting an Agile project in styleInception workshop - Kickstarting an Agile project in style
Inception workshop - Kickstarting an Agile project in styleJenny Wong
 
Linking Metrics to Logs using Loki
Linking Metrics to Logs using LokiLinking Metrics to Logs using Loki
Linking Metrics to Logs using LokiKnoldus Inc.
 
Kakao meets jira
Kakao meets jiraKakao meets jira
Kakao meets jira호정 이
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introductionchrislusf
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Brian Brazil
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDBNick Court
 
An Introduction To Jenkins
An Introduction To JenkinsAn Introduction To Jenkins
An Introduction To JenkinsKnoldus Inc.
 
Scrum Сhecklist (Russian)
Scrum Сhecklist (Russian)Scrum Сhecklist (Russian)
Scrum Сhecklist (Russian)Artem Glazkov
 
Chaos Engineering for Docker
Chaos Engineering for DockerChaos Engineering for Docker
Chaos Engineering for DockerAlexei Ledenev
 
Scaled Agile Framework (SAFe) Roles and Meetings
Scaled Agile Framework (SAFe) Roles and MeetingsScaled Agile Framework (SAFe) Roles and Meetings
Scaled Agile Framework (SAFe) Roles and MeetingsRob Betcher
 
이벤트 기반 분산 시스템을 향한 여정
이벤트 기반 분산 시스템을 향한 여정이벤트 기반 분산 시스템을 향한 여정
이벤트 기반 분산 시스템을 향한 여정Arawn Park
 

Was ist angesagt? (20)

Managed Feature Store for Machine Learning
Managed Feature Store for Machine LearningManaged Feature Store for Machine Learning
Managed Feature Store for Machine Learning
 
Microservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and SagaMicroservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and Saga
 
CI-Jenkins.pptx
CI-Jenkins.pptxCI-Jenkins.pptx
CI-Jenkins.pptx
 
Demystifying observability
Demystifying observability Demystifying observability
Demystifying observability
 
Domain Driven Design
Domain Driven DesignDomain Driven Design
Domain Driven Design
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellchecking
 
Service Mesh - Observability
Service Mesh - ObservabilityService Mesh - Observability
Service Mesh - Observability
 
Inception workshop - Kickstarting an Agile project in style
Inception workshop - Kickstarting an Agile project in styleInception workshop - Kickstarting an Agile project in style
Inception workshop - Kickstarting an Agile project in style
 
Linking Metrics to Logs using Loki
Linking Metrics to Logs using LokiLinking Metrics to Logs using Loki
Linking Metrics to Logs using Loki
 
Kakao meets jira
Kakao meets jiraKakao meets jira
Kakao meets jira
 
Kafka vs kinesis
Kafka vs kinesisKafka vs kinesis
Kafka vs kinesis
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDB
 
DevSecOps: What Why and How : Blackhat 2019
DevSecOps: What Why and How : Blackhat 2019DevSecOps: What Why and How : Blackhat 2019
DevSecOps: What Why and How : Blackhat 2019
 
An Introduction To Jenkins
An Introduction To JenkinsAn Introduction To Jenkins
An Introduction To Jenkins
 
Scrum Сhecklist (Russian)
Scrum Сhecklist (Russian)Scrum Сhecklist (Russian)
Scrum Сhecklist (Russian)
 
Chaos Engineering for Docker
Chaos Engineering for DockerChaos Engineering for Docker
Chaos Engineering for Docker
 
Scaled Agile Framework (SAFe) Roles and Meetings
Scaled Agile Framework (SAFe) Roles and MeetingsScaled Agile Framework (SAFe) Roles and Meetings
Scaled Agile Framework (SAFe) Roles and Meetings
 
이벤트 기반 분산 시스템을 향한 여정
이벤트 기반 분산 시스템을 향한 여정이벤트 기반 분산 시스템을 향한 여정
이벤트 기반 분산 시스템을 향한 여정
 

Andere mochten auch

MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMike Friedman
 
Building a Social Network with MongoDB
  Building a Social Network with MongoDB  Building a Social Network with MongoDB
Building a Social Network with MongoDBFred Chu
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema DesignMongoDB
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDBlehresman
 
The Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDBThe Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDBMongoDB
 
Common MongoDB Use Cases
Common MongoDB Use Cases Common MongoDB Use Cases
Common MongoDB Use Cases MongoDB
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB
 
MongoDB Schema Design by Examples
MongoDB Schema Design by ExamplesMongoDB Schema Design by Examples
MongoDB Schema Design by ExamplesHadi Ariawan
 
Schema Design with MongoDB
Schema Design with MongoDBSchema Design with MongoDB
Schema Design with MongoDBrogerbodamer
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationMongoDB
 
The Rise of Microservices
The Rise of MicroservicesThe Rise of Microservices
The Rise of MicroservicesMongoDB
 
The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js
The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.jsThe MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js
The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.jsMongoDB
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB
 
Web performance meetup bos 11 18-2010
Web performance meetup bos 11 18-2010Web performance meetup bos 11 18-2010
Web performance meetup bos 11 18-2010Jared Rosoff
 
Back to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documentsBack to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documentsMongoDB
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema Designaaronheckmann
 
MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)
MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)
MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)Jetlore
 
Schema Design
Schema DesignSchema Design
Schema DesignMongoDB
 
Schema Design at Scale
Schema Design at ScaleSchema Design at Scale
Schema Design at ScaleRick Copeland
 
An afternoon with mongo db new delhi
An afternoon with mongo db new delhiAn afternoon with mongo db new delhi
An afternoon with mongo db new delhiRajnish Verma
 

Andere mochten auch (20)

MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World Examples
 
Building a Social Network with MongoDB
  Building a Social Network with MongoDB  Building a Social Network with MongoDB
Building a Social Network with MongoDB
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema Design
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDB
 
The Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDBThe Right (and Wrong) Use Cases for MongoDB
The Right (and Wrong) Use Cases for MongoDB
 
Common MongoDB Use Cases
Common MongoDB Use Cases Common MongoDB Use Cases
Common MongoDB Use Cases
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
 
MongoDB Schema Design by Examples
MongoDB Schema Design by ExamplesMongoDB Schema Design by Examples
MongoDB Schema Design by Examples
 
Schema Design with MongoDB
Schema Design with MongoDBSchema Design with MongoDB
Schema Design with MongoDB
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital Transformation
 
The Rise of Microservices
The Rise of MicroservicesThe Rise of Microservices
The Rise of Microservices
 
The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js
The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.jsThe MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js
The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js
 
MongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: ShardingMongoDB for Time Series Data Part 3: Sharding
MongoDB for Time Series Data Part 3: Sharding
 
Web performance meetup bos 11 18-2010
Web performance meetup bos 11 18-2010Web performance meetup bos 11 18-2010
Web performance meetup bos 11 18-2010
 
Back to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documentsBack to Basics 1: Thinking in documents
Back to Basics 1: Thinking in documents
 
MongoDB Schema Design
MongoDB Schema DesignMongoDB Schema Design
MongoDB Schema Design
 
MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)
MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)
MongoDB Schema Design: Insights and Tradeoffs (Jetlore's talk at MongoSF 2012)
 
Schema Design
Schema DesignSchema Design
Schema Design
 
Schema Design at Scale
Schema Design at ScaleSchema Design at Scale
Schema Design at Scale
 
An afternoon with mongo db new delhi
An afternoon with mongo db new delhiAn afternoon with mongo db new delhi
An afternoon with mongo db new delhi
 

Ähnlich wie MongoDB Advanced Schema Design - Inboxes

Data Modeling Deep Dive
Data Modeling Deep DiveData Modeling Deep Dive
Data Modeling Deep DiveMongoDB
 
Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard keyMongoDB
 
MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesLewis Lin 🦊
 
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...MongoDB
 
Data Modeling for the Real World
Data Modeling for the Real WorldData Modeling for the Real World
Data Modeling for the Real WorldMike Friedman
 
Data Modeling Examples from the Real World
Data Modeling Examples from the Real WorldData Modeling Examples from the Real World
Data Modeling Examples from the Real WorldMongoDB
 
Webinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real WorldWebinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real WorldMongoDB
 
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...MongoDB
 

Ähnlich wie MongoDB Advanced Schema Design - Inboxes (8)

Data Modeling Deep Dive
Data Modeling Deep DiveData Modeling Deep Dive
Data Modeling Deep Dive
 
Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard key
 
MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World Examples
 
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
MongoDB San Francisco 2013: Data Modeling Examples From the Real World presen...
 
Data Modeling for the Real World
Data Modeling for the Real WorldData Modeling for the Real World
Data Modeling for the Real World
 
Data Modeling Examples from the Real World
Data Modeling Examples from the Real WorldData Modeling Examples from the Real World
Data Modeling Examples from the Real World
 
Webinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real WorldWebinar: Data Modeling Examples in the Real World
Webinar: Data Modeling Examples in the Real World
 
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
MongoDB London 2013: Data Modeling Examples from the Real World presented by ...
 

Mehr von Jared Rosoff

Mongosv 2011 - Sharding
Mongosv 2011 - ShardingMongosv 2011 - Sharding
Mongosv 2011 - ShardingJared Rosoff
 
Mongosv 2011 - Replication
Mongosv 2011 - ReplicationMongosv 2011 - Replication
Mongosv 2011 - ReplicationJared Rosoff
 
Mongosv 2011 - MongoDB on Amazon EC2
Mongosv 2011 - MongoDB on Amazon EC2Mongosv 2011 - MongoDB on Amazon EC2
Mongosv 2011 - MongoDB on Amazon EC2Jared Rosoff
 
MongoDB Deployment Tips
MongoDB Deployment TipsMongoDB Deployment Tips
MongoDB Deployment TipsJared Rosoff
 
Scaling with mongo db - SF Mongo User Group 7-19-2011
Scaling with mongo db - SF Mongo User Group 7-19-2011Scaling with mongo db - SF Mongo User Group 7-19-2011
Scaling with mongo db - SF Mongo User Group 7-19-2011Jared Rosoff
 
MongoDB on EC2 and EBS
MongoDB on EC2 and EBSMongoDB on EC2 and EBS
MongoDB on EC2 and EBSJared Rosoff
 
Indexing & query optimization
Indexing & query optimizationIndexing & query optimization
Indexing & query optimizationJared Rosoff
 
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on RailsScalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on RailsJared Rosoff
 

Mehr von Jared Rosoff (8)

Mongosv 2011 - Sharding
Mongosv 2011 - ShardingMongosv 2011 - Sharding
Mongosv 2011 - Sharding
 
Mongosv 2011 - Replication
Mongosv 2011 - ReplicationMongosv 2011 - Replication
Mongosv 2011 - Replication
 
Mongosv 2011 - MongoDB on Amazon EC2
Mongosv 2011 - MongoDB on Amazon EC2Mongosv 2011 - MongoDB on Amazon EC2
Mongosv 2011 - MongoDB on Amazon EC2
 
MongoDB Deployment Tips
MongoDB Deployment TipsMongoDB Deployment Tips
MongoDB Deployment Tips
 
Scaling with mongo db - SF Mongo User Group 7-19-2011
Scaling with mongo db - SF Mongo User Group 7-19-2011Scaling with mongo db - SF Mongo User Group 7-19-2011
Scaling with mongo db - SF Mongo User Group 7-19-2011
 
MongoDB on EC2 and EBS
MongoDB on EC2 and EBSMongoDB on EC2 and EBS
MongoDB on EC2 and EBS
 
Indexing & query optimization
Indexing & query optimizationIndexing & query optimization
Indexing & query optimization
 
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on RailsScalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
 

Kürzlich hochgeladen

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 

Kürzlich hochgeladen (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

MongoDB Advanced Schema Design - Inboxes

  • 1. #MongoSV 2012 Schema Design -- Inboxes! Jared Rosoff Technical Director, 10gen @forjared
  • 2. Agenda • Problem overview • Design Options – Fan out on Read – Fan out on Write – Fan out on Write with Bucketing • Conclusions Single Table En
  • 8. 3 Approaches (there are more) • Fan out on Read • Fan out on Write • Fan out on Write with Bucketing
  • 9. Fan out on read • Generally, not the right approach • 1 document per message sent • Multiple recipients in an array key • Reading an inbox is finding all messages with my own name in the recipient field • Requires scatter-gather on sharded cluster • Then a lot of random IO on a shard to find everything
  • 10. Fan out on Read // Shard on “from” db.shardCollection(”myapp.messages”, { ”from”: 1} ) // Make sure we have an index to handle inbox reads db.messages.ensureIndex( { ”to”: 1, ”sent”: 1 } ) msg = { from: "Joe”, to: [ ”Bob”, “Jane” ], sent: new Date(), message: ”Hi!”, } // Send a message db.messages.save(msg) // Read my inbox db.messages.find({ to: ”Joe” }).sort({ sent: -1 })
  • 11. Fan out on read – Send Message Send Message Shard 1 Shard 2 Shard 3
  • 12. Fan out on read – Inbox Read Read Inbox Shard 1 Shard 2 Shard 3
  • 13. Fan out on write • Tends to scale better than fan out on read • 1 document per recipient • Reading my inbox is just finding all of the messages with me as the recipient • Can shard on recipient, so inbox reads hit one shard • But still lots of random IO on the shard
  • 14. Fan out on Write // Shard on “recipient” and “sent” db.shardCollection(”myapp.messages”, { ”recipient”: 1, ”sent”: 1 } ) msg = { from: "Joe”, to: [ ”Bob”, “Jane” ], sent: new Date(), message: ”Hi!”, } // Send a message for( recipient in msg.to ) { msg.recipient = recipient db.messages.save(msg); } // Read my inbox db.messages.find({ recipient: ”Joe” }).sort({ sent: -1 })
  • 15. Fan out on write – Send Message Send Message Shard 1 Shard 2 Shard 3
  • 16. Fan out on write– Read Inbox Read Inbox Shard 1 Shard 2 Shard 3
  • 17. Fan out on write with bucketing • Generally the best approach • Each “inbox” document is an array of messages • Append a message onto “inbox” of recipient • Bucket inbox documents so there’s not too many per document • Can shard on recipient, so inbox reads hit one shard • 1 or 2 documents to read the whole inbox
  • 18. Fan out on Write // Shard on “owner / sequence” db.shardCollection(”myapp.inbox”, { ”owner”: 1, ”sequence”: 1 } ) db.shardCollection(”myapp.users”, { ”user_name”: 1 } ) msg = { from: "Joe”, to: [ ”Bob”, “Jane” ], sent: new Date(), message: ”Hi!”, } // Send a message for( recipient in msg.to) { sequence = db.users.findAndModify({ query: { user_name: recipient}, update: { '$inc': { ‟msg_count': 1 }}, upsert: true, new: true }).msg_count / 50 db.inbox.update({ owner: recipient, sequence: sequence}, { $push: { „messages‟: msg } }, { upsert: true }); } // Read my inbox db.inbox.find({ owner: ”Joe” }).sort({ sequence: -1 }).limit(2)
  • 19. Bucketed fan out on write - Send Send Message Shard 1 Shard 2 Shard 3
  • 20. Bucketed fan out on write - Read Read Inbox Shard 1 Shard 2 Shard 3
  • 22. Tradeoffs Fan out on Fan out on Bucketed Fan out Read Write on Write Send Message Best Good Worst Performance Single shard Shard per recipient Shard per recipient Single write Multiple writes Appends (grows) Read Inbox Worst Good Best Performance Broadcast all shards Single shard Single shard Random reads Random reads Single read Data Size Best Worst Worst Message stored Copy per recipient Copy per recipient once
  • 23. Things to consider • Lots of recipients • Fan out on write might become prohibitive • Consider introducing a “Group” • Very large message size • Multiple copies of messages can be a burden • Consider single copy of message with a “pointer” per inbox • More writes than reads • Fan out on read might be okay
  • 24. Comments – where do they live?
  • 26. Summary • Multiple ways to model status updates • Bucketed fan out on write is typically the better approach • Think about how your model distributes across shards • Think about how much random IO needs to happen on a shard