#MDBW17
Daniel Coupal, Senior Curriculum Engineer
ADVANCED SCHEMA DESIGN
PATTERNS
WHO AM I?
{ "name": "Daniel Coupal",
"jobs_at_MongoDB": [
{ "job": "Senior Curriculum Engineer",
"from": new Date("2016-11") },
{ "job": "Senior Technical Service Engineer",
"from": new Date("2013-11") }
],
"previous_jobs": [
"Consultant",
"Developer",
"Manager Quality & Tools Team",
"Manager Software Team",
"Tools Developer"
],
"likes": [ "food", "beers", "movies", "MongoDB"
]
}
PATTERN
• The "Gang of Four": a design pattern systematically names, explains,
and evaluates an important and recurring design in object-oriented
systems
• MongoDB systems can also be built using their own patterns
WHY THIS TALK?
1) Enable teams to use a common methodology and vocabulary
when designing schemas for MongoDB
2) Give you the ability to model schemas using building blocks
3) Less art and more methodology
WMDB -
WORLD MOVIE DATABASE
Any events, characters and
entities depicted in this
presentation are fictional.
Any resemblance or similarity
to reality is entirely
coincidental
WMDB -
WORLD MOVIE DATABASE
First iteration
3 collections:
A. movies
B. moviegoers
C. screenings
MISSION POSSIBLE
Our mission, should we
decide to accept it, is to fix
this solution, so it can
perform well and scale.
As always, should I or
anyone in the audience do it
without training, WMDB will
disavow any knowledge of
our actions.
This tape will self-destruct
in five seconds. Good luck!
WHY WE CREATE MODELS
Ensure:
• Good performance
• Scalability
despite a set of constraints ➡
• Hardware
‒ RAM faster than Disk
‒ Disk cheaper than RAM
‒ Network latency
‒ Reduce costs $$$
• Database Server
‒ Maximum size for a document
‒ Atomicity of a write
• Data set
‒ Size of data
HOWEVER …
• Don’t over-design!
• Design for:
‒ Performance
‒ Scalability
‒ Simplicity
CATEGORIES OF PATTERNS
• Representation
‒ Attribute ✓
‒ Tree
‒ Pre-Allocation
• Frequency of access
‒ Subset ✓
‒ Approximation ✓
• Grouping
‒ Computed ✓
‒ Overflow ✓
‒ Bucket
ISSUE #1: TOO MANY OPTIONAL FIELDS
{
title: "Moonlight",
...
release_USA: "2016/09/02",
release_Mexico: "2017/01/27",
release_France: "2017/02/01",
release_Festival_Mill_Valley: "2017/10/10"
}
Would need the following indexes:
{ release_USA: 1 }
{ release_Mexico: 1 }
{ release_France: 1 }
...
{ release_Festival_Mill_Valley: 1 }
...
PATTERN #1: ATTRIBUTES
• Easy to index, for example:
{
"releases.location":1,
"releases.date":1
}
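A minimal sketch of the reshaping that makes this single index possible, assuming the flat release_* fields from Issue #1. The helper name toAttributePattern is hypothetical; the releases, location, and date names follow the index above. It runs in plain JavaScript, without a database.

```javascript
// Hypothetical helper: move every release_<location> field into one
// "releases" array of { location, date } pairs.
const PREFIX = "release_";

function toAttributePattern(movie) {
  const rest = {};
  const releases = [];
  for (const [key, value] of Object.entries(movie)) {
    if (key.startsWith(PREFIX)) {
      releases.push({ location: key.slice(PREFIX.length), date: value });
    } else {
      rest[key] = value; // fields that are not release dates stay as-is
    }
  }
  return { ...rest, releases };
}

const flat = {
  title: "Moonlight",
  release_USA: "2016/09/02",
  release_Mexico: "2017/01/27",
  release_France: "2017/02/01",
};

const reshaped = toAttributePattern(flat);
console.log(JSON.stringify(reshaped, null, 2));
```

Every location now lives under the same two field names, so one compound index can cover all of them.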
PATTERN #1: ATTRIBUTES
Problem:
• Fields present in only a small subset of documents
• Lots of those fields
• Common characteristic to search across those fields together
Use cases:
• Product attributes like ‘color’, ‘size’, ‘dimensions’, ...
• Release dates of a movie in different countries, festivals
SUMMARY: ATTRIBUTES
Solution:
• Field pairs in an array
• Easy to extend with a qualifier, for example:
‒ {descriptor: "price", qualifier: "euros", value: NumberDecimal("100.00")}
Benefits:
• Allows for a non-deterministic list of attributes
• Easy to index
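To illustrate the qualifier idea, here is a small in-memory sketch. The attributes array name, the sample product, and the findAttribute helper are assumptions made for illustration. In MongoDB itself, matching several fields on the same array element is the job of the $elemMatch query operator.

```javascript
// Hypothetical product using descriptor / qualifier / value entries.
const product = {
  name: "T-shirt",
  attributes: [
    { descriptor: "price", qualifier: "euros", value: 100 },
    { descriptor: "price", qualifier: "dollars", value: 110 },
    { descriptor: "color", value: "blue" },
  ],
};

// Return the first attribute for which every criterion holds on the
// same array element, or undefined when nothing matches.
function findAttribute(doc, criteria) {
  return doc.attributes.find((attr) =>
    Object.entries(criteria).every(([field, wanted]) => attr[field] === wanted)
  );
}

const priceInEuros = findAttribute(product, {
  descriptor: "price",
  qualifier: "euros",
});
console.log(priceInEuros.value);
```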
ISSUE #2: WORKING SET DOESN’T FIT IN
RAM
Possible solutions:
A. Reduce the size of your working set
B. Add more RAM per machine
C. Start sharding or add more shards
WHY CAN’T WE
HAVE MORE RAM?
Elon Musk is buying all the
metal for his colony on Mars
PATTERN #2: SUBSET
In this example, we can:
• Limit the list of actors and
crew to 20
• Limit the embedded
reviews to the top 20
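One way to picture the split, as a hedged sketch: the main document embeds only the top 20 reviews, while the full list goes to a separate collection. The applySubset helper, the top_reviews field, and the rating-based ordering are assumptions; the slide only gives the limit of 20.

```javascript
// Assumed limit, taken from the "top 20" figure above.
const SUBSET_SIZE = 20;

// Split a movie and all of its reviews into:
//  - mainDoc: the movie with only the highest-rated reviews embedded
//  - reviewDocs: every review, destined for a separate collection
function applySubset(movie, reviews) {
  const sorted = [...reviews].sort((a, b) => b.rating - a.rating);
  const mainDoc = { ...movie, top_reviews: sorted.slice(0, SUBSET_SIZE) };
  const reviewDocs = sorted.map((r) => ({ movie_id: movie._id, ...r }));
  return { mainDoc, reviewDocs };
}

// Sample data: 50 hypothetical reviews with ratings 0..49.
const reviews = Array.from({ length: 50 }, (_, i) => ({
  text: `review ${i}`,
  rating: i,
}));

const { mainDoc, reviewDocs } = applySubset(
  { _id: 1, title: "Moonlight" },
  reviews
);
console.log(mainDoc.top_reviews.length, reviewDocs.length);
```

The embedded reviews are duplicates of entries in the full list, which is what keeps the main document small while still serving the main page from a single read.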
PATTERN #2: SUBSET
Problem:
• There is a 1-N or N-N relationship, and only a few of the related
documents always need to be shown
• Only infrequently do you need to pull all of the dependent
documents
Use cases:
• Main actors of a movie
• List of reviews or comments
SUMMARY: SUBSET
Solution:
• Keep duplicates of a small subset of fields in the main collection
Benefits:
• Allows for fast data retrieval and a reduced working set size
• One query brings all the information needed for the "main page"
PATTERN ASPECT: CONSISTENCY
• How duplication is handled
A. Update both source and target in real time
B. Update target from source at regular intervals. Examples:
o Most popular items => update nightly
o Revenues from a movie => update every hour
o Last 10 reviews => update hourly? daily?
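Option B can be sketched as a function that rebuilds the duplicated data (target) from the full list (source), to be called on whatever schedule fits. The refreshLastReviews name, the last_reviews and refreshed_at fields, and the numeric posted timestamp are all assumptions for illustration.

```javascript
// Rebuild the embedded copy (target) from the full list (source).
// Assumes each review carries a numeric "posted" timestamp.
function refreshLastReviews(movie, allReviews, limit = 10) {
  const latest = allReviews
    .filter((r) => r.movie_id === movie._id) // filter() returns a copy
    .sort((a, b) => b.posted - a.posted)     // newest first
    .slice(0, limit);
  return { ...movie, last_reviews: latest, refreshed_at: new Date() };
}

// Sample source: 12 reviews for movie 1 and one for another movie.
const source = Array.from({ length: 12 }, (_, i) => ({
  movie_id: 1,
  text: `review ${i}`,
  posted: i,
}));
source.push({ movie_id: 2, text: "other movie", posted: 99 });

const target = refreshLastReviews({ _id: 1, title: "Your Name" }, source);
console.log(target.last_reviews.length);
```

Between two refreshes the target may lag behind the source; the interval is chosen so that this staleness is acceptable for the data in question.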
ISSUE #3: REPEATED COMPUTATIONS
{
title: "Your Name",
...
viewings: 5000,
viewers: 385000,
revenues: 5074800
}
PATTERN #3: COMPUTED
For example:
• Apply a sum, count, ...
• Roll up data by minute, hour,
day
• As long as you don’t mess
with your source, you can
recreate the rollups
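A small sketch of such a rollup, assuming one source document per screening. The sample screenings and the rollupByMovie helper are hypothetical; the viewings, viewers, and revenues field names follow the document shown in Issue #3.

```javascript
// Hypothetical source documents: one per screening.
const screenings = [
  { movie: "Your Name", viewers: 120, revenue: 1500 },
  { movie: "Your Name", viewers: 80, revenue: 1000 },
  { movie: "Moonlight", viewers: 50, revenue: 600 },
];

// Roll the screenings up into one computed document per movie.
// The source array is only read, so the rollup can be recreated anytime.
function rollupByMovie(docs) {
  const totals = new Map();
  for (const s of docs) {
    const t =
      totals.get(s.movie) ||
      { title: s.movie, viewings: 0, viewers: 0, revenues: 0 };
    t.viewings += 1;
    t.viewers += s.viewers;
    t.revenues += s.revenue;
    totals.set(s.movie, t);
  }
  return [...totals.values()];
}

const computed = rollupByMovie(screenings);
console.log(computed);
```

Readers then fetch the stored totals instead of summing the screenings on every request.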
PATTERN #3: COMPUTED
Problem:
• There is data that needs to be computed
• The same calculations would happen over and over
• Reads outnumber writes:
‒ example: 1K writes per hour vs 1M reads per hour
Use cases:
• Have revenues per movie showing, want to display sums
• Time series data, Event Sourcing
SUMMARY: COMPUTED
Solution:
• Apply a computation or operation on data and store the result
Benefits:
• Avoid re-computing the same thing over and over
• Replaces a view
ISSUE #4: APPROXIMATE VALUES
PATTERN #4: APPROXIMATION
• Only increment once in X
iterations
• Increment by X
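A possible reading of these two bullets, sketched in plain JavaScript: each event triggers a write with probability 1/X, and that write adds X. Using a random draw (rather than counting every Xth event) is an assumption, as are the makeApproxCounter name and the count and writes fields.

```javascript
// Approximate counter: on each event, write only with probability 1/x,
// and add x when we do. "random" is injectable to make the sketch testable.
function makeApproxCounter(x, random = Math.random) {
  const state = { count: 0, writes: 0 };
  return {
    state,
    hit() {
      if (random() < 1 / x) {
        state.count += x; // one larger write stands in for x events
        state.writes += 1;
      }
    },
  };
}

// 100,000 simulated page views, counted with x = 100.
const counter = makeApproxCounter(100);
for (let i = 0; i < 100000; i++) counter.hit();
console.log(counter.state); // count is near 100000, writes near 1000
```

The stored count is only close to the true value, but the number of writes drops by roughly a factor of X.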
PATTERN #4: APPROXIMATION
Problem:
• Data is difficult to calculate correctly
• May be too expensive to update the document every time to keep
an exact count
• No one gives a damn if the number is exact
Use cases:
• Population of a country
• Web site visits
SUMMARY: APPROXIMATION
Solution:
• Fewer, stronger writes
Benefits:
• Fewer writes, reducing contention on some documents
ISSUE #5: OUTLIERS DRIVING OUR
SOLUTION
• Trying to model for the worst case
I WANT TO BE AN EXTRA!
• Not the best way
to be noticed
PATTERN #5: OVERFLOW
Each group of extras is put
in a bucket of 1000.
If we fill a bucket, we create
a new one.
Also known as the "Justin
Bieber" pattern
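The bucket-filling rule can be sketched as follows, with an in-memory array standing in for a collection. The addExtra helper and the movie_id, bucket, and extras field names are hypothetical; only the bucket size of 1000 comes from the slide.

```javascript
const BUCKET_SIZE = 1000; // figure from the slide

// Append an extra to the last bucket, creating a new bucket when the
// current one is full. "buckets" stands in for a collection and is
// assumed to hold the buckets of a single movie.
function addExtra(buckets, movieId, name) {
  let last = buckets[buckets.length - 1];
  if (!last || last.extras.length >= BUCKET_SIZE) {
    last = { movie_id: movieId, bucket: buckets.length + 1, extras: [] };
    buckets.push(last);
  }
  last.extras.push(name);
  return buckets;
}

const buckets = [];
for (let i = 0; i < 2500; i++) addExtra(buckets, 1, `extra ${i}`);
console.log(buckets.map((b) => b.extras.length)); // [ 1000, 1000, 500 ]
```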
PATTERN #5: OVERFLOW
Problem:
• There is a 1-N relationship
• N can be embedded or referenced, except for a few outliers
• The list of references may not even fit into an array
• You don’t want the outliers to drive your overall design
Use cases:
• Some very popular people with a huge list of followers
• Movie with a ton of actors
SUMMARY: OVERFLOW
Solution:
• Have a field marking a document as an outlier
• Do different queries for the outliers
Benefits:
• The design is not driven by a few outliers. However, you will need to
handle the outliers on the application side
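On the application side, the outlier handling might look like this sketch. The has_overflow flag, the cast field, the castOverflow documents, and the fullCast helper are illustrative names, not taken from the talk.

```javascript
// Hypothetical documents: most movies embed their whole cast; an
// outlier sets has_overflow and keeps the rest in overflow documents.
const movies = [
  { _id: 1, title: "Small Film", cast: ["A", "B"] },
  { _id: 2, title: "Epic Film", cast: ["C", "D"], has_overflow: true },
];
const castOverflow = [
  { movie_id: 2, cast: ["E", "F"] },
  { movie_id: 2, cast: ["G"] },
];

// Read the full cast, paying for the extra lookup only on outliers.
function fullCast(movie, overflowDocs) {
  if (!movie.has_overflow) return movie.cast; // common case: one read
  const extra = overflowDocs
    .filter((o) => o.movie_id === movie._id)
    .flatMap((o) => o.cast);
  return [...movie.cast, ...extra];
}

for (const m of movies) {
  console.log(m.title, fullCast(m, castOverflow).length);
}
```

Typical documents keep the simple embedded design, and only flagged documents take the slower path.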
OTHER PATTERNS
• Bucket
• Pre-allocation
• Tree(s)
BACK TO REALITY
TAKEAWAYS
• Simple grouping from tables to collections is not optimal
• Learn a common vocabulary for designing schemas with MongoDB
• Use patterns as "plug-and-play" for your future designs
‒ Attribute
‒ Subset
‒ Computed
‒ Approximation
‒ Overflow
REFERENCES FOR COMPLETE SOLUTIONS
A full design example for a given
problem:
• E-commerce site
• Content Management System
• Social Networking
• Single view
• …
HOW CAN I LEARN MORE ABOUT SCHEMA
DESIGN?
• More patterns in the published form of this presentation
• MongoDB in-person training courses on Schema Design
• Upcoming Online course at
MongoDB University:
‒ https://university.mongodb.com
‒ M220 Data Modeling
THANK YOU FOR USING MONGODB!
daniel.coupal@mongodb.com
Advanced Schema Design Patterns for MongoDB

My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Advanced Schema Design Patterns for MongoDB

  • 1. #MDBW17 Daniel Coupal, Senior Curriculum Engineer ADVANCED SCHEMA DESIGN PATTERNS
  • 2. #MDBW17 WHO AM I? { "name": "Daniel Coupal", "jobs_at_MongoDB": [ { "job": "Senior Curriculum Engineer", "from": new Date("2016-11") }, { "job": "Senior Technical Service Engineer", "from": new Date("2013-11") } ], "previous_jobs": [ "Consultant", "Developer", "Manager Quality & Tools Team", "Manager Software Team", "Tools Developer" ], "likes": [ "food", "beers", "movies", "MongoDB" ] }
  • 3. #MDBW17 PATTERN • The "Gang of Four": A design pattern systematically names, explains, and evaluates an important and recurring design in object-oriented systems • MongoDB systems can also be built using its own patterns
  • 4. #MDBW17 WHY THIS TALK? 1) Enable teams to use a common methodology and vocabulary when designing schemas for MongoDB 2) Giving you the ability to model schemas using building blocks 3) Less art and more methodology
  • 5. #MDBW17 WMDB - WORLD MOVIE DATABASE Any events, characters and entities depicted in this presentation are fictional. Any resemblance or similarity to reality is entirely coincidental
  • 6. #MDBW17 WMDB - WORLD MOVIE DATABASE First iteration 3 collections: A. movies B. moviegoers C. screenings
  • 7. #MDBW17 MISSION POSSIBLE Our mission, should we decide to accept it, is to fix this solution, so it can perform well and scale. As always, should I or anyone in the audience do it without training, WMDB will disavow any knowledge of our actions. This tape will self-destruct in five seconds. Good luck!
  • 8. #MDBW17 WHY WE CREATE MODELS Ensure: • Good performance • Scalability despite a set of constraints ➡ • Hardware ‒ RAM faster than Disk ‒ Disk cheaper than RAM ‒ Network latency ‒ Reduce costs $$$ • Database Server ‒ Maximum size for a document ‒ Atomicity of a write • Data set ‒ Size of data
  • 9. #MDBW17 HOWEVER … • Don’t over-design! • Design for: ‒ Performance ‒ Scalability ‒ Simplicity
  • 10. #MDBW17 CATEGORIES OF PATTERNS • Representation ‒ Attribute ✓ ‒ Tree ‒ Pre-Allocation • Frequency of access ‒ Subset ✓ ‒ Approximation ✓ • Grouping ‒ Computed ✓ ‒ Overflow ✓ ‒ Bucket
  • 11. #MDBW17 ISSUE #1: TOO MANY OPTIONAL FIELDS { title: "Moonlight", ... release_USA: "2016/09/02", release_Mexico: "2017/01/27", release_France: "2017/02/01", release_Festival_Mill_Valley: "2017/10/10" } Would need the following indexes: { release_USA: 1 } { release_Mexico: 1 } { release_France: 1 } ... { release_Festival_Mill_Valley: 1 } ...
  • 12. #MDBW17 PATTERN #1: ATTRIBUTES • Easy to index, for example: { "releases.location":1, "releases.date":1 }
  • 13. #MDBW17 PATTERN #1: ATTRIBUTES Problem: • Fields present in only a small subset of documents • Lots of those fields • Common characteristic to search across those fields together Use cases: • Product attributes like ‘color’, ‘size’, ‘dimensions’, ... • Release dates of a movie in different countries, festivals
  • 14. #MDBW17 SUMMARY: ATTRIBUTES Solution: • Field pairs in an array • Easy to extend with a qualifier, for example: ‒ {descriptor: "price", qualifier: "euros", value: Decimal(100.00)} Benefits: • Allow for non deterministic list of attributes • Easy to index
  • 15. #MDBW17 ISSUE #2: WORKING SET DOESN’T FIT IN RAM Possible solutions: A. Reduce the size of your working set B. Add more RAM per machine C. Start sharding or add more shards
  • 16. #MDBW17 WHY CAN’T WE HAVE MORE RAM? Elon Musk is buying all the metal for his colony on Mars
  • 17. #MDBW17 PATTERN #2: SUBSET In this example, we can: • Limit the list of actors and crew to 20 • Limit the embedded reviews to the top 20
  • 18. #MDBW17 PATTERN #2: SUBSET Problem: • There is a 1-N or N-N relationship, and only a few of the related documents always need to be shown • Only infrequently do you need to pull all of the dependent documents Use cases: • Main actors of a movie • List of reviews or comments
  • 19. #MDBW17 SUMMARY: SUBSET Solution: • Keep duplicates of a small subset of fields in the main collection Benefits: • Allows for fast data retrieval and a reduced working set size • One query brings all the information needed for the "main page"
  • 20.
  • 21. #MDBW17 PATTERN ASPECT: CONSISTENCY • How duplication is handled A. Update both source and target in real time B. Update target from source at regular intervals. Examples: o Most popular items => update nightly o Revenues from a movie => update every hour o Last 10 reviews => update hourly? daily?
  • 22. #MDBW17 ISSUE #3: REPEATED COMPUTATIONS { title: "Your Name", ... viewings: 5,000 viewers: 385,000 revenues: 5,074,800 }
  • 23. #MDBW17 PATTERN #3: COMPUTED For example: • Apply a sum, count, ... • rollup data by minute, hour, day • As long as you don’t mess with your source, you can recreate the rollups
  • 24. #MDBW17 PATTERN #3: COMPUTED Problem: • There is data that needs to be computed • The same calculations would happen over and over • Reads outnumber writes: ‒ example: 1K writes per hour vs 1M read per hour Use cases: • Have revenues per movie showing, want to display sums • Time series data, Event Sourcing
  • 25. #MDBW17 SUMMARY: COMPUTED Solution: • Apply a computation or operation on data and store the result Benefits: • Avoid re-computing the same thing over and over • Replaces a view
  • 27. #MDBW17 PATTERN #4: APPROXIMATION • Only increment once in X iterations • Increment by X
  • 28. #MDBW17 PATTERN #4: APPROXIMATION Problem: • Data is difficult to calculate correctly • May be too expensive to update the document every time to keep an exact count • No one gives a damn if the number is exact Use cases: • Population of a country • Web site visits
  • 29. #MDBW17 SUMMARY: APPROXIMATION Solution: • Fewer, larger writes Benefits: • Fewer writes, reducing contention on some documents
  • 30. #MDBW17 ISSUE #5: OUTLIERS DRIVING OUR SOLUTION • Trying to model for the worst case
  • 31. #MDBW17 I WANT TO BE AN EXTRA! • Not the best way to be noticed
  • 32. #MDBW17 PATTERN #5: OVERFLOW Each group of extras is put in a bucket of 1000. If we fill a bucket, we create a new one. Also known as the "Justin Bieber" pattern
  • 33. #MDBW17 PATTERN #5: OVERFLOW Problem: • There is a 1-N relationship • N can be embedded or referenced, except for a few outliers • The list of references may not even fit into an array • You don’t want the outliers to drive your overall design Use cases: • Some very popular people with a huge list of followers • Movie with a ton of actors
  • 34. #MDBW17 SUMMARY: OVERFLOW Solution: • Have a field marking a document as an outlier • Do different queries for the outliers Benefits: • The design is not driven by a few outliers. However, you will need to handle the outliers on the application side
  • 35. #MDBW17 OTHER PATTERNS • Bucket • Pre-allocation • Tree(s)
  • 37. #MDBW17 TAKE AWAYS • Simple grouping from tables to collections is not optimal • Learn a common vocabulary for designing schemas with MongoDB • Use patterns as "plug-and-play" for your future designs ‒ Attribute ‒ Subset ‒ Computed ‒ Approximation ‒ Overflow
  • 38. #MDBW17 REFERENCES FOR COMPLETE SOLUTIONS A full design example for a given problem: • E-commerce site • Content Management System • Social Networking • Single view • …
  • 39. #MDBW17 HOW CAN I LEARN MORE ABOUT SCHEMA DESIGN? • More patterns in the published form of this presentation • MongoDB in-person training courses on Schema Design • Upcoming Online course at MongoDB University: ‒ https://university.mongodb.com ‒ M220 Data Modeling
  • 40. #MDBW17 THANK YOU FOR USING MONGODB! daniel.coupal@mongodb.com
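
The Attribute pattern from slides 11 to 14 can be illustrated with a small reshaping step. This is only a sketch under stated assumptions: the helper name `toAttributePattern` is hypothetical, and the `releases`, `location` and `date` field names are taken from the index shown on slide 12. It moves every `release_*` field of the "Moonlight" document into an array of field pairs, so a single compound index can serve all locations.

```javascript
// Hypothetical helper (not from the talk): turn release_* fields into
// an array of { location, date } pairs, as in the Attribute pattern.
function toAttributePattern(movie) {
  const releases = [];
  const rest = {};
  for (const [key, value] of Object.entries(movie)) {
    if (key.startsWith("release_")) {
      // "release_USA" becomes { location: "USA", date: ... }
      releases.push({ location: key.slice("release_".length), date: value });
    } else {
      rest[key] = value;
    }
  }
  return { ...rest, releases };
}

// Document from slide 11, with the elided fields left out.
const moonlight = {
  title: "Moonlight",
  release_USA: "2016/09/02",
  release_Mexico: "2017/01/27",
  release_France: "2017/02/01",
  release_Festival_Mill_Valley: "2017/10/10",
};

const reshaped = toAttributePattern(moonlight);
console.log(JSON.stringify(reshaped, null, 2));

// With the reshaped documents, one index covers every location:
//   { "releases.location": 1, "releases.date": 1 }
// and a shell query could look like this (assuming a "movies" collection):
//   db.movies.find({ releases: { $elemMatch: { location: "USA", date: "2016/09/02" } } })
```

Adding a new festival or country then means pushing one more element into `releases` rather than creating a new field and a new index.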

Editor's Notes

  1. Welcome [Remember] Beware of transitions, keep them smooth [TODOs] Add the page numbers Drawing of a working set Consider removing ":" in the slide titles Consider changing "revenues" => revenue, in few slides More on the value and use cases for each pattern
  2. Previous Jobs, Order of likes, =>Gang of Four I like Food, Beers and Movies … and MongoDB. My boss probably hopes it is not in this order. My inspiration for this talk comes from the "Gang of Four". How many of you are familiar with the "Gang of Four"?
  3. Building blocks, Some patterns, => Same for MongoDB Basically the ones who wrote this book on "Design Patterns" GOF are Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides https://en.wikipedia.org/wiki/Design_Patterns Key words are "Elements of Reusable Software" Assemble their experience on designing and implementing software over the years They found that a lot of the solutions were sharing some "patterns" Examples of patterns from "Design Patterns" Types: Creational (5), Structural (7), Behavioral (11) Singleton (restrict the creation to a single object for a given class) Observer (number of objects to see an event) Command (user operation) Decorator (embellishing a UI element) Memento (ability to restore an object to a previous state) … So, they went and made a catalog of those "patterns". The idea is to enable people who write software to share a common language and have building blocks for solutions.
  4. 10 Years, Vocabulary, Building Blocks, "Art", => Example We use that content in our internal trainings; however, it is the first time we are presenting it at a conference, well… including the "data modeling" workshop we ran yesterday. The goal is not to teach you about doing schema design. I am expecting you to have done some either with MongoDB or with a Relational Database. My goal is to help you formalize the process of creating schemas for MongoDB, and to help you work as a team by sharing visuals and vocabulary.
  5. Entities In order to illustrate this talk, let's assume there is a site called the "World Movie Database". This site is so popular that everyone goes there on Thursdays before the release of new movies and it crashes the site. Then some people tried to migrate the site to a NoSQL database, MongoDB obviously.
  6. Collections, grouping not optimal, =>accept challenge This is the first attempt at moving the schema from Relational to MongoDB. There are 3 collections: movies, moviegoers and screenings. Simply grouping entities into collections is not optimal. The solution using this design did not perform much better than the previous one. This is still normalized. When you remove this restriction, duplication is fine, 1-1 relationships are fine. You open the door to some important transformations. Those will be our patterns. [NOTE] Use "Sync Visibility" once you activate the color layer to also see it in the PNG file.
  7. Perform & Scale, without training, disavow Our goal, no need to say, is to fix this website before it gets the same fate as this tape recorder.
  8. Performance & scalability, "air" Before we get going, let's just answer why we create models. In a perfect world, you don't really have to model. I mean if everything is super fast and resources are abundant, you really don't care where and how data is stored Every day I get up I don't make plans on how I will breathe air. However if you go to space or under water, you will need a "design" that will let you get the amount of air you need.
  9. Design is optional, cost of developer, 5 or 10 shards? If performance is not an issue, meaning you have resources to spare, then you are likely to model for simplicity. The reason is that software engineers are very expensive. You may not think so, but your manager does. If you need to shard the database, it is likely that performance is very important. Why use 10 shards, if you can reduce the number of operations (reads and writes) by half and do the same with 5 shards?
  10. GoF, top 5 patterns in order, We will use patterns, like the Gang of Four. Most patterns can be grouped in 3 categories. We will cover those patterns identified with check marks in this presentation. Also, I will cover the patterns in order of importance, or so. For the other ones, I will refer you to the slides of this presentation and subsequent content we will have on the subject.
  11. How do I search on movies being released on a given date in the USA? The same would apply to products you could see on E-commerce site. For example, clothes may have a size that is expressed as S, M, L, while for some other products like a laptop, size would be something like 13", 15"
  12. If you noticed from my personal info, I did use that pattern. That allowed me to list my jobs at MongoDB and associate them with a given date.
  13. Inventory of things to insure Polymorphic entities Vehicles: submarine, car
  14. "Adding a qualifier on the attribute" may be "currency"
  15. Working set, imagine no more RAM With everyone pounding on the WMDB site, it was observed that the working set does not fit in memory. What can you do? Looking at the design we see that we are putting all the actors and all reviews for a given movie in the main docume [TODO] Add a drawing showing what the working set is
  16. We are running out of metals, so we cannot produce more RAM and have to make do with the technology that was available in 2017. That said, this is an interesting thought: as RAM becomes cheaper and bigger, it will have an impact on working sets, and on your designs in general. Your design today may not be optimal in 5 years. So using MongoDB, which lets you change your schema easily, is a terrific advantage.
  17. The collection "castandcrew" contains all the actors, but also the producers, costume makers, stunts, etc. For this pattern to be worth it, it has to have a fair amount of information left aside.
  18. Top level information for a first page If this is slow, you may not keep your users on the site You want them to validate that this is what they want, then dig for more if needed
  19. Let's take a pause there. Don't go get popcorn, not yet, this is just an intermission from our pattern list. [TODO] make this "intermission" more appealing
  20. Let’s pause from our pattern list, and let’s examine a characteristic or aspect of some patterns.
  21. As you may guess, people pay attention to the popularity of the movies. So, metrics like "revenues" and "viewers" are really important. In the current design, those numbers are calculated every time the page of a movie is displayed. Let’s calculate those numbers once in a while and stick the results on the page instead.
  22. Also refer to "Rolled up" as CQRS - Command Query Responsibility Segregation According to Bryan, that sounds good at a Party.
  23. Another thing that was observed with the current design is that trying to keep track of all page views of the site resulted in very poor performance. That was seen for both MMAPv1 and WT. In MMAPv1, you get a lot of threads waiting for the write lock. With WT, you get a lot of write conflicts that need to be retried. One solution is to record "good enough" numbers. No one cares whether the count is 100 million or 100 million and a few. What is the tolerance level here? Let’s assume 1000. In this case, we will let the application update the page views by 1000, but only 1/1000th of the time. Statistically, we should get a result very close to the exact count while doing only 1/1000th of the writes. To draw a parallel with a movie: we never see a movie as a continuous image. The movie is made by displaying 24 static images per second, yet this is enough for our eyes not to see the discontinuities. How do you do that? Let’s have the application run a (X mod 1000) operation, where X is a random number. If the result is 0, let’s update the counter by 1000.
  24. You can have a counter. Once you reach the count, you do the write. Or you can use a random generator and when you get a specific value, you do the write. As you guess, this simple pattern is also applicable to Relational databases. … it is just that NoSQL people have more tricks to handle performance bottlenecks.
  25. Let's assume that people could pick the movie to be an extra on. Some movies are so popular that thousands and thousands would pick those movies to be an extra on. If you have one a year that breaks your design, you may not want to have this outlier drive your solution.
  26. On top of the "bucket" pattern, we have this concept of "handling the special situation"
  27. - A few million references would not even fit into an embedded array. And if they did, you would not want to construct a query by passing a million values to the $in operator.
  28. We touched a little bit on the bucket pattern when we looked at the outlier one. The bucket pattern lets you group X sub-documents into one document. When the bucket is full, you create another one. Pre-allocation is the case where you pre-create an array of cells so that reads and writes can easily access the elements. This is a very important pattern if you are using MMAPv1, as continuously growing an array can have a negative effect. With WiredTiger it is not as crucial, but it may make the code in the application simpler. Trees are commonly represented by having one node per document, where you can list the parent, the children, the ancestors, or a combination of those.
  29. [TODO] I need another title! Let’s wrap up what we covered. We did use a fictional site, however all the patterns we used would also apply to "Internet of Things", "Single View", "E-commerce" solutions.
  30. 10 years, future data big or not square, becoming an expert MongoDB celebrates 10 years … very soon. We are able to identify patterns because we have seen a lot of models with MongoDB over those first 10 years. Those are "plug-and-play" elements that let you go faster in your designs. We do believe MongoDB has a bright future. Most data that could be put in a Relational Database is already there. We are left with: Data that is "not square", meaning it does not fit well in square tables. Large datasets. We believe the document model and the scalability of MongoDB are prime to store that data. Ensure you are ready for the future by becoming an expert on MongoDB and how to model for it.
  31. My goal was to introduce you to patterns; however, if you want more complete solutions to common problems, there are a few good books out there. Let me point you to these 2: The Little Mongo DB Schema Design Book Paperback, by Christian Kvalheim MongoDB Applied Design Patterns, by Rick Copeland
  32. I am leaving you with where you can find more information about schema design M220 is likely to be available in Q4 2017
  33. Thank you for attending my presentation, and this conference, but above all: Thank you for using MongoDB!
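
The Approximation pattern described in notes 23 and 24 (write only 1/X of the time, but increment by X) can be sketched in a few lines. This is a minimal illustration, not the speaker's implementation: `makeApproximateCounter` is a hypothetical name, and the in-memory `stored` variable merely stands in for a counter field in the database.

```javascript
// Hypothetical sketch of the Approximation pattern: on each hit, write
// with probability 1/step, and when writing, increment by step.
function makeApproximateCounter(step, random = Math.random) {
  let stored = 0; // stands in for the counter field stored in the database
  let writes = 0; // how many (simulated) database writes were issued
  return {
    hit() {
      // Same idea as "X mod step === 0" for a random integer X.
      if (Math.floor(random() * step) === 0) {
        // In MongoDB this could be a single $inc update, for example
        //   db.movies.updateOne({ _id: id }, { $inc: { page_views: step } })
        // where "page_views" is an assumed field name.
        stored += step;
        writes += 1;
      }
    },
    get value() {
      return stored;
    },
    get writes() {
      return writes;
    },
  };
}

// One million page views with a tolerance of 1000.
const counter = makeApproximateCounter(1000);
for (let n = 0; n < 1000000; n++) {
  counter.hit();
}
console.log("approximate count:", counter.value, "writes:", counter.writes);
```

The printed values vary from run to run because the writes are random; on average the count tracks the true number of hits while only about one hit in a thousand results in a write. The counter-based variant mentioned in note 24 would replace the random test with a local tally that triggers a write each time it reaches `step`.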