mongo @ ex.fm
 Lucas Hrabovsky
      CTO
   #MongoPGH
ex.fm turns websites into CDs
browser extensions
_id and indexes
• Bad Ideas
  – ObjectId("4fb284…")
  – Big Compound Indexes
  – Long,VariableWidthStringsMissIndexes
• Good Ideas
  – Make _id mean something
  – Fixed Width Hashes
  – Use _id as a compound index
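A minimal sketch of a "meaningful" _id, assuming the fixed-width prefix is an MD5 hex digest of a site URL and the suffix is a Unix timestamp. The deck does not say which hash ex.fm used, and `make_id` is a hypothetical helper, not part of any library.

```python
import hashlib


def make_id(site_url: str, unix_ts: int) -> str:
    """Build a fixed-width, sortable _id: <32-char hex hash>-<10-digit timestamp>."""
    # MD5 is an assumption; any fixed-width digest works as the prefix.
    site_hash = hashlib.md5(site_url.encode("utf-8")).hexdigest()
    # Zero-pad the timestamp so string order matches numeric order.
    return f"{site_hash}-{unix_ts:010d}"


older = make_id("http://example.com/", 1337029112)
newer = make_id("http://example.com/", 1337029114)
print(older)
print(newer)
```

Because both parts are fixed width, every key for one site shares a prefix and sorts by time, so the default _id index can stand in for a separate compound index on (site, created).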
activity feeds: first attempt
{"_id": "201109122304-lucas-dan-c7dede43…",
 "username": "lucas", "created": 201109122304,
 "actor": "dan", "verb": "love"}


db.user.feed.find({'username': 'lucas', 'verb': 'love'})
  .sort({'created': -1})



Working just fine for 4MM documents, but getting slow…
new version of activity feeds
{"_id": "201109122304-lucas-dan-c7dede43…",
 "uid": "lucas-201109122304",
 "vid": "lucas-love-201109122304", "actor": "dan"}


db.user.feed.find({'vid': /^lucas-/})
  .sort({'vid': -1})

Fast for all 3 use cases!
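Why an anchored prefix regex on `vid` can be fast: MongoDB can turn a case-sensitive `^prefix` pattern into a range scan over the index on that field. The sketch below imitates that with a sorted Python list and two binary searches. It is an illustration of the idea, not MongoDB's implementation; the sample keys, the `share` verb, and `prefix_scan` are made up for the example.

```python
from bisect import bisect_left


def prefix_scan(sorted_keys, prefix, descending=True):
    """Return keys starting with `prefix` via two binary searches, like an index range scan."""
    lo = bisect_left(sorted_keys, prefix)
    # "\uffff" sorts after any character expected in a key (assumption: ASCII keys).
    hi = bisect_left(sorted_keys, prefix + "\uffff")
    hits = sorted_keys[lo:hi]
    return hits[::-1] if descending else hits


# Keys follow the deck's vid layout: <username>-<verb>-<timestamp>.
vids = sorted([
    "dan-love-201109122250",
    "lucas-love-201109122304",
    "lucas-love-201109122310",
    "lucas-share-201109122301",
    "lucasfan-love-201109122259",
])

print(prefix_scan(vids, "lucas-"))       # everything for one user
print(prefix_scan(vids, "lucas-love-"))  # one user, one verb, newest first
```

Including the trailing `-` in the prefix keeps `lucasfan` out of the results for `lucas`. One caveat of this layout: usernames that themselves contain `-` would need escaping or a different separator.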
removing indexes pays off




Don't need to buy more/bigger machines!
sites! sites! sites!
padding factor
•   Variable document size
•   Allocate for the latest and fattest
•   Document moves
•   Can be very inefficient
•   More RAM!
•   Pre-allocate to prevent moves
unbounded embedded lists
•   Useful for followers, favorites
•   Good for a few things, bad for lots
•   Constantly bumping up padding factor
•   Lots of document moves
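A toy model of the trade-off on the last two slides, assuming a record is allocated at `size * padding_factor` and must be moved and reallocated whenever the document outgrows that space. The byte counts and the `simulate_growth` helper are invented for illustration; the real storage engine's bookkeeping is more involved.

```python
def simulate_growth(appends: int, item_bytes: int, padding_factor: float,
                    base_bytes: int = 100):
    """Return (moves, unused_bytes) after appending items to one embedded list."""
    size = base_bytes
    allocated = int(size * padding_factor)
    moves = 0
    for _ in range(appends):
        size += item_bytes
        if size > allocated:
            # Document no longer fits in place: move it to a bigger record.
            moves += 1
            allocated = int(size * padding_factor)
    return moves, allocated - size


print(simulate_growth(100, 20, padding_factor=1.0))
print(simulate_growth(100, 20, padding_factor=2.0))
```

In this model a low padding factor means a move on nearly every append, while a high one cuts the moves but leaves a large share of each record allocated and empty, which is the "More RAM!" cost when most documents never grow.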
a metaphor
•   You run a coffee shop and can buy only one size of cup. Which size do you buy?
•   On average, each customer has only one cup
•   Heavy drinkers have hundreds of cups




credit: Macintex macintex.deviantart.com
bucketing!
•   Split list across multiple documents
•   Median number of items = bucket size
•   Pre-allocate
•   Easy seeking and traversal
•   Much faster
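The bullets above can be sketched as follows, assuming buckets are plain documents whose _id is the owner key plus a 1-based bucket number, and that "pre-allocate" means padding each bucket to full size with placeholder values. The field names `count` and `items` and all three helpers are hypothetical.

```python
from statistics import median


def bucket_size_from(list_lengths):
    """Pick the bucket size as the median list length, per the slide."""
    return max(1, int(median(list_lengths)))


def to_buckets(owner_id, items, bucket_size):
    """Split one long list into fixed-size bucket documents, padded with None."""
    buckets = []
    for start in range(0, len(items), bucket_size):
        chunk = items[start:start + bucket_size]
        padded = chunk + [None] * (bucket_size - len(chunk))
        buckets.append({
            "_id": f"{owner_id}-{start // bucket_size + 1}",
            "count": len(chunk),   # how many slots hold real data
            "items": padded,       # always bucket_size long, so it never grows
        })
    return buckets


def locate(index, bucket_size):
    """Seek: map a list position to (bucket number, offset inside the bucket)."""
    return index // bucket_size + 1, index % bucket_size


size = bucket_size_from([1, 1, 2, 3, 300])
docs = to_buckets("site2", list(range(5)), size)
print(size, [d["_id"] for d in docs], locate(4, size))
```

Since every bucket has the same fixed footprint, filling a slot rewrites a value in place instead of growing the document, and finding item n is simple arithmetic rather than a scan.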
hey charts!
[Chart: site.meta 1, site.meta 2, site.songs 1, and site.songs 2. Legend: allocated and unused; allocated and full of data.]
same charts when using bucketing
[Chart: site.meta 1 and site.meta 2, with site.songs 1 split into buckets 1-1 and 1-2, and site.songs 2 split into buckets 2-1 through 2-6. Legend: allocated and unused; allocated and full of data.]
doesn’t work for everything…
• Picking right bucket size
• Defragging
• Random insertion
  – Easy for things you don't much care about the order of
  – More difficult if you're going to insert and change the order later
micro documents
db.site.songs.find({_id: /^bfc25de08d964a8a41226c6016dd7753-/}).sort({_id: -1})

{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", "s" : 18436532 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", "s" : 18804590 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", "s" : 18804591 }
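A rough sketch of the micro-document layout shown above: one tiny document per list entry instead of one embedded list. It assumes `s` holds a song id and that the _id suffix is a Unix timestamp, which the slide suggests but does not state. `to_micro_docs` and `page` are hypothetical helpers, and the linear filter stands in for what the _id index would do on a real collection.

```python
def to_micro_docs(site_hash, entries):
    """entries: iterable of (unix_ts, song_id) pairs -> one small document each."""
    return [{"_id": f"{site_hash}-{ts:010d}", "s": song_id} for ts, song_id in entries]


def page(docs, site_hash, limit, before_id=None):
    """Newest-first page for one site; pass the last _id seen to get the next page."""
    prefix = site_hash + "-"
    hits = [
        d for d in docs
        if d["_id"].startswith(prefix) and (before_id is None or d["_id"] < before_id)
    ]
    hits.sort(key=lambda d: d["_id"], reverse=True)
    return hits[:limit]


SITE = "bfc25de08d964a8a41226c6016dd7753"
docs = to_micro_docs(SITE, [
    (1337029112, 18804591),
    (1337029113, 18804590),
    (1337029114, 18436532),
])

first = page(docs, SITE, limit=2)
second = page(docs, SITE, limit=2, before_id=first[-1]["_id"])
print([d["s"] for d in first], [d["s"] for d in second])
```

Each document is the same small size and is never updated, so inserts only append and nothing has to move; paging by "_id less than the last one seen" avoids skip offsets.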
paying it back
• Bent mongoengine to make this easy
• Follow github.com/exfm
• Also added tooling for
  – Trace all queries
  – Aggregate tracing by request middleware
  – Raise exceptions when queries miss an index
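One way the last item could look, as a sketch only: it inspects an explain document in the `queryPlanner.winningPlan` shape that newer MongoDB servers return and raises if any stage is a full collection scan. This is not ex.fm's tooling (which is not shown in the deck, and whose MongoDB version reported explain results differently); `MissedIndexError`, `plan_stages`, and `require_index` are hypothetical names.

```python
class MissedIndexError(RuntimeError):
    """Raised when a query plan falls back to a full collection scan."""


def plan_stages(plan):
    """Yield the stage name of a plan node and of every nested input stage."""
    yield plan.get("stage")
    children = list(plan.get("inputStages", []))
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    for child in children:
        yield from plan_stages(child)


def misses_index(explain_output) -> bool:
    winning = explain_output["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in set(plan_stages(winning))


def require_index(explain_output):
    if misses_index(explain_output):
        raise MissedIndexError("query would scan the whole collection")


# Hand-written sample explain documents (assumed shape, trimmed to the relevant keys).
indexed = {"queryPlanner": {"winningPlan": {
    "stage": "FETCH", "inputStage": {"stage": "IXSCAN", "indexName": "vid_1"}}}}
unindexed = {"queryPlanner": {"winningPlan": {
    "stage": "SORT", "inputStage": {"stage": "COLLSCAN"}}}}

require_index(indexed)  # passes silently
try:
    require_index(unindexed)
except MissedIndexError as exc:
    print("caught:", exc)
```

Wiring a check like this into development and test runs surfaces unindexed queries before they reach production-sized collections.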
thanks!

  lucas@ex.fm
github.com/exfm

Optimize MongoDB performance with micro documents and bucketing
