SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Ger Hartnett
Director of Technical Services (EMEA), MongoDB @ghartnett #MongoDB
Tales from the Field
Part two: Fixing Sub-optimal Performance in
a Retail Application
Or:
●Cautionary Tales
●Don’t solve the wrong problems
●Bad schemas & shard keys hurt ops
too
●The main talk should take 30-35 minutes
●You can submit questions via the chat box
●We’ll answer as many as possible at the end
●We will send the slides and recording
tomorrow via email
●The final webinar in the series will take place
on Thursday 21rd April – 14:00 BST | 15:00
CEST
Before we start
●You work in operations
●You work in development
●You have a MongoDB system in production
●You have contacted MongoDB Technical
Services (support)
●You attended the last webinar (part1)
A quick poll - add a word to the
chat to let me know your
perspective
●We collect - observations about common
mistakes - to share the experience of many
●Names have been changed to protect the
(mostly) innocent
●No animals were harmed during the making
of this presentation (but maybe some DBAs
and engineers had light emotional scarring)
●While you might be new to MongoDB we
have deep experience that you can leverage
Stories
1. Discovering a DR flaw during a data
centre outage
2. Complex documents, memory and
an upgrade “surprise”
3. Wild success “uncovers” the wrong
shard key
The Stories (part two today)
Story #1: Quick Review
Story #1: Recovering from a
disaster
●Prospect in the process of signing up for a
subscription
●Called us late on Friday, data centre power
outage and 30+ (11 shards) servers down
●When they started bringing up the first
shard, the nodes crashed with data
corruption
●17TB of data, very little free disk space,
JOURNALLING DISABLED!
Recovering each shard
1.Start secondary
read only
2.Mount NFS
storage for repair
3.Repair former
primary node
4.Iterative rsync to
seed a secondary
Secondary
Primary
Secondary
Key takeaways for you
●If you are departing significantly from
standard config, check with us (i.e. if you
think journalling is a bad idea)
●Two DC in different buildings on different
flood plains, not in the path of the same
storm (i.e. secondaries in AWS)
●DR/backups are useless if you haven’t
tested them
Story #2: Complex documents,
memory and an upgrade
“surprise”
●Well established ecommerce site selling
diverse goods in 20+ countries
●After switching to wired tiger in production,
performance dropped, this is the opposite of
what they were expecting
{
_id: 375
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... >
inventory: 423
}
Product Catalog: Original
Schema
What’s good about this schema?
● Each document contains all the data about a given
product, across all languages/locales
● Very efficient way to retrieve the English, French,
German, etc. translations of a single product’s
information in one query
However……
That is not how the product data is
actually used
(except perhaps by translation staff)
db
db.catalog.update( { _id : 375 }, { $inc: { count: -1 } } )
db.catalog.find( { _id : 375 } , { en_US : true } );
db.catalog.find( { _id : 375 } , { fr_FR : true } );
db.catalog.find( { _id : 375 } , { de_DE : true } );
... and so forth for other locales ...
Dominant Query Patterns
Which means……
The Product Catalog’s data model
did not fit the way the data was
accessed.
Consequences
●WiredTiger reads/rewrites the whole document
●Each document contained ~20x more data than
any common use case needed
●MongoDB lets you request just a subset of a
document’s contents (using a projection), but…
o Typically whole document loaded into RAM
●There are other overheads (like readahead)
{ _id: 42,
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... > }
<READAHEAD OVERHEAD>
{ _id: 709,
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... > }
<READAHEAD OVERHEAD>
{ _id: 3600,
en_US : { name : ..., description : ..., <etc...> },
en_GB : { name : ..., description : ..., <etc...> },
fr_FR : { name : ..., description : ..., <etc...> },
de_DE : ...,
de_CH : ...,
<... and so on for other locales... > }
Visualising the read problem
- Data in RED are loaded into RAM
and used.
- Data in BLUE take up memory but
are not required.
- Readahead padding in GREEN
makes things even more inefficient
More RAM? It’s not that simple
What did we recommend?
● Design for your use case, your most common query
pattern
o In this case: 99.99% of queries want the product
data for exactly one locale at a time
o Move the frequently changing fields to a new
collection
● Eliminate inefficiencies on the system
o Make reading from disk less wasteful, maximise I/O
capabilities by reducing readahead
{ _id: "375-en_US",
name : ..., description : ..., <etc...> }
{ _id: "375-en_GB",
name : ..., description : ..., <etc...> }
{ _id: "375-fr_FR",
name : ..., description : ..., <etc...> }
... and so on for other locales ...
db.inventory
{ _id: "375", count : NumberLong(1234), <etc...> }
Product Catalog: Eventual
Schema
Aftermath & lessons learned
●Faster updates
●Queries induced minimal overhead
●Greater than 20x distinct products fit in
memory at once
●Disk I/O utilization reduced
●UI latency decreased
Key Takeaways
●When doing a major version/storage-engine
upgrade, test in staging with some
proportion of production data/workload
●Sometimes putting everything into one
document is counter productive
Story #3: Quick Preview
2 More Shards….
Story #3: Wild success uncovers
the wrong shard key
●Started out as error “[Balancer] caught
exception … tag ranges not valid for: db.coll”
●11 shards, they had added 2 new shards to
keep up traffic - 400+ databases
●Lots of code changes ahead of the
Superbowl
●Spotted slow 300+s queries, decided to build
some indexes without telling us
●Production went down
Further Reading
Production notes
docs.mongodb.org/manual/administration/production-notes
Mtools
github.com/rueckstiess/mtools
Ger Hartnett
Director Technical Services (EMEA), MongoDB
@ghartnett #MongoDB
Questions?
●You can submit questions via the chat box
●We are recording and will send slides
tomorrow
●We will send the slides and recording
tomorrow via email
●Part 3: the next webinar will take place on
Thursday 21st April – 14:00 BST | 15:00
CEST
www.mongodb.com/webinars
Questions
Code GerHartnett gets 25% discount

Weitere ähnliche Inhalte

Was ist angesagt?

Hardware Provisioning for MongoDB
Hardware Provisioning for MongoDBHardware Provisioning for MongoDB
Hardware Provisioning for MongoDB
MongoDB
 

Was ist angesagt? (20)

Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the CloudWebinar: Deploying MongoDB to Production in Data Centers and the Cloud
Webinar: Deploying MongoDB to Production in Data Centers and the Cloud
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
Webinar: Keep Calm and Scale Out - A proactive guide to Monitoring MongoDB
Webinar: Keep Calm and Scale Out - A proactive guide to Monitoring MongoDBWebinar: Keep Calm and Scale Out - A proactive guide to Monitoring MongoDB
Webinar: Keep Calm and Scale Out - A proactive guide to Monitoring MongoDB
 
Tales from production with postgreSQL at scale
Tales from production with postgreSQL at scaleTales from production with postgreSQL at scale
Tales from production with postgreSQL at scale
 
Performance Tuning and Optimization
Performance Tuning and OptimizationPerformance Tuning and Optimization
Performance Tuning and Optimization
 
Mongo db multidc_webinar
Mongo db multidc_webinarMongo db multidc_webinar
Mongo db multidc_webinar
 
MongoDB Days Silicon Valley: Best Practices for Upgrading to MongoDB
MongoDB Days Silicon Valley: Best Practices for Upgrading to MongoDBMongoDB Days Silicon Valley: Best Practices for Upgrading to MongoDB
MongoDB Days Silicon Valley: Best Practices for Upgrading to MongoDB
 
Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best Practices
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
Development to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersDevelopment to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB Clusters
 
Hardware Provisioning for MongoDB
Hardware Provisioning for MongoDBHardware Provisioning for MongoDB
Hardware Provisioning for MongoDB
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDBBreaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
Breaking the Oracle Tie; High Performance OLTP and Analytics Using MongoDB
 
Time series databases
Time series databasesTime series databases
Time series databases
 
Common Cluster Configuration Pitfalls
Common Cluster Configuration PitfallsCommon Cluster Configuration Pitfalls
Common Cluster Configuration Pitfalls
 
Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters
 
MongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB Aggregation Performance
MongoDB Aggregation Performance
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101
 

Ähnlich wie Webinar: Avoiding Sub-optimal Performance in your Retail Application

Tales from the Field
Tales from the FieldTales from the Field
Tales from the Field
MongoDB
 
Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011
bostonrb
 

Ähnlich wie Webinar: Avoiding Sub-optimal Performance in your Retail Application (20)

Tales from the Field
Tales from the FieldTales from the Field
Tales from the Field
 
MongoDB Days UK: Tales from the Field
MongoDB Days UK: Tales from the FieldMongoDB Days UK: Tales from the Field
MongoDB Days UK: Tales from the Field
 
Understanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLUnderstanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQL
 
Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011
 
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
 
WiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-TreeWiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-Tree
 
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
 
Etl confessions pg conf us 2017
Etl confessions   pg conf us 2017Etl confessions   pg conf us 2017
Etl confessions pg conf us 2017
 
PyGrunn2013 High Performance Web Applications with TurboGears
PyGrunn2013  High Performance Web Applications with TurboGearsPyGrunn2013  High Performance Web Applications with TurboGears
PyGrunn2013 High Performance Web Applications with TurboGears
 
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
 
Entenda de onde vem toda a potência do Intel® Xeon Phi™
Entenda de onde vem toda a potência do Intel® Xeon Phi™ Entenda de onde vem toda a potência do Intel® Xeon Phi™
Entenda de onde vem toda a potência do Intel® Xeon Phi™
 
Handout3o
Handout3oHandout3o
Handout3o
 
Mongodb
MongodbMongodb
Mongodb
 
MySQL Cluster Asynchronous replication (2014)
MySQL Cluster Asynchronous replication (2014) MySQL Cluster Asynchronous replication (2014)
MySQL Cluster Asynchronous replication (2014)
 
A fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFsA fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFs
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
Lcna tutorial-2012
Lcna tutorial-2012Lcna tutorial-2012
Lcna tutorial-2012
 
Lcna 2012-tutorial
Lcna 2012-tutorialLcna 2012-tutorial
Lcna 2012-tutorial
 
PostgreSQL Sharding and HA: Theory and Practice (PGConf.ASIA 2017)
PostgreSQL Sharding and HA: Theory and Practice (PGConf.ASIA 2017)PostgreSQL Sharding and HA: Theory and Practice (PGConf.ASIA 2017)
PostgreSQL Sharding and HA: Theory and Practice (PGConf.ASIA 2017)
 

Mehr von MongoDB

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Kürzlich hochgeladen (20)

Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 

Webinar: Avoiding Sub-optimal Performance in your Retail Application

  • 1. Ger Hartnett Director of Technical Services (EMEA), MongoDB @ghartnett #MongoDB Tales from the Field Part two: Fixing Sub-optimal Performance in a Retail Application
  • 2. Or: ●Cautionary Tales ●Don’t solve the wrong problems ●Bad schemas & shard keys hurt ops too
  • 3. ●The main talk should take 30-35 minutes ●You can submit questions via the chat box ●We’ll answer as many as possible at the end ●We will send the slides and recording tomorrow via email ●The final webinar in the series will take place on Thursday 21rd April – 14:00 BST | 15:00 CEST Before we start
  • 4. ●You work in operations ●You work in development ●You have a MongoDB system in production ●You have contacted MongoDB Technical Services (support) ●You attended the last webinar (part1) A quick poll - add a word to the chat to let me know your perspective
  • 5. ●We collect - observations about common mistakes - to share the experience of many ●Names have been changed to protect the (mostly) innocent ●No animals were harmed during the making of this presentation (but maybe some DBAs and engineers had light emotional scarring) ●While you might be new to MongoDB we have deep experience that you can leverage Stories
  • 6. 1. Discovering a DR flaw during a data centre outage 2. Complex documents, memory and an upgrade “surprise” 3. Wild success “uncovers” the wrong shard key The Stories (part two today)
  • 8. Story #1: Recovering from a disaster ●Prospect in the process of signing up for a subscription ●Called us late on Friday, data centre power outage and 30+ (11 shards) servers down ●When they started bringing up the first shard, the nodes crashed with data corruption ●17TB of data, very little free disk space, JOURNALLING DISABLED!
  • 9. Recovering each shard 1.Start secondary read only 2.Mount NFS storage for repair 3.Repair former primary node 4.Iterative rsync to seed a secondary Secondary Primary Secondary
  • 10. Key takeaways for you ●If you are departing significantly from standard config, check with us (i.e. if you think journalling is a bad idea) ●Two DC in different buildings on different flood plains, not in the path of the same storm (i.e. secondaries in AWS) ●DR/backups are useless if you haven’t tested them
  • 11. Story #2: Complex documents, memory and an upgrade “surprise” ●Well established ecommerce site selling diverse goods in 20+ countries ●After switching to wired tiger in production, performance dropped, this is the opposite of what they were expecting
  • 12. { _id: 375 en_US : { name : ..., description : ..., <etc...> }, en_GB : { name : ..., description : ..., <etc...> }, fr_FR : { name : ..., description : ..., <etc...> }, de_DE : ..., de_CH : ..., <... and so on for other locales... > inventory: 423 } Product Catalog: Original Schema
  • 13. What’s good about this schema? ● Each document contains all the data about a given product, across all languages/locales ● Very efficient way to retrieve the English, French, German, etc. translations of a single product’s information in one query
  • 14. However…… That is not how the product data is actually used (except perhaps by translation staff)
  • 15. db db.catalog.update( { _id : 375 }, { $inc: { count: -1 } } ) db.catalog.find( { _id : 375 } , { en_US : true } ); db.catalog.find( { _id : 375 } , { fr_FR : true } ); db.catalog.find( { _id : 375 } , { de_DE : true } ); ... and so forth for other locales ... Dominant Query Patterns
  • 16. Which means…… The Product Catalog’s data model did not fit the way the data was accessed.
  • 17. Consequences ●WiredTiger reads/rewrites the whole document ●Each document contained ~20x more data than any common use case needed ●MongoDB lets you request just a subset of a document’s contents (using a projection), but… o Typically whole document loaded into RAM ●There are other overheads (like readahead)
  • 18. { _id: 42, en_US : { name : ..., description : ..., <etc...> }, en_GB : { name : ..., description : ..., <etc...> }, fr_FR : { name : ..., description : ..., <etc...> }, de_DE : ..., de_CH : ..., <... and so on for other locales... > } <READAHEAD OVERHEAD> { _id: 709, en_US : { name : ..., description : ..., <etc...> }, en_GB : { name : ..., description : ..., <etc...> }, fr_FR : { name : ..., description : ..., <etc...> }, de_DE : ..., de_CH : ..., <... and so on for other locales... > } <READAHEAD OVERHEAD> { _id: 3600, en_US : { name : ..., description : ..., <etc...> }, en_GB : { name : ..., description : ..., <etc...> }, fr_FR : { name : ..., description : ..., <etc...> }, de_DE : ..., de_CH : ..., <... and so on for other locales... > } Visualising the read problem - Data in RED are loaded into RAM and used. - Data in BLUE take up memory but are not required. - Readahead padding in GREEN makes things even more inefficient
  • 19. More RAM? It’s not that simple
  • 20. What did we recommend? ● Design for your use case, your most common query pattern o In this case: 99.99% of queries want the product data for exactly one locale at a time o Move the frequently changing fields to a new collection ● Eliminate inefficiencies on the system o Make reading from disk less wasteful, maximise I/O capabilities by reducing readahead
  • 21. { _id: "375-en_US", name : ..., description : ..., <etc...> } { _id: "375-en_GB", name : ..., description : ..., <etc...> } { _id: "375-fr_FR", name : ..., description : ..., <etc...> } ... and so on for other locales ... db.inventory { _id: "375", count : NumberLong(1234), <etc...> } Product Catalog: Eventual Schema
  • 22. Aftermath & lessons learned ●Faster updates ●Queries induced minimal overhead ●Greater than 20x distinct products fit in memory at once ●Disk I/O utilization reduced ●UI latency decreased
  • 23. Key Takeaways ●When doing a major version/storage-engine upgrade, test in staging with some proportion of production data/workload ●Sometimes putting everything into one document is counter productive
  • 24. Story #3: Quick Preview 2 More Shards….
  • 25. Story #3: Wild success uncovers the wrong shard key ●Started out as error “[Balancer] caught exception … tag ranges not valid for: db.coll” ●11 shards, they had added 2 new shards to keep up traffic - 400+ databases ●Lots of code changes ahead of the Superbowl ●Spotted slow 300+s queries, decided to build some indexes without telling us ●Production went down
  • 27. Ger Hartnett Director Technical Services (EMEA), MongoDB @ghartnett #MongoDB Questions?
  • 28. ●You can submit questions via the chat box ●We are recording and will send slides tomorrow ●We will send the slides and recording tomorrow via email ●Part 3: the next webinar will take place on Thursday 21st April – 14:00 BST | 15:00 CEST www.mongodb.com/webinars Questions
  • 29. Code GerHartnett gets 25% discount

Hinweis der Redaktion

  1. Field/Trenches
  2. What do the rest of you do?
  3. What do the rest of you do?
  4. Some borrowed, some merged into a single narrative Some of the people that inspired them may well be here in this room today
  5. Bill's Bulk Updates randomly affected an ever larger data set. In order to cope with the database size, Bill added more shards. The cluster scaled linearly, as intended.
  6. Well, it might fix things, but it’s expensive and the real problem is the efficiency
  7. Bill's Bulk Updates randomly affected an ever larger data set. In order to cope with the database size, Bill added more shards. The cluster scaled linearly, as intended.
  8. Field/Trenches
  9. What do the rest of you do?