Charity Majors
@mipsytipsy
Thursday, June 20, 13
Managing a maturing MongoDB ecosystem
automating with chef
performance tuning
disaster recovery
chef.
Basic replica set
How do I chef that?
... grab the AWS and mongodb cookbooks,
create a site wrapper cookbook
make a role for your cluster,
launch some nodes,
initiate the replica set,
... and you’re done.
Adding snapshots
adding RAID for EBS volumes
this will bootstrap a new node for the cluster from snapshots
with this role ...
multiple clusters
distinct cluster name, backup host, backup volumes
sharding
assign a shard name per cluster, per role
treat them like ordinary replica sets
Arbiters
• Mongod processes that do nothing but vote
• Highly reliable
• To provision an arbiter, use the LWRP
• Easy to run multiple arbiters on a single host
arbiter LWRP
replica set with arbiters
run multiple arbiters on a single host:
Managing votes with arbiters
tuning and performance.
resources and provisioning
tuning your filesystem
snapshotting and warmups
fragmentation
Provisioning tips
• Memory is your primary scaling constraint
• Your working set must fit into memory
• in 2.4, estimate with:
• Page faults? Your working set may not fit
Disk options
• If you’re on Amazon:
    • EBS
    • Dedicated SSD
    • Provisioned IOPS
    • Ephemeral
• If not:
    • use SSDs!
EBS classic
EBS with PIOPS:
... just say no to EBS
SSD
(hi1.4xlarge)
• 8 cores
• 60 gigs RAM
• 2 1-TB SSD drives
• 120k random reads/sec
• 85k random writes/sec
• expensive! $2300/mo on demand
PIOPS
• Up to 2000 IOPS/volume
• Up to 1024 GB/volume
• Variability of < 0.1%
• Costs double regular EBS
• Supports snapshots
• RAID together multiple volumes for more storage/performance
Estimating PIOPS
• estimate how many IOPS to provision with the “tps” column of sar -d 1
• multiply that by 2-3x depending on your spikiness
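The estimate above is simple arithmetic, and a rough Python sketch may make it concrete. It assumes you have already collected a handful of per-second values from the "tps" column of sar -d 1. The function name, the choice of the average as the baseline, and the sample numbers are all illustrative assumptions, not something from the talk.

```python
import math
from statistics import mean


def estimate_piops(tps_samples, headroom=2.5):
    """Estimate how many IOPS to provision from observed tps samples.

    tps_samples: per-second values read off the "tps" column of `sar -d 1`.
    headroom: the 2-3x multiplier for spikiness; 2.5 is an arbitrary
    midpoint picked for this sketch.
    """
    if not tps_samples:
        raise ValueError("need at least one tps sample")
    if headroom < 1:
        raise ValueError("headroom should be at least 1")
    baseline = mean(tps_samples)  # assumption: use the average as the baseline
    return math.ceil(baseline * headroom)


# Made-up sample values, for illustration only.
samples = [310.0, 295.0, 420.0, 380.0, 335.0]
print(estimate_piops(samples, headroom=2.0))  # 696
print(estimate_piops(samples, headroom=3.0))  # 1044
```

A spikier workload argues for the higher multiplier. If the result exceeds what a single volume can deliver, that is where RAIDing several volumes together comes in.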
Ephemeral
Storage
• Cheap
• Fast
• No network latency
• No snapshot capability
• Data is lost forever if you stop or resize the instance
Filesystem and limits
• Raise file descriptor limits
• Raise connection limits
• Mount with noatime and nodiratime
• Consider putting the journal on a separate volume
Blockdev
• Your default blockdev readahead setting is probably wrong
• Too large? You will underuse memory
• Too small? You will hit the disk too much
• Experiment.
Snapshot best practices
• Set priority = 0
• Set hidden = 1
• Consider setting votes = 0
• Lock mongo or stop mongod before snapshot
• Consider running continuous compaction on snapshot node
Restoring from snapshot
• EBS snapshot will lazily-load blocks from S3
    • run “dd” on each of the data files to pull blocks down
• Always warm up a secondary before promoting
    • warm up both indexes and data
    • http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
    • in mongodb 2.2 and above you can use the touch command:
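For the "dd" step in particular, a rough Python equivalent is sketched here. It reads each data file front to back and throws the bytes away, which is the same effect as pointing dd at /dev/null. The function names and the temporary directory standing in for the dbpath are assumptions for illustration.

```python
import os
import tempfile


def pull_blocks(path, chunk_size=1024 * 1024):
    """Read a file front to back and discard the bytes.

    Touching every byte forces lazily loaded blocks to be fetched.
    Returns the number of bytes read.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def pull_all(data_dir):
    """Pull blocks for every regular file directly under data_dir."""
    results = {}
    for name in sorted(os.listdir(data_dir)):
        full = os.path.join(data_dir, name)
        if os.path.isfile(full):
            results[name] = pull_blocks(full)
    return results


# Demo against a temporary directory standing in for the real dbpath.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "example.0"), "wb") as f:
        f.write(b"\0" * 4096)
    print(pull_all(d))  # {'example.0': 4096}
```

This only pulls blocks down from the snapshot. Warming indexes and data into memory is a separate step, covered by the blog post linked above.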
Fragmentation
• Your RAM gets fragmented too!
• Leads to underuse of memory
• Deletes are not the only source of fragmentation
• Repair, compact, or resync regularly
3 ways to fix fragmentation:
• Re-sync a secondary from scratch
    • hard on your primary; use rs.syncFrom() to sync from a secondary instead
• Repair a secondary
    • can cause small discrepancies in your data
• Run continuous compaction on your snapshot node
    • won’t reset padding factors
    • not appropriate if you do lots of deletes
Fragmentation is terrible
Upgrade!
mongo is getting faster. :)
disasters and recovery.
Finding bad queries
• db.currentOp()
• mongodb.log
• profiling collection
db.currentOp()
• Check the queue size
• Any indexes building?
• Sort by secs_running
• Sort by numYields, lockType
• Consider adding comments to your queries
• Run explain() on queries that are long-running
mongodb.log
• Configure output with --slowms
• Look for high execution time, nscanned, ntoreturn
• See which queries are holding long locks
• Match connection ids to IPs
system.profile collection
• Enable profiling with db.setProfilingLevel()
• Does not persist through restarts
• Like mongodb.log, but queryable
• Writes to this collection incur some cost
• Use db.system.profile.find() to get slow queries for a certain collection, time range, execution time, etc
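As a minimal sketch of that last point, the filter for such a find() can be built from three fields that appear in system.profile documents: ns (namespace), millis (execution time), and ts (timestamp). The helper name and the "app" / "users" names are hypothetical.

```python
from datetime import datetime, timedelta


def slow_query_filter(db_name, collection, min_millis, start, end):
    """Build a filter document for db.system.profile.find().

    Matches operations on one collection that took at least min_millis
    and were recorded in the half-open time range [start, end).
    """
    return {
        "ns": f"{db_name}.{collection}",
        "millis": {"$gte": min_millis},
        "ts": {"$gte": start, "$lt": end},
    }


# Hypothetical database and collection names, for illustration only.
end = datetime(2013, 6, 20, 12, 0, 0)
start = end - timedelta(hours=1)
print(slow_query_filter("app", "users", 100, start, end))
```

The resulting document could be passed to find() on the system.profile collection from the shell or from a driver, typically sorted by millis descending to surface the worst offenders first.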
... when queries pile up ...
• Know what your tipping point looks like
• Don’t switch your primary or restart
• Do kill queries before the tipping point
• Write your kill script before you need it
• Don’t kill internal mongo operations, only queries.
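One way to prepare a kill script ahead of time is to separate the selection logic from the actual killing, so the selection can be tested offline. The sketch below assumes a dict shaped like db.currentOp() output (an "inprog" list whose entries carry opid, active, op, ns and secs_running). The function name, the 30-second threshold, and the sample data are assumptions.

```python
def kill_candidates(current_op, max_secs=30):
    """Return opids of long-running client queries from a currentOp-style dict.

    Anything that is not op == "query" is skipped, as is anything with an
    empty namespace or on the local database (for example oplog tailers),
    so internal operations are left alone.
    """
    opids = []
    for op in current_op.get("inprog", []):
        if not op.get("active"):
            continue
        if op.get("op") != "query":
            continue
        ns = op.get("ns", "")
        if ns == "" or ns.startswith("local."):
            continue
        if op.get("secs_running", 0) > max_secs:
            opids.append(op["opid"])
    return opids


# Made-up sample shaped like db.currentOp() output.
sample = {"inprog": [
    {"opid": 1, "active": True, "op": "query", "ns": "app.users", "secs_running": 95},
    {"opid": 2, "active": True, "op": "query", "ns": "local.oplog.rs", "secs_running": 4000},
    {"opid": 3, "active": True, "op": "update", "ns": "app.users", "secs_running": 120},
    {"opid": 4, "active": True, "op": "query", "ns": "app.users", "secs_running": 2},
]}
print(kill_candidates(sample))  # [1]
```

Each returned opid would then be handed to db.killOp() in the shell. Keeping the filter conservative is deliberate: it is safer to miss a slow query than to kill an oplog tailer.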
can’t elect a master?
• Never run with an even number of votes (max 7)
• You need > 50% of votes to elect a primary
• Set your priority levels explicitly if you need warmup
• Consider delegating voting to arbiters
• Set snapshot nodes to be nonvoting if possible.
• Check your mongo log. Is something vetoing? Do they have an inconsistent view of the cluster state?
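The advice about even vote counts follows from the majority rule, which a few lines of Python can show. The function names are illustrative; the arithmetic is just "strictly more than 50%".

```python
def votes_needed(total_votes):
    """Smallest number of votes that is strictly more than 50% of the total."""
    return total_votes // 2 + 1


def tolerated_failures(total_votes):
    """How many voters can be lost while a primary can still be elected."""
    return total_votes - votes_needed(total_votes)


# Columns: total votes, votes needed for a majority, voters you can lose.
for n in range(3, 8):
    print(n, votes_needed(n), tolerated_failures(n))
```

Four voters tolerate one failure, the same as three, and six tolerate two, the same as five. The extra even vote adds a member that can fail without adding any fault tolerance.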
secondaries crashing?
• Some rare mongo bugs will cause all secondaries to crash unrecoverably
• Never kill oplog tailers or other internal database operations; this can also trash secondaries
• Arbiters are more stable than secondaries; consider using them to form a quorum with your primary
replication stops?
• Other rare bugs will stop replication or cause secondaries to exit without a corrupt op
• The correct way to fix this is to re-snapshot off the primary and rebuild your secondaries.
• However, you can sometimes *dangerously* repair a secondary:
1. stop mongo
2. bring it back up in standalone mode
3. repair the offending collection
4. restart mongo again as part of the replica set
• Everything is getting vaguely slower?
    • check your padding factor, try compaction
• You rs.remove() a node and get weird driver errors?
    • always shut down mongod after removing it from the replica set
• Huge background flush spike?
    • probably an EBS or disk problem
• You run out of connections?
    • possibly a driver bug
    • the connection limit is hard-coded to 80% of the soft ulimit, up to a cap of 20k
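The connection limit rule in the last item can be worked out for any given soft ulimit. This sketch assumes the rule exactly as stated on the slide (80% of the soft file descriptor limit, capped at 20k) and a Unix host, since it uses Python's resource module. The function name is illustrative.

```python
import resource


def connection_ceiling(soft_nofile, cap=20000):
    """Effective connection ceiling under the rule quoted on the slide:
    80% of the soft file descriptor limit, never more than the cap.
    """
    return min(soft_nofile * 8 // 10, cap)


print(connection_ceiling(1024))   # 819
print(connection_ceiling(64000))  # 20000

# Check the limit of the current process (Unix only).
soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if soft == resource.RLIM_INFINITY:
    print("unlimited soft limit, ceiling is the cap:", 20000)
else:
    print(soft, connection_ceiling(soft))
```

With a default soft limit of 1024 the ceiling is only 819 connections, which is why raising file descriptor limits matters well before the 20k cap does.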
• It looks like all I/O stops for a while?
    • check your mongodb.log for large newExtent warnings
    • also make sure you aren’t reaching PIOPS limits
• You get weird driver errors after adding/removing/re-electing?
    • some drivers have problems with this, you may have to restart
Glossary of resources
• Opscode AWS cookbook
    • https://github.com/opscode-cookbooks/aws
• edelight MongoDB cookbook
    • https://github.com/edelight/chef-mongodb
• Parse MongoDB cookbook fork
    • https://github.com/ParsePlatform/Ops/tree/master/chef/cookbooks/mongodb
• Parse compaction scripts and warmup scripts
    • http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
    • http://blog.parse.com/2013/03/26/always-be-compacting/
Charity Majors
@mipsytipsy

Managing a Maturing MongoDB Ecosystem

  • 2. Managing a maturing MongoDB ecosystem Thursday, June 20, 13
  • 3. automating with chef performance tuning disaster recovery Thursday, June 20, 13
  • 6. How do I chef that? ... grab the AWS and mongodb cookbooks, create a site wrapper cookbook Thursday, June 20, 13
  • 7. make a role for your cluster, launch some nodes, Thursday, June 20, 13
  • 8. initiate the replica set, ... and you’re done. Thursday, June 20, 13
  • 10. adding RAID for EBS volumes Thursday, June 20, 13
  • 11. this will bootstrap a new node for the cluster from snapshots with this role ... Thursday, June 20, 13
  • 12. multiple clusters distinct cluster name, backup host, backup volumes Thursday, June 20, 13
  • 14. assign a shard name per cluster, per role treat them like ordinary replica sets Thursday, June 20, 13
  • 15. Arbiters • Mongod processes that do nothing but vote • Highly reliable • To provision an arbiter, use the LWRP • Easy to run multiple arbiters on a single host Thursday, June 20, 13
  • 17. replica set with arbiters Thursday, June 20, 13
  • 18. run multiple arbiters on a single host: Thursday, June 20, 13
  • 19. Managing votes with arbiters Thursday, June 20, 13
  • 21. resources and provisioning tuning your filesystem snapshotting and warmups fragmentation Thursday, June 20, 13
  • 22. Provisioning tips • Memory is your primary scaling constraint • Your working set must fit in to memory • in 2.4, estimate with: • Page faults? Your working set may not fit Thursday, June 20, 13
  • 23. Disk options • If you’re on Amazon: • EBS • Dedicated SSD • Provisioned IOPS • Ephemeral • If not: • use SSDs! Thursday, June 20, 13
  • 24. EBS classic EBS with PIOPS: ... just say no to EBS Thursday, June 20, 13
  • 25. SSD (hi1.4xlarge) • 8 cores • 60 gigs RAM • 2 1-TB SSD drives • 120k random reads/sec • 85k random writes/sec • expensive! $2300/mo on demand Thursday, June 20, 13
  • 26. PIOPS
      • Up to 2000 IOPS/volume
      • Up to 1024 GB/volume
      • Variability of < 0.1%
      • Costs double regular EBS
      • Supports snapshots
      • RAID together multiple volumes for more storage/performance
  • 27. Estimating PIOPS
      • estimate how many IOPS to provision with the “tps” column of sar -d 1
      • multiply that by 2-3x depending on your spikiness
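
The sizing rule on this slide can be sketched in Python. This is a rough illustration, not tooling from the talk: the function name `estimate_piops`, the choice to size from the peak sample, and the default headroom of 2.5 are assumptions. The 2000 IOPS per-volume ceiling is the figure quoted on slide 26.

```python
import math

# Per-volume ceiling quoted on slide 26 (a 2013-era PIOPS limit).
MAX_IOPS_PER_VOLUME = 2000


def estimate_piops(tps_samples, headroom=2.5):
    """Return (iops_to_provision, volumes_needed) from observed tps samples.

    tps_samples: numbers read from the "tps" column of `sar -d 1`.
    headroom: multiplier for spikiness; the slide suggests 2-3x.
    """
    if not tps_samples:
        raise ValueError("need at least one tps sample")
    peak = max(tps_samples)  # assumption: size from the busiest sample
    iops = math.ceil(peak * headroom)
    # RAID several volumes together when one volume cannot supply enough IOPS.
    volumes = max(1, math.ceil(iops / MAX_IOPS_PER_VOLUME))
    return iops, volumes
```

For example, `estimate_piops([400, 650, 800])` sizes from the peak of 800 tps and suggests provisioning 2000 IOPS on a single volume; a spikier workload would justify a headroom closer to 3.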
  • 28. Ephemeral Storage
      • Cheap
      • Fast
      • No network latency
      • No snapshot capability
      • Data is lost forever if you stop or resize the instance
  • 29. Filesystem and limits
      • Raise file descriptor limits
      • Raise connection limits
      • Mount with noatime and nodiratime
      • Consider putting the journal on a separate volume
  • 30. Blockdev
      • Your default blockdev is probably wrong
      • Too large? you will underuse memory
      • Too small? you will hit the disk too much
      • Experiment.
  • 31. Snapshot best practices
      • Set priority = 0
      • Set hidden = 1
      • Consider setting votes = 0
      • Lock mongo or stop mongod before snapshot
      • Consider running continuous compaction on snapshot node
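
The first three settings above map onto fields of a replica set member document. The sketch below builds such a document in Python. The helper name `snapshot_member` and the host name in the usage note are hypothetical; the resulting dictionary is the kind of entry that would sit in the `members` array of the configuration passed to `rs.reconfig()`.

```python
def snapshot_member(member_id, host, voting=False):
    """Build a replica set member document for a dedicated snapshot node."""
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,    # never eligible to become primary
        "hidden": True,   # not advertised to clients, so it serves no reads
        "votes": 1 if voting else 0,  # non-voting unless explicitly requested
    }
```

Calling `snapshot_member(3, "backup1.example.com:27017")` yields a hidden, priority-0, non-voting member. Locking or stopping mongod before the snapshot is a separate operational step that this sketch does not cover.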
  • 32. Restoring from snapshot
      • EBS snapshot will lazily-load blocks from S3
      • run “dd” on each of the data files to pull blocks down
      • Always warm up a secondary before promoting
      • warm up both indexes and data
      • http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
      • in mongodb 2.2 and above you can use the touch command:
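
The “dd” step amounts to reading every data file from start to finish so that each block is fetched from S3 before mongod needs it. A minimal Python equivalent of `dd if=<file> of=/dev/null` is sketched below. The function name `warm_file` and the 1 MiB chunk size are assumptions, and this is not the script from the Parse blog post.

```python
def warm_file(path, chunk_size=1024 * 1024):
    """Read a file sequentially and discard the contents; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total
```

In practice this would be run over each file in the restored data directory. It only pulls blocks off the snapshot; warming indexes and data into MongoDB's working set is the separate step the slide describes.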
  • 33. Fragmentation
      • Your RAM gets fragmented too!
      • Leads to underuse of memory
      • Deletes are not the only source of fragmentation
      • Repair, compact, or resync regularly
  • 34. 3 ways to fix fragmentation:
      • Re-sync a secondary from scratch
          • hard on your primary; rs.syncFrom() a secondary
      • Repair a secondary
          • can cause small discrepancies in your data
      • Run continuous compaction on your snapshot node
          • won’t reset padding factors
          • not appropriate if you do lots of deletes
  • 36. Upgrade! mongo is getting faster. :)
  • 38. Finding bad queries
      • db.currentOp()
      • mongodb.log
      • profiling collection
  • 39. db.currentOp()
      • Check the queue size
      • Any indexes building?
      • Sort by num_seconds
      • Sort by num_yields, locktype
      • Consider adding comments to your queries
      • Run explain() on queries that are long-running
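
The sorting advice above can be sketched as a small Python helper over the document that `db.currentOp()` returns. It assumes the usual shape of that output: an `inprog` array whose entries carry `active`, `secs_running` and `numYields`. It also assumes these are the fields the slide means by num_seconds and num_yields. The function name `slowest_ops` is hypothetical.

```python
def slowest_ops(current_op, limit=10):
    """Return the longest-running active operations from a currentOp document."""
    ops = [op for op in current_op.get("inprog", []) if op.get("active")]
    # Longest-running first; ties broken by how often the operation yielded.
    ops.sort(
        key=lambda op: (op.get("secs_running", 0), op.get("numYields", 0)),
        reverse=True,
    )
    return ops[:limit]
```

The operations at the top of the result are the natural candidates for `explain()`.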
  • 40. mongodb.log
      • Configure output with --slowms
      • Look for high execution time, nscanned, ntoreturn
      • See which queries are holding long locks
      • Match connection ids to IPs
  • 41. system.profile collection
      • Enable profiling with db.setProfiling()
      • Does not persist through restarts
      • Like mongodb.log, but queryable
      • Writes to this collection incur some cost
      • Use db.system.profile.find() to get slow queries for a certain collection, time range, execution time, etc
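
A filter for that last bullet might look like the sketch below. It relies on the `ns`, `millis` and `ts` fields that profiler documents carry. The helper name `slow_query_filter` and the namespace in the usage note are hypothetical, and the dictionary is meant to be handed to `find()` on `system.profile` through whichever driver is in use.

```python
from datetime import datetime


def slow_query_filter(namespace, min_millis, start, end):
    """Build a system.profile filter for one collection, time range and duration."""
    return {
        "ns": namespace,                    # "<database>.<collection>"
        "millis": {"$gt": min_millis},      # execution time in milliseconds
        "ts": {"$gte": start, "$lt": end},  # when the operation was recorded
    }
```

For instance, `slow_query_filter("app.users", 100, datetime(2013, 6, 20), datetime(2013, 6, 21))` selects operations on `app.users` that took longer than 100 ms on that day.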
  • 42. ... when queries pile up ...
      • Know what your tipping point looks like
      • Don’t switch your primary or restart
      • Do kill queries before the tipping point
      • Write your kill script before you need it
      • Don’t kill internal mongo operations, only queries.
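
The selection logic of such a kill script could be sketched as follows. This is an illustration under stated assumptions, not the script used in the talk. It takes a `db.currentOp()` document, keeps only operations whose `op` is `query`, and skips anything on the `local` database or a `system` collection so that oplog tailers and other internal work are left alone. The function name `kill_candidates` and the 30-second threshold are hypothetical.

```python
def kill_candidates(current_op, max_secs=30):
    """Return opids of long-running client queries worth considering for killOp."""
    opids = []
    for op in current_op.get("inprog", []):
        ns = op.get("ns", "")
        if op.get("op") != "query":
            continue  # only plain queries; leave writes, getmores and commands alone
        if not ns or ns.startswith("local.") or ".system." in ns:
            continue  # replication (local.oplog.rs) and internal namespaces
        if op.get("secs_running", 0) < max_secs:
            continue  # not running long enough to matter
        opids.append(op["opid"])
    return opids
```

Each returned opid would then be passed to `db.killOp()`. Keeping the selection separate from the kill makes it possible to dry-run the script before the tipping point arrives.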
  • 43. can’t elect a master?
      • Never run with an even number of votes (max 7)
      • You need > 50% of votes to elect a primary
      • Set your priority levels explicitly if you need warmup
      • Consider delegating voting to arbiters
      • Set snapshot nodes to be nonvoting if possible.
      • Check your mongo log. Is something vetoing? Do they have an inconsistent view of the cluster state?
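
The vote arithmetic behind the first two bullets is easy to check by hand or in code. The sketch below uses hypothetical helper names and encodes only the rules stated on the slide: a strict majority of all configured votes, an odd vote count, and at most 7 votes.

```python
def votes_needed(total_votes):
    """Smallest number of votes that is strictly more than 50% of the total."""
    return total_votes // 2 + 1


def can_elect_primary(total_votes, reachable_votes):
    """True if the reachable members hold a strict majority of all votes."""
    return reachable_votes >= votes_needed(total_votes)


def vote_count_problems(total_votes):
    """Return a list of problems with a vote count; empty if it looks sane."""
    problems = []
    if total_votes % 2 == 0:
        problems.append("even number of votes: an even split cannot elect a primary")
    if total_votes > 7:
        problems.append("more than 7 votes")
    return problems
```

With 4 votes a primary needs 3, so a 2/2 network split leaves neither side able to elect, and the set tolerates only one lost vote, the same as with 3 votes. This is why the extra even vote buys nothing.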
  • 44. secondaries crashing?
      • Some rare mongo bugs will cause all secondaries to crash unrecoverably
      • Never kill oplog tailers or other internal database operations, this can also trash secondaries
      • Arbiters are more stable than secondaries, consider using them to form a quorum with your primary
  • 45. replication stops?
      • Other rare bugs will stop replication or cause secondaries to exit without a corrupt op
      • The correct way to fix this is to re-snapshot off the primary and rebuild your secondaries.
      • However, you can sometimes *dangerously* repair a secondary:
          1. stop mongo
          2. bring it back up in standalone mode
          3. repair the offending collection
          4. restart mongo again as part of the replica set
  • 46.
      • Everything is getting vaguely slower?
          • check your padding factor, try compaction
      • You rs.remove() a node and get weird driver errors?
          • always shut down mongod after removing from replica set
      • Huge background flush spike?
          • probably an EBS or disk problem
      • You run out of connection limits?
          • possibly a driver bug
          • hard-coded to 80% of soft ulimit until 20k is reached.
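
Read literally, the last bullet says the connection ceiling is 80% of the soft file-descriptor limit, capped at 20,000. Under that reading, which is an assumption of this sketch, the effective ceiling for a given `ulimit -n` value can be estimated as below. The function name `effective_max_connections` is hypothetical.

```python
def effective_max_connections(soft_nofile, cap=20000):
    """Estimate the connection ceiling from the soft open-files ulimit.

    soft_nofile: the soft limit reported by `ulimit -n` for the mongod user.
    cap: the upper bound quoted on the slide (20k).
    """
    return min(int(soft_nofile * 0.8), cap)
```

With the common default soft limit of 1024 this gives 819 connections, which is why raising file descriptor limits (slide 29) comes before anything else when connections run out.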
  • 47.
      • It looks like all I/O stops for a while?
          • check your mongodb.log for large newExtent warnings
          • also make sure you aren’t reaching PIOPS limits
      • You get weird driver errors after adding/removing/re-electing?
          • some drivers have problems with this, you may have to restart
  • 48. Glossary of resources
      • Opscode AWS cookbook
          • https://github.com/opscode-cookbooks/aws
      • edelight MongoDB cookbook
          • https://github.com/edelight/chef-mongodb
      • Parse MongoDB cookbook fork
          • https://github.com/ParsePlatform/Ops/tree/master/chef/cookbooks/mongodb
      • Parse compaction scripts and warmup scripts
          • http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
          • http://blog.parse.com/2013/03/26/always-be-compacting/