MongoDB Performance Tuning
with MMS
Presentation outline
By Enteros CTO
Ron Warshawsky
Ro@enteros.com
408-207-8408
Enteros, Inc.
MongoDb
2014-03-13 Enteros, Inc.
Overview
What Is MMS
MongoDB Management Service (MMS) is a tool used to determine the health of a system
and identify the root cause of performance issues. We will divide this presentation
into two parts:
Defining Key Metrics
Improving Performance
Defining Key Metrics
We will define two types of metrics here:
MMS Metrics
Log File Metrics
MMS Metrics
By examining these key metrics you can very quickly get a good picture of
what is going on inside a MongoDB system and what computing resources
(CPU, RAM, disk) are performance bottlenecks. Below is an overview of the
MMS metrics.
PF/OP (Page Faults/Opcounters)
CPU Time (IOWait and User)
Lock Percent and Queues
MMS Metrics
PF/OP (Page Faults/Opcounters)
Between 5 and 10 page faults per second (left) compared to more than 4,000 operations per second
(right). A PF/OP of roughly 0.001 (5 / 4000) is close enough to zero to classify as a low disk I/O requirement.
If PF/OP is…
near 0 – reads rarely require disk I/O
near 1 – reads regularly require disk I/O
greater than 1 – reads require heavy disk I/O
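The ratio above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 0.5 cut-off between "near 0" and "near 1" is an assumption, since the slide does not define where one band ends and the next begins.

```python
def pf_per_op(page_faults_per_sec, ops_per_sec):
    """Return page faults per operation (PF/OP)."""
    if ops_per_sec <= 0:
        # With no operations the ratio is undefined, so refuse to guess.
        raise ValueError("ops_per_sec must be positive")
    return page_faults_per_sec / ops_per_sec


def describe_pf_op(ratio):
    """Map a PF/OP ratio onto the three bands listed on the slide."""
    if ratio > 1.0:
        return "reads require heavy disk I/O"
    if ratio >= 0.5:  # assumed boundary for "near 1"
        return "reads regularly require disk I/O"
    return "reads rarely require disk I/O"


if __name__ == "__main__":
    ratio = pf_per_op(5, 4000)  # the figures from the graphs above
    print(ratio, describe_pf_op(ratio))
```

Both inputs should be rates taken over the same time window, otherwise the ratio is not meaningful.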
MMS Metrics
CPU Time (IOWait and User)
CPU graphs from two different instances:
One experiencing high CPU IOWait (left) and the other experiencing high CPU User (right).
The CPU Time graph shows how the CPU cores are spending their
cycles. CPU IOWait reflects the fraction of time spent waiting for
the network or disk, while CPU User measures computation. Note
that to view CPU Time in MMS, you must also install munin.
MMS Metrics
Lock Percent and Queues
Lock fluctuating with daily load (left) and corresponding queues (right)
Lock Percent and Queues tend to go hand in hand — the longer locking
operations take, the more other operations wait on them. The formation
of locks and queues isn’t necessarily cause for alarm in a healthy system,
but they are very good severity indicators when you and your app already
know things are slow.
Log File Metrics
In addition to the key MMS metrics, we also use the MongoDB log files for analysis.
We discuss three log fields here:
nScanned
ScanAndOrder
nmoved
Log File Metrics
nScanned:
1. The number of objects the database examines to service a query.
This counts either documents or index entries, any of which may need to be obtained
from disk.
2. It is a good measure of the cost of an operation in terms of I/O, object
manipulation, and memory use.
3. Ideally nscanned is not larger than nreturned, the number of results returned. Also
compare nscanned to the size of the collection. If nscanned is large, it is usually because
an index is not being used, or that the index was not selective enough.
4. Moving large numbers of documents into memory for these operations is a common
cause of page faults and CPU IOWait.
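As a rough illustration of point 3, the following Python sketch pulls nscanned and nreturned out of a log line and reports their ratio. The sample line is made up and only assumes the `name:value` field style of mongod logs from this era; the function name and the choice to treat zero results as one are assumptions for the example.

```python
import re

# Illustrative line in the style of a mongod slow-query log entry;
# the namespace, query, and numbers are invented for this example.
SAMPLE = (
    'Thu Mar 13 10:15:02.123 [conn42] query shop.orders '
    'query: { status: "open" } ntoreturn:0 ntoskip:0 nscanned:120000 '
    'keyUpdates:0 locks(micros) r:310221 nreturned:12 reslen:2048 310ms'
)

FIELD = re.compile(r"\b(nscanned|nreturned):(\d+)")


def scan_ratio(line):
    """Return nscanned / nreturned for one log line, or None if a field is missing."""
    fields = {name: int(value) for name, value in FIELD.findall(line)}
    if "nscanned" not in fields or "nreturned" not in fields:
        return None
    # Treat zero results as one so a large scan that returns nothing still scores high.
    return fields["nscanned"] / max(fields["nreturned"], 1)


if __name__ == "__main__":
    print(scan_ratio(SAMPLE))
```

A ratio close to 1 matches the ideal described above; a ratio in the thousands suggests a missing or insufficiently selective index.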
Log File Metrics
scanAndOrder:
1. This is true when MongoDB performs an in-memory ordering operation to
satisfy the sort/orderby component of a query. For best performance, all
sort/orderby clauses should be satisfied by indexes. Sorting results at query time
can be inefficient (as compared with using an index that already contains
documents in the desired sort order), and can even block other operations if done
while holding the write lock, as with an update or findAndModify. Because sorting
requires computation, elevated CPU User time can be expected.
Log File Metrics
nmoved:
Indicates the number of documents that moved on disk during the operation. The
higher “nmoved”, the more intensive the operation. This is because when moving a
document: the new document location must be in memory, the document must be
copied, the old document’s location must be cleared, and index entries must be
updated to point to the new location.
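To see which collections are affected, the scanAndOrder and nmoved flags can be tallied per namespace. The Python sketch below assumes log lines of the form `[connN] <operation> <namespace> ... scanAndOrder:1 ... nmoved:N`; the sample lines are invented, and the helper name `summarize` is hypothetical.

```python
import re
from collections import defaultdict

NAMESPACE = re.compile(r"\] (?:query|update|command|getmore|remove|insert) (\S+)")
NMOVED = re.compile(r"\bnmoved:(\d+)")
SCAN_AND_ORDER = re.compile(r"\bscanAndOrder:1\b")

# Invented sample lines; only the field layout is assumed.
SAMPLE_LINES = [
    'Thu Mar 13 10:15:02.123 [conn7] query shop.orders query: { status: "open" } '
    'ntoreturn:0 nscanned:5000 scanAndOrder:1 keyUpdates:0 nreturned:50 95ms',
    'Thu Mar 13 10:15:03.456 [conn9] update shop.carts query: { _id: 7 } '
    'update: { $push: { items: "sku-1" } } nscanned:1 nmoved:1 nupdated:1 2ms',
    'Thu Mar 13 10:15:04.789 [conn9] update shop.carts query: { _id: 8 } '
    'update: { $push: { items: "sku-2" } } nscanned:1 nmoved:1 nupdated:1 3ms',
    'Thu Mar 13 10:15:05.000 [conn3] query shop.users query: { _id: 1 } '
    'ntoreturn:1 nscanned:1 nreturned:1 0ms',
]


def summarize(lines):
    """Count scanAndOrder operations and sum nmoved per namespace."""
    totals = defaultdict(lambda: {"scanAndOrder": 0, "nmoved": 0})
    for line in lines:
        namespace = NAMESPACE.search(line)
        sorted_in_memory = SCAN_AND_ORDER.search(line) is not None
        moved = NMOVED.search(line)
        moved_count = int(moved.group(1)) if moved else 0
        # Skip lines that carry neither flag or that have no recognizable namespace.
        if namespace is None or not (sorted_in_memory or moved_count):
            continue
        entry = totals[namespace.group(1)]
        entry["scanAndOrder"] += int(sorted_in_memory)
        entry["nmoved"] += moved_count
    return dict(totals)


if __name__ == "__main__":
    for ns, counts in summarize(SAMPLE_LINES).items():
        print(ns, counts)
```

Namespaces with many in-memory sorts are candidates for an index that matches the sort order, while namespaces with high nmoved totals point at documents that grow on update.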
Improving Performance
Correctly determining whether one of these
metrics is high or low depends on their
values relative to each other. Here are some
rules of thumb.
• PF/OP is high if it is greater than 0.25
(indicating that your instance is page-
faulting for roughly 25% of operations)
• CPU IOWait or CPU User is high if it is
greater than 50%, or if it is more than
twice as large as the next highest CPU
metric
There are eight possible overall
assessments given these three metrics.
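The rules of thumb and the eight resulting cases can be sketched as a small classifier. This is a hedged illustration, not part of MMS: the function names are hypothetical, the CPU figures are assumed to be percentages keyed by metric name, and the letter order (PF/OP, CPU IOWait, CPU User) is inferred from the deck's later use of "HHL" for the RAM/Disk-Bound case and "LLL" for the unbound case.

```python
# Letter order is assumed to be (PF/OP, CPU IOWait, CPU User).
CASES = {
    "LLL": "Unbound",
    "LLH": "CPU-Bound",
    "LHL": "Disk-Bound",
    "LHH": "CPU/Disk-Bound",
    "HLL": "RAM-Bound",
    "HLH": "RAM/CPU-Bound",
    "HHL": "RAM/Disk-Bound",
    "HHH": "Bound All Around",
}


def cpu_metric_is_high(name, cpu):
    """Apply the rule: above 50%, or more than twice the next highest CPU metric."""
    value = cpu[name]
    others = [v for k, v in cpu.items() if k != name]
    next_highest = max(others, default=0.0)
    return value > 50.0 or value > 2 * next_highest


def assess(pf_op, cpu):
    """Return the three-letter code and case name for one set of readings."""
    flags = (
        pf_op > 0.25,
        cpu_metric_is_high("iowait", cpu),
        cpu_metric_is_high("user", cpu),
    )
    code = "".join("H" if flag else "L" for flag in flags)
    return code, CASES[code]


if __name__ == "__main__":
    # Hypothetical readings: heavy page faulting with the CPU mostly waiting on I/O.
    print(assess(0.4, {"user": 5.0, "iowait": 60.0, "system": 3.0}))
```

The thresholds are the deck's rules of thumb, so the output is a starting point for the case-by-case guidance that follows rather than a definitive diagnosis.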
Improving Performance
Unbound
This means that either this database isn’t being used, or it is operating optimally.
Instead of relaxing, consider creating test data and a script to load-test your queries.
Improving Performance
CPU-Bound
In this case, page faults are not severe enough to prompt any appreciable IOWait. As
long as the CPU isn’t idle, it’s somewhat expected that User time will appear the highest.
If locking and queueing are occurring, you’re probably seeing in-memory table scans and
scanAndOrder operations that are not bottlenecking due to an abundance of RAM or a
lack of load.
Run through the following steps:
• Check your logs for operations with high nscanned or scanAndOrder during periods
of high lock/queue, and index accordingly.
• Check your queries for CPU-intensive operators like $all, $push/$pop/$addToSet, as
well as updates to large documents, and especially updates to documents with large
arrays (or large subdocument arrays).
• Obtain more and/or faster CPU cores. Importantly, if your database is write-heavy,
keep in mind that only one CPU per database can write at a time (owing to that
thread holding the write lock).
Improving Performance
Disk-Bound
If you’re experiencing trouble, or if this sort of activity is coming in bursts, then getting the
most out of your current disk is about finding the source of IOWait. So, even though the
ratio of PF/OP is low, we’re interested in actual Page Faults, which could still put a
noticeable drag on performance.
Let’s look at some sources of page faults:
• Page faults from insert/update operations – The write load to a database results in
page faults as fresh extents are brought into memory to handle new or moved
documents. Adding RAM will not alleviate this cost, so consider faster disks, like SSDs,
or sharding.
• Page faults from a shifting working set – These are the page faults that can be
prevented with additional RAM or with more efficient indexing.
• Page faults from MongoDB’s internal operations – These are few in number and
cannot be prevented.
Improving Performance
Disk-Bound: Continued
If you’ve accounted for the majority of your page faults, the remaining IOWait in the
system should be coming from flushing the journal file and the background
synchronization of memory-mapped data to disk. Check writeToDataFilesMB on the
Journal Stats graph and the Background Flush Avg graph to see if spikes match the
elevated IOWait. Upgrade your disks and/or look into getting your journal on a separate
volume from your data. You can also separate databases using the directoryperdb mongod
option.
The last resort before sharding should be verifying that your data model is optimized for
writes. Check for collections with a large number of indexes, especially multi-key indexes,
and look for alternatives. Check your logs for the “nmoved” flag. If you find a lot of
documents are being moved during updates, examine the usePowerOf2Sizes setting for
those collections. When it comes to write-friendly data models, look for anything that
reduces the number of pages (index or data) that need to be modified when documents
are inserted or updated.
Improving Performance
CPU/Disk-Bound
Treat elevated IOWait and User times as independently high, and consider each
of the previous two sections. We don’t see this combination a lot but it is
definitely possible, such as in the case of a high-RAM, write-heavy cluster that is
missing an index. If you are experiencing locks and queues, use them to guide the
time ranges you focus on.
Improving Performance
RAM-Bound
If you are running high PF/OP without CPU IOWait, then one of the following likely applies:
• your operation counts are relatively low (indicating little to no operational activity).
• you have fantastic disks that are covering up for what would otherwise be a “RAM-Bound and
Disk-Bound” HHL case.
• you just started the database and are “warming” your cache. Wait until memory is fully utilized
and reassess.
If you are experiencing locks and queues, immediately evaluate your indexes to make sure missing
indexes are not prompting the need to page fault. Check your logs during periods of high page
faulting. Focus on operations with high nscanned values. Increase RAM if necessary, and you should
arrive at the coveted LLL case.
If you are not locking/queueing, this is technically a sustainable state, but be aware that any growth in
operation volume or data size–or even a barely-quantifiable shift in your working data set–could lead
to increasing IOWait, locks, and queues.
Improving Performance
RAM/CPU-Bound
This scenario is similar to the RAM-Bound case above, except that high CPU User activity may be
covering up for what would otherwise be a high IOWait. Tackle the CPU portion first, then move
towards page faults, in the following steps:
1. Look for nscanned and scanAndOrder activity. On high-write clusters, consider the sheer number
of indexes in your write-collections. Maintaining indexes during inserts and updates can be
responsible for this, as can other CPU-intensive operators like $all, $push/$pop/$addToSet,
updates to large documents, and updates to documents with large arrays (or large subdocument
arrays).
2. Once the CPU User time has decreased, reconsider your case. If you have moved into HHL
territory, go to that section. If your CPU User time is still high, beef up your CPU if possible.
3. If your PF/OP is still high, be prepared to move to faster disk as soon as locks and queues begin to
occur.
4. Finally, consider sharding.
Improving Performance
RAM/Disk-Bound
This is possibly the most common problem case, and the solution is fairly straightforward
compared to the other cases we’ve dealt with. Your activity is disk-heavy. Index to reduce
nscanned, or add RAM, either by scaling your machines vertically or by scaling horizontally via
sharding.
Improving Performance
Bound All Around
The solution to this protracted situation depends on which of these three metrics is highest. In
general, we recommend tackling this as a “RAM/Disk-Bound” (HHL) case first, because lack of RAM is
relatively easy to diagnose and many of the solutions for high CPU IOWait can contribute to CPU
User improvements.
So, check your logs for scanAndOrder and high nscanned numbers and then add indexes to reduce
them. Increase RAM if IOWait continues, and then re-evaluate your case. If you still end up here,
and if your locks and queues are still a problem, consider increasing your hardware capacity.
It could be premature to consider sharding straight from this highly protracted case. Ideally, attempt
a resolution that brings you to another case that we’ve explored, and consider sharding in that
context.
Upbeat High Load Capture
Database Root Cause and Spike Analysis for multi-tiered applications
Enteros UpBeat High Load Capture is a software framework for database problem root cause analysis of
Oracle, DB2, SQL Server, MySQL, Sybase and MongoDB database-centric multi-tiered applications. High
Load Capture user interface visually correlates performance and system load metrics across multiple IT
production infrastructure layers. With second-by-second granularity of data analysis, High Load Capture
makes analysis possible for the most transient database performance spikes.
Features
• Multi-threaded, high-precision performance collection engine
• Extensible, dynamically configurable, centrally controlled collection agents
• Comprehensive library of collector agents
• Cross-tier correlation
• Safe, secure agent communication
• Load-sensitive collection controller
Upbeat High Load Capture
Supported Infrastructure, Database, Application server, OS monitoring
Database Server OS:
Linux, Sun Solaris, HP/UX, AIX, Windows Server
Client OS:
Windows, Linux
Database:
Oracle, Microsoft SQL, IBM DB2, MySQL, Sybase, MongoDB
Application Server:
Oracle (BEA) WebLogic, Oracle OAS, JBOSS, IBM WAS
Enteros, Inc
http://www.enteros.com
Enteros is an innovative software company specializing in
Performance Management and Load Testing Software for
Production Databases - RDBMS and NOSQL/Big Data
Enteros solutions enable IT professionals to identify
and remediate performance problems in business-
critical databases with unprecedented speed, accuracy
and scope.
Ron Warshawsky; ron@enteros.com
408-207-8408

More Related Content

What's hot

Virtual memory
Virtual memoryVirtual memory
Virtual memoryAsif Iqbal
 
Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029marangburu42
 
Managing Memory & Locks - Series 1 Memory Management
Managing  Memory & Locks - Series 1 Memory ManagementManaging  Memory & Locks - Series 1 Memory Management
Managing Memory & Locks - Series 1 Memory ManagementDAGEOP LTD
 
Virtual memory pre-final-formatting
Virtual memory pre-final-formattingVirtual memory pre-final-formatting
Virtual memory pre-final-formattingmarangburu42
 
Oaklands college: Protecting your data.
Oaklands college: Protecting your data.Oaklands college: Protecting your data.
Oaklands college: Protecting your data.JISC RSC Eastern
 
Testing pc’s performance
Testing pc’s performanceTesting pc’s performance
Testing pc’s performanceiteclearners
 
Microsoft SQL Server 2014 in memory oltp tdm white paper
Microsoft SQL Server 2014 in memory oltp tdm white paperMicrosoft SQL Server 2014 in memory oltp tdm white paper
Microsoft SQL Server 2014 in memory oltp tdm white paperDavid J Rosenthal
 
Designing Information Structures For Performance And Reliability
Designing Information Structures For Performance And ReliabilityDesigning Information Structures For Performance And Reliability
Designing Information Structures For Performance And Reliabilitybryanrandol
 

What's hot (13)

Virtual memory
Virtual memoryVirtual memory
Virtual memory
 
Virtual memory
Virtual memoryVirtual memory
Virtual memory
 
Main memoryfinal
Main memoryfinalMain memoryfinal
Main memoryfinal
 
Memory managment
Memory managmentMemory managment
Memory managment
 
Os
OsOs
Os
 
Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029
 
Managing Memory & Locks - Series 1 Memory Management
Managing  Memory & Locks - Series 1 Memory ManagementManaging  Memory & Locks - Series 1 Memory Management
Managing Memory & Locks - Series 1 Memory Management
 
Virtual memory pre-final-formatting
Virtual memory pre-final-formattingVirtual memory pre-final-formatting
Virtual memory pre-final-formatting
 
Oaklands college: Protecting your data.
Oaklands college: Protecting your data.Oaklands college: Protecting your data.
Oaklands college: Protecting your data.
 
Testing pc’s performance
Testing pc’s performanceTesting pc’s performance
Testing pc’s performance
 
Microsoft SQL Server 2014 in memory oltp tdm white paper
Microsoft SQL Server 2014 in memory oltp tdm white paperMicrosoft SQL Server 2014 in memory oltp tdm white paper
Microsoft SQL Server 2014 in memory oltp tdm white paper
 
Designing Information Structures For Performance And Reliability
Designing Information Structures For Performance And ReliabilityDesigning Information Structures For Performance And Reliability
Designing Information Structures For Performance And Reliability
 
virtual memory
virtual memoryvirtual memory
virtual memory
 

Similar to MongoDB Performance Tuning with MMS Metrics

Mongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategiesMongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategiesronwarshawsky
 
The Fundamental Characteristics of Storage concepts for DBAs
The Fundamental Characteristics of Storage concepts for DBAsThe Fundamental Characteristics of Storage concepts for DBAs
The Fundamental Characteristics of Storage concepts for DBAsAlireza Kamrani
 
Database performance management
Database performance managementDatabase performance management
Database performance managementscottaver
 
Comparison of In-memory Data Platforms
Comparison of In-memory Data PlatformsComparison of In-memory Data Platforms
Comparison of In-memory Data PlatformsAmir Mahdi Akbari
 
Demartek lenovo s3200_sql_server_evaluation_2016-01
Demartek lenovo s3200_sql_server_evaluation_2016-01Demartek lenovo s3200_sql_server_evaluation_2016-01
Demartek lenovo s3200_sql_server_evaluation_2016-01Lenovo Data Center
 
Nfr testing(performance)
Nfr testing(performance)Nfr testing(performance)
Nfr testing(performance)Dilip Sharma
 
How do you know you really need ssd
How do you know you really need ssdHow do you know you really need ssd
How do you know you really need ssdJohn McDonald
 
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP OpsIRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP OpsIRJET Journal
 
Insiders Guide- Managing Storage Performance
Insiders Guide- Managing Storage PerformanceInsiders Guide- Managing Storage Performance
Insiders Guide- Managing Storage PerformanceDataCore Software
 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replicationShahzad
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Shardinguzzal basak
 
Latency in storage
Latency in storageLatency in storage
Latency in storageAshwin Pawar
 
Introduction to Database Log Analysis
Introduction to Database Log AnalysisIntroduction to Database Log Analysis
Introduction to Database Log AnalysisAnton Chuvakin
 
Data Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubeyData Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubeyAnkita Dubey
 
Graylog Engineering - Design Your Architecture
Graylog Engineering - Design Your ArchitectureGraylog Engineering - Design Your Architecture
Graylog Engineering - Design Your ArchitectureGraylog
 
Web Speed And Scalability
Web Speed And ScalabilityWeb Speed And Scalability
Web Speed And ScalabilityJason Ragsdale
 
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Lucidworks
 
High Performance Mysql
High Performance MysqlHigh Performance Mysql
High Performance Mysqlliufabin 66688
 
Run MongoDB with Confidence: Backing up and Monitoring with MMS
Run MongoDB with Confidence: Backing up and Monitoring with MMSRun MongoDB with Confidence: Backing up and Monitoring with MMS
Run MongoDB with Confidence: Backing up and Monitoring with MMSMongoDB
 

Similar to MongoDB Performance Tuning with MMS Metrics (20)

Mongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategiesMongo db pefrormance optimization strategies
Mongo db pefrormance optimization strategies
 
The Fundamental Characteristics of Storage concepts for DBAs
The Fundamental Characteristics of Storage concepts for DBAsThe Fundamental Characteristics of Storage concepts for DBAs
The Fundamental Characteristics of Storage concepts for DBAs
 
Database performance management
Database performance managementDatabase performance management
Database performance management
 
Comparison of In-memory Data Platforms
Comparison of In-memory Data PlatformsComparison of In-memory Data Platforms
Comparison of In-memory Data Platforms
 
Demartek lenovo s3200_sql_server_evaluation_2016-01
Demartek lenovo s3200_sql_server_evaluation_2016-01Demartek lenovo s3200_sql_server_evaluation_2016-01
Demartek lenovo s3200_sql_server_evaluation_2016-01
 
Nfr testing(performance)
Nfr testing(performance)Nfr testing(performance)
Nfr testing(performance)
 
How do you know you really need ssd
How do you know you really need ssdHow do you know you really need ssd
How do you know you really need ssd
 
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP OpsIRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
 
Insiders Guide- Managing Storage Performance
Insiders Guide- Managing Storage PerformanceInsiders Guide- Managing Storage Performance
Insiders Guide- Managing Storage Performance
 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replication
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
Latency in storage
Latency in storageLatency in storage
Latency in storage
 
Introduction to Database Log Analysis
Introduction to Database Log AnalysisIntroduction to Database Log Analysis
Introduction to Database Log Analysis
 
Data Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubeyData Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubey
 
Graylog Engineering - Design Your Architecture
Graylog Engineering - Design Your ArchitectureGraylog Engineering - Design Your Architecture
Graylog Engineering - Design Your Architecture
 
Web Speed And Scalability
Web Speed And ScalabilityWeb Speed And Scalability
Web Speed And Scalability
 
Tuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for LogsTuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for Logs
 
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
 
High Performance Mysql
High Performance MysqlHigh Performance Mysql
High Performance Mysql
 
Run MongoDB with Confidence: Backing up and Monitoring with MMS
Run MongoDB with Confidence: Backing up and Monitoring with MMSRun MongoDB with Confidence: Backing up and Monitoring with MMS
Run MongoDB with Confidence: Backing up and Monitoring with MMS
 

Recently uploaded

Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...Akihiro Suda
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 

Recently uploaded (20)

Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...

MongoDB Performance Tuning with MMS Metrics

  • 1. MongoDB Performance Tuning with MMS Presentation outline By Enteros CTO Ron Warshawsky Ro@enteros.com 408-207-8408 Enteros, Inc.
  • 2. MongoDb 2014-03-13 Enteros, Inc. Overview: What Is MMS? MongoDB Management Service (MMS) is a tool used to determine the health of a system and identify the root cause of performance issues. This presentation has two parts: Defining Key Metrics and Improving Performance.
  • 3. Defining Key Metrics. We will define two types of metrics here: MMS metrics and log file metrics.
  • 4. MMS Metrics. By examining these key metrics you can quickly get a good picture of what is going on inside a MongoDB system and which computing resources (CPU, RAM, disk) are performance bottlenecks. The MMS metrics covered here are: PF/OP (Page Faults/Opcounters); CPU Time (IOWait and User); Lock Percent and Queues.
  • 5. MMS Metrics: PF/OP (Page Faults/Opcounters). Figure: between 5 and 10 page faults per second (left) compared to more than 4,000 operations per second (right). A PF/OP of about 0.001 (5 / 4000) is close enough to zero to classify as a low disk I/O requirement. If PF/OP is near 0, reads rarely require disk I/O; near 1, reads regularly require disk I/O; greater than 1, reads require heavy disk I/O.
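The PF/OP arithmetic on slide 5 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 0.5 cut-off used to decide between "near 0" and "near 1" is an assumption, since the slide does not give an exact boundary.

```python
def pf_op_ratio(page_faults_per_sec, ops_per_sec):
    """Return page faults per operation (PF/OP)."""
    if ops_per_sec <= 0:
        raise ValueError("ops_per_sec must be positive")
    return page_faults_per_sec / ops_per_sec


def describe_pf_op(ratio):
    """Map a PF/OP ratio onto the three bands named on the slide.

    The 0.5 boundary between 'near 0' and 'near 1' is an assumption
    made for this sketch, not a value from the presentation.
    """
    if ratio > 1:
        return "reads require heavy disk I/O"
    if ratio >= 0.5:
        return "reads regularly require disk I/O"
    return "reads rarely require disk I/O"


# Values from the slide: about 5 page faults/s against 4,000 operations/s.
ratio = pf_op_ratio(5, 4000)
print(ratio, "->", describe_pf_op(ratio))
```

With the slide's figures the ratio is 0.00125, which the slide rounds to 0.001.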
  • 6. MMS Metrics: CPU Time (IOWait and User). Figure: CPU graphs from two different instances, one experiencing high CPU IOWait (left) and the other experiencing high CPU User (right). The CPU Time graph shows how the CPU cores are spending their cycles. CPU IOWait reflects the fraction of time spent waiting for the network or disk, while CPU User measures computation. Note that to view CPU Time in MMS, you must also install munin.
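To make the "fraction of time" idea on slide 6 concrete, the sketch below derives IOWait and User fractions from two snapshots of the aggregate `cpu` line in Linux `/proc/stat`. This is not how MMS or munin is described in the presentation; it is an assumed, Linux-only illustration, and the two sample lines are made up.

```python
# Field order of the aggregate "cpu" line in Linux /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into a dict of cumulative tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_fractions(before, after):
    """Return the share of elapsed CPU time spent in each state between two snapshots."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: second[name] - first[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("snapshots must be taken at different times")
    return {name: ticks / total for name, ticks in delta.items()}


# Made-up snapshots for illustration only.
SAMPLE_BEFORE = "cpu  1000 0 500 8000 400 0 100 0 0 0"
SAMPLE_AFTER = "cpu  1100 0 550 8200 1000 0 150 0 0 0"

fractions = cpu_fractions(SAMPLE_BEFORE, SAMPLE_AFTER)
print("iowait:", fractions["iowait"], "user:", fractions["user"])
```

The counters are cumulative, so only the difference between two snapshots says anything about the interval being examined.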
  • 7. MMS Metrics: Lock Percent and Queues. Figure: lock fluctuating with daily load (left) and corresponding queues (right). Lock Percent and Queues tend to go hand in hand: the longer locking operations take, the more other operations wait on them. The formation of locks and queues isn’t necessarily cause for alarm in a healthy system, but they are very good severity indicators when you and your app already know things are slow.
  • 8. MMS Metrics: Lock Percent and Queues. Figure: lock fluctuating with daily load (left) and corresponding queues (right).
  • 9. Log File Metrics. In addition to the key MMS metrics, we also use MongoDB log files for analysis. The log fields discussed here are nscanned, scanAndOrder, and nmoved.
  • 10. Log File Metrics. In addition to the key MMS metrics, we also analyze MongoDB log files. The fields discussed here are nscanned, scanAndOrder, and nmoved.
  • 11. Log File Metrics: nscanned. 1. The number of objects the database examines to service a query. This counts either documents or index entries, any of which may need to be obtained from disk. 2. It is a good measure of the cost of an operation in terms of I/O, object manipulation, and memory use. 3. Ideally nscanned is not larger than nreturned, the number of results returned. Also compare nscanned to the size of the collection. If nscanned is large, it is usually because an index is not being used or because the index is not selective enough. 4. Moving large numbers of documents into memory for these operations is a common cause of page faults and CPU IOWait.
  • 12. Log File Metrics: scanAndOrder. This is true when MongoDB performs an in-memory ordering operation to satisfy the sort/orderby component of a query. For best performance, all sort/orderby clauses should be satisfied by indexes. Sorting results at query time can be inefficient (compared with using an index that already contains documents in the desired sort order), and can even block other operations if done while holding the write lock, as with an update or findAndModify. Because sorting requires computation, elevated CPU User time can be expected.
  • 13. Log File Metrics: nmoved. Indicates the number of documents that moved on disk during the operation. The higher nmoved is, the more expensive the operation. This is because when moving a document, the new document location must be in memory, the document must be copied, the old document’s location must be cleared, and index entries must be updated to point to the new location.
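As a rough sketch of how the three log fields from slides 11 to 13 might be pulled out of a slow-operation log line, the Python below uses a regular expression over `name:value` pairs. The two sample lines are made up and assume the `key:value` layout of mongod logs from this era; the function names and the 10x scan-to-return threshold are hypothetical choices for illustration only.

```python
import re

# Matches the log fields discussed above when written as "name:value".
FIELD_RE = re.compile(r"\b(nscanned|nreturned|scanAndOrder|nmoved):(\d+)")
# Operation duration at the end of the line, e.g. "812ms".
DURATION_RE = re.compile(r"(\d+)ms\s*$")


def parse_log_metrics(line):
    """Extract nscanned, nreturned, scanAndOrder, nmoved and duration if present."""
    metrics = {name: int(value) for name, value in FIELD_RE.findall(line)}
    duration = DURATION_RE.search(line)
    if duration:
        metrics["millis"] = int(duration.group(1))
    return metrics


def flag_operation(metrics, scan_ratio_limit=10):
    """Return human-readable warnings; scan_ratio_limit is an assumed threshold."""
    flags = []
    nscanned = metrics.get("nscanned", 0)
    nreturned = metrics.get("nreturned", 0)
    if nscanned > scan_ratio_limit * max(nreturned, 1):
        flags.append("high nscanned relative to nreturned")
    if metrics.get("scanAndOrder"):
        flags.append("in-memory sort (scanAndOrder)")
    if metrics.get("nmoved", 0) > 0:
        flags.append("documents moved on disk (nmoved)")
    return flags


# Made-up log lines for illustration only.
SAMPLE_QUERY = (
    'Thu Mar 13 10:15:02.123 [conn42] query shop.orders query: { status: "open" } '
    "ntoreturn:0 ntoskip:0 nscanned:250000 scanAndOrder:1 keyUpdates:0 "
    "locks(micros) r:812345 nreturned:25 reslen:4096 812ms"
)
SAMPLE_UPDATE = (
    "Thu Mar 13 10:15:03.456 [conn43] update shop.orders query: { _id: 7 } "
    "update: { $push: { items: 1 } } nscanned:1 nmoved:1 nupdated:1 keyUpdates:0 "
    "locks(micros) w:1532 15ms"
)

for sample in (SAMPLE_QUERY, SAMPLE_UPDATE):
    print(flag_operation(parse_log_metrics(sample)))
```

Running a filter like this over the time ranges where locks and queues are elevated narrows down which operations deserve an index or a data-model change.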
  • 14. Improving Performance. Correctly determining whether one of these metrics is high or low depends on their values relative to each other. Here are some rules of thumb. • PF/OP is high if it is greater than 0.25 (indicating that your instance is page-faulting for roughly 25% of operations). • CPU IOWait or CPU User is high if it is greater than 50%, or if it is more than twice as large as the next highest CPU metric. There are eight possible overall assessments given these three metrics.
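The rules of thumb on slide 14 can be sketched as a small classifier. Several things here are assumptions rather than statements from the presentation: the function names, the reading of "next highest CPU metric" as the largest of the other non-idle CPU percentages supplied, and the mapping of each high/low combination (in the order PF/OP, IOWait, User) to a case name. The slides only spell out HHL as RAM/Disk-Bound and LLL as the target state.

```python
# Keys are (pf_op_high, iowait_high, user_high); names follow the slide titles.
# Only HHL and LLL are named explicitly in the slides; the rest are inferred.
CASES = {
    (False, False, False): "Unbound",
    (False, False, True): "CPU-Bound",
    (False, True, False): "Disk-Bound",
    (False, True, True): "CPU/Disk-Bound",
    (True, False, False): "RAM-Bound",
    (True, False, True): "RAM/CPU-Bound",
    (True, True, False): "RAM/Disk-Bound",
    (True, True, True): "Bound All Around",
}


def cpu_metric_is_high(name, cpu_percent):
    """High if above 50%, or more than twice the largest other CPU metric."""
    value = cpu_percent[name]
    others = [v for key, v in cpu_percent.items() if key != name]
    next_highest = max(others, default=0.0)
    return value > 50.0 or value > 2 * next_highest


def assess(pf_op, cpu_percent):
    """Classify an instance from its PF/OP ratio and non-idle CPU percentages."""
    key = (
        pf_op > 0.25,
        cpu_metric_is_high("iowait", cpu_percent),
        cpu_metric_is_high("user", cpu_percent),
    )
    return CASES[key]


# Illustrative readings only.
print(assess(0.001, {"user": 10, "iowait": 5, "system": 8}))
print(assess(0.5, {"user": 5, "iowait": 60, "system": 5}))
print(assess(0.01, {"user": 70, "iowait": 2, "system": 5}))
```

Because the second CPU rule is relative, a lightly loaded machine can still report User as "high"; the CPU-Bound slide notes that this is somewhat expected whenever the CPU is not idle.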
  • 15. Improving Performance: Unbound. This means that either this database isn’t being used, or it is operating optimally. Instead of relaxing, consider creating test data and a script to load-test your queries.
  • 16. Improving Performance: CPU-Bound. In this case, page faults are not severe enough to prompt any appreciable IOWait. As long as the CPU isn’t idle, it’s somewhat expected that User time will appear the highest. If locking and queueing are occurring, you’re probably seeing in-memory table scans and scanAndOrder operations that are not bottlenecking due to an abundance of RAM or a lack of load. Run through the following steps: • Check your logs for operations with high nscanned or scanAndOrder during periods of high lock/queue, and index accordingly. • Check your queries for CPU-intensive operators like $all, $push/$pop/$addToSet, as well as updates to large documents, and especially updates to documents with large arrays (or large subdocument arrays). • Obtain more and/or faster CPU cores. Importantly, if your database is write-heavy, keep in mind that only one CPU per database can write at a time (owing to that thread holding the write lock).
  • 17. Improving Performance: Disk-Bound. If you’re experiencing trouble, or if this sort of activity is coming in bursts, then getting the most out of your current disk is about finding the source of IOWait. So, even though the ratio of PF/OP is low, we’re interested in actual page faults, which could still put a noticeable drag on performance. Let’s look at some sources of page faults: • Page faults from insert/update operations: the write load to a database results in page faults as fresh extents are brought into memory to handle new or moved documents. Adding RAM will not alleviate this cost, so consider faster disks, like SSDs, or sharding. • Page faults from a shifting working set: these are the page faults that can be prevented with additional RAM or with more efficient indexing. • Page faults from MongoDB’s internal operations: these are few in number and cannot be prevented.
  • 18. Improving Performance: Disk-Bound, continued. If you’ve accounted for the majority of your page faults, the remaining IOWait in the system should be coming from flushing the journal file and the background synchronization of memory-mapped data to disk. Check writeToDataFilesMB on the Journal Stats graph and the Background Flush Avg graph to see if spikes match the elevated IOWait. Upgrade your disks and/or look into putting your journal on a separate volume from your data. You can also separate databases using the directoryperdb mongod option. The last resort before sharding should be verifying that your data model is optimized for writes. Check for collections with a large number of indexes, especially multi-key indexes, and look for alternatives. Check your logs for the “nmoved” flag. If you find a lot of documents are being moved during updates, examine the usePowerOf2Sizes setting for those collections. When it comes to write-friendly data models, look for anything that reduces the number of pages (index or data) that need to be modified when documents are inserted or updated.
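To illustrate why slide 18 pairs nmoved with usePowerOf2Sizes, the toy model below counts how often a growing document outgrows its allocated slot under two allocation policies. It is a deliberately simplified sketch, not MongoDB's allocator: record headers and padding factors are ignored, the 32-byte minimum is an assumption, and the "exact fit" policy exists only as a worst-case comparison.

```python
def power_of_2_allocation(record_bytes, minimum=32):
    """Round a record size up to the next power of two (simplified model)."""
    size = minimum
    while size < record_bytes:
        size *= 2
    return size


def exact_fit_allocation(record_bytes):
    """Worst-case comparison policy: allocate exactly what the record needs."""
    return record_bytes


def count_moves(initial_bytes, growth_steps, allocate):
    """Count how many times a growing document no longer fits and must move."""
    size = initial_bytes
    allocated = allocate(size)
    moves = 0
    for extra in growth_steps:
        size += extra
        if size > allocated:
            moves += 1  # document is relocated to a larger slot
            allocated = allocate(size)
    return moves


# A 100-byte document that grows by 20 bytes ten times (made-up workload).
growth = [20] * 10
print("power of 2:", count_moves(100, growth, power_of_2_allocation))
print("exact fit: ", count_moves(100, growth, exact_fit_allocation))
```

In this toy workload the power-of-two policy relocates the document far less often than exact-fit allocation, which is the effect the slide is pointing at when documents grow through updates.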
  • 19. Improving Performance: CPU/Disk-Bound. Treat elevated IOWait and User times as independently high, and consider each of the previous two sections. We don’t see this combination a lot, but it is definitely possible, such as in the case of a high-RAM, write-heavy cluster that is missing an index. If you are experiencing locks and queues, use them to guide the time ranges you focus on.
  • 20. Improving Performance: RAM-Bound. If you are running high PF/OP without CPU IOWait, then one of the following is likely: • your operation counts are relatively low (indicating little to no operational activity). • you have fantastic disks that are covering up for what would otherwise be a “RAM-Bound and Disk-Bound” (HHL) case. • you just started the database and are “warming” your cache. Wait until memory is fully utilized and reassess. If you are experiencing locks and queues, immediately evaluate your indexes to make sure missing indexes are not prompting the need to page fault. Check your logs during periods of high page faulting. Focus on operations with high nscanned values. Increase RAM if necessary, and you should arrive at the coveted LLL case. If you are not locking/queueing, this is technically a sustainable state, but be aware that any growth in operation volume or data size (or even a barely quantifiable shift in your working data set) could lead to increasing IOWait, locks, and queues.
  • 21. Improving Performance: RAM/CPU-Bound. This scenario is similar to the RAM-Bound case above, except that high CPU User activity may be covering up for what would otherwise be a high IOWait. Tackle the CPU portion first, then move towards page faults, in the following steps: 1. Look for nscanned and scanAndOrder activity. On high-write clusters, consider the sheer number of indexes in your write-collections. Maintaining indexes during inserts and updates can be responsible for this, as can other CPU-intensive operators like $all, $push/$pop/$addToSet, updates to large documents, and updates to documents with large arrays (or large subdocument arrays). 2. Once the CPU User time has decreased, reconsider your case. If you have moved into HHL territory, go to that section. If your CPU User time is still high, beef up your CPU if possible. 3. If your PF/OP is still high, be prepared to move to faster disks as soon as locks and queues begin to occur. 4. Finally, consider sharding.
  • 22. Improving Performance: RAM/Disk-Bound. This is possibly the most common problem case, and the solution is fairly straightforward compared to the other cases we’ve dealt with. Your activity is disk-heavy. Index to reduce nscanned, or add RAM, either by scaling your machines vertically or by scaling horizontally via sharding.
  • 23. Improving Performance: Bound All Around. The solution to this protracted situation depends on which of these three metrics is highest. In general, we recommend tackling this as the “RAM/Disk-Bound” (HHL) case first, because lack of RAM is relatively easy to diagnose and many of the solutions for high CPU IOWait can contribute to CPU User improvements. So, check your logs for scanAndOrder and high nscanned numbers, and then add indexes to reduce them. Increase RAM if IOWait continues, and then re-evaluate your case. If you still end up here, and if your locks and queues are still a problem, consider increasing your hardware capacity. It could be premature to consider sharding straight from this highly protracted case. Ideally, attempt a resolution that brings you to another case that we’ve explored, and consider sharding in that context.
  • 24. Enteros UpBeat High Load Capture: database root cause and spike analysis for multi-tiered applications. Enteros UpBeat High Load Capture is a software framework for database problem root cause analysis of Oracle, DB2, SQL Server, MySQL, Sybase and MongoDB database-centric multi-tiered applications. The High Load Capture user interface visually correlates performance and system load metrics across multiple IT production infrastructure layers. With second-by-second granularity of data analysis, High Load Capture makes analysis possible for the most transient database performance spikes. Features: • Multi-threaded, high-precision performance collection engine • Extensible, dynamically configurable, centrally controlled collection agents • Comprehensive library of collector agents • Cross-tier correlation • Safe, secure agent communication • Load-sensitive collection controller
  • 25. Enteros UpBeat High Load Capture: supported infrastructure, database, application server, and OS monitoring. Database Server OS: Linux, Sun Solaris, HP/UX, AIX, Windows Server. Client OS: Windows, Linux. Database: Oracle, Microsoft SQL, IBM DB2, MySQL, Sybase, MongoDB. Application Server: Oracle (BEA) WebLogic, Oracle OAS, JBOSS, IBM WAS.
  • 26. Enteros, Inc. http://www.enteros.com Enteros is an innovative software company specializing in Performance Management and Load Testing Software for Production Databases (RDBMS and NoSQL/Big Data). Enteros solutions enable IT professionals to identify and remediate performance problems in business-critical databases with unprecedented speed, accuracy and scope. Ron Warshawsky; ron@enteros.com 408-207-8408