#MongoDBDays 
Mythbusting: Understanding 
How We Measure the 
Performance of MongoDB 
Alvin Richards 
Senior Director of Performance Engineering, MongoDB
Before we start… 
• We are going to look a lot at 
– C++ kernel code 
– Java benchmarks 
– JavaScript tests 
• And lots of charts 
• And it's going to be awesome!
Measuring "Performance" 
https://www.youtube.com/watch?v=7wm-pZp_mi0
Benchmarking 
• Some common traps 
• Performance measurement & diagnosis 
• What's next
Part One 
Some Common Traps
"We all live in a house on fire, no fire department 
to call; no way out, just the upstairs window to 
look out of while the fire burns the house down 
with us trapped, locked in it." 
The Milk Train Doesn't Stop Here Anymore 
Tennessee Williams
#1 Time taken to Insert x 
Documents 
long startTime = System.currentTimeMillis(); 
for (int roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
BasicDBObject doc = new BasicDBObject(); 
doc.put("_id",id); 
doc.put("k",rand.nextInt(numMaxInserts)+1); 
String cVal = "…";
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i]=doc; 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
globalInserts.addAndGet(documentsPerInsert); 
} 
long endTime = System.currentTimeMillis();
So that looks ok, right? 
long startTime = System.currentTimeMillis(); 
for (int roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
BasicDBObject doc = new BasicDBObject(); 
doc.put("_id",id); 
doc.put("k",rand.nextInt(numMaxInserts)+1); 
String cVal = "…";
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i]=doc; 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
globalInserts.addAndGet(documentsPerInsert); 
} 
long endTime = System.currentTimeMillis();
What else are you measuring?
long startTime = System.currentTimeMillis(); 
for (int roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
BasicDBObject doc = new BasicDBObject(); 
doc.put("_id",id); 
doc.put("k",rand.nextInt(numMaxInserts)+1); 
String cVal = "…";
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i]=doc; 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
globalInserts.addAndGet(documentsPerInsert); 
} 
long endTime = System.currentTimeMillis(); 
Object creation and GC 
management?
What else are you measuring?
long startTime = System.currentTimeMillis(); 
for (int roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
BasicDBObject doc = new BasicDBObject(); 
doc.put("_id",id); 
doc.put("k",rand.nextInt(numMaxInserts)+1); 
String cVal = "…";
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i]=doc; 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
globalInserts.addAndGet(documentsPerInsert); 
} 
long endTime = System.currentTimeMillis(); 
Object creation and GC 
management? 
Thread contention on 
nextInt()?
What else are you measuring?
long startTime = System.currentTimeMillis(); 
for (int roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
BasicDBObject doc = new BasicDBObject(); 
doc.put("_id",id); 
doc.put("k",rand.nextInt(numMaxInserts)+1); 
String cVal = "…";
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i]=doc; 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
globalInserts.addAndGet(documentsPerInsert); 
} 
long endTime = System.currentTimeMillis(); 
Object creation and GC 
management? 
Thread contention on 
nextInt()? 
Time to synthesize data?
What else are you measuring?
long startTime = System.currentTimeMillis(); 
for (int roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
BasicDBObject doc = new BasicDBObject(); 
doc.put("_id",id); 
doc.put("k",rand.nextInt(numMaxInserts)+1); 
String cVal = "…";
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i]=doc; 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
globalInserts.addAndGet(documentsPerInsert); 
} 
long endTime = System.currentTimeMillis(); 
Object creation and GC 
management? 
Thread contention on 
nextInt()? 
Time to synthesize data? 
Thread contention on 
addAndGet()?
What else are you measuring?
long startTime = System.currentTimeMillis(); 
for (int roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
BasicDBObject doc = new BasicDBObject(); 
doc.put("_id",id); 
doc.put("k",rand.nextInt(numMaxInserts)+1); 
String cVal = "…";
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i]=doc; 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
globalInserts.addAndGet(documentsPerInsert); 
} 
long endTime = System.currentTimeMillis(); 
Object creation and GC 
management? 
Thread contention on 
nextInt()? 
Time to synthesize data? 
Thread contention on 
addAndGet()? 
Clock resolution?
Solution: Pre-Create the objects 
// Pre Create the Object outside the Loop 
BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert]; 
for (int i=0; i < documentsPerInsert; i++) { 
BasicDBObject doc = new BasicDBObject(); 
String cVal = "…"; 
doc.put("c",cVal); 
String padVal = "…"; 
doc.put("pad",padVal); 
aDocs[i] = doc; 
} 
Pre-create non-varying
data outside the timing 
loop 
Alternative 
• Pre-create the data in a file; load from file
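The file-based alternative can be sketched with the same legacy Java driver used above; the file name and the newline-delimited JSON layout are assumptions for illustration, not part of the original benchmark:

import java.io.BufferedReader;
import java.io.FileReader;
import com.mongodb.BasicDBObject;
import com.mongodb.util.JSON;

// Read documents that were generated ahead of time (one JSON document per line),
// so no synthesis work happens inside the timed section.
BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert];
BufferedReader reader = new BufferedReader(new FileReader("preCreatedDocs.json")); // hypothetical file
for (int i = 0; i < documentsPerInsert; i++) {
    aDocs[i] = (BasicDBObject) JSON.parse(reader.readLine());
}
reader.close();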
Solution: Remove contention 
// Use ThreadLocalRandom generator or an instance of java.util.Random per thread 
java.util.concurrent.ThreadLocalRandom rand; 
for (long roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
doc = aDocs[i]; 
doc.put("_id",id); 
doc.put("k", nextInt(rand, numMaxInserts)+1); 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
} 
// Maintain count outside the loop 
globalInserts.addAndGet(documentsPerInsert * numRounds); 
Remove contention on nextInt() by making the generator thread-local
Solution: Remove contention 
// Use ThreadLocalRandom generator or an instance of java.util.Random per thread 
java.util.concurrent.ThreadLocalRandom rand; 
Remove contention on nextInt() by making the generator thread-local
for (long roundNum = 0; roundNum < numRounds; roundNum++) { 
for (int i = 0; i < documentsPerInsert; i++) { 
id++; 
doc = aDocs[i]; 
doc.put("_id",id); 
doc.put("k", nextInt(rand, numMaxInserts)+1); 
} 
coll.insert(aDocs); 
numInserts += documentsPerInsert; 
} 
// Maintain count outside the loop 
globalInserts.addAndGet(documentsPerInsert * numRounds); 
Remove contention on 
addAndGet()
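The nextInt(rand, …) helper itself is not shown on the slide; one plausible shape (an assumption that simply mirrors the slide's call site) delegates to the calling thread's own generator:

import java.util.concurrent.ThreadLocalRandom;

// ThreadLocalRandom cannot be constructed directly; current() returns the
// generator owned by the calling thread, so worker threads never share
// random-number state and never contend on it.
static int nextInt(ThreadLocalRandom rand, int bound) {
    // the rand parameter only mirrors the call site above
    return ThreadLocalRandom.current().nextInt(bound);
}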
Solution: Timer resolution 
long startTime = System.currentTimeMillis(); 
… 
long endTime = System.currentTimeMillis(); 
long startTime = System.nanoTime(); 
… 
long endTime = System.nanoTime() - startTime; 
"granularity of the value 
depends on the 
underlying operating 
system and may be 
larger" 
"resolution is at least as 
good as that of 
currentTimeMillis()" 
Source 
• http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
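A minimal sketch of reporting with nanoTime(): the values are only meaningful as differences, so convert the elapsed delta rather than treating either endpoint as a wall-clock timestamp.

import java.util.concurrent.TimeUnit;

long startTime = System.nanoTime();
// ... run the operations being measured ...
long elapsedNanos = System.nanoTime() - startTime;
// Convert only for reporting; keep the raw nanoseconds for any arithmetic.
long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(elapsedNanos);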
General Principle #1 
Know what you are 
measuring
#2 Response time to return all 
results 
BasicDBObject doc = new BasicDBObject(); 
doc.put("v", str); // str is a 2k string 
for (int i=0; i < 1000; i++) { 
doc.put("_id",i); coll.insert(doc); 
} 
BasicDBObject predicate = new BasicDBObject(); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis();
So that looks ok, right? 
BasicDBObject doc = new BasicDBObject(); 
doc.put("v", str); // str is a 2k string 
for (int i=0; i < 1000; i++) { 
doc.put("_id",i); coll.insert(doc); 
} 
BasicDBObject predicate = new BasicDBObject(); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis();
What else are you measuring?
BasicDBObject doc = new BasicDBObject(); 
doc.put("v", str); // str is a 2k string 
for (int i=0; i < 1000; i++) { 
doc.put("_id",i); coll.insert(doc); 
} 
BasicDBObject predicate = new BasicDBObject(); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis(); 
Each doc is 4080 bytes 
on disk with powerOf2Sizes
What else are you measuring?
BasicDBObject doc = new BasicDBObject(); 
doc.put("v", str); // str is a 2k string 
for (int i=0; i < 1000; i++) { 
doc.put("_id",i); coll.insert(doc); 
} 
BasicDBObject predicate = new BasicDBObject(); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis(); 
Each doc is 4080 bytes 
on disk with powerOf2Sizes 
Unrestricted predicate?
What else are you measuring?
BasicDBObject doc = new BasicDBObject(); 
doc.put("v", str); // str is a 2k string 
for (int i=0; i < 1000; i++) { 
doc.put("_id",i); coll.insert(doc); 
} 
BasicDBObject predicate = new BasicDBObject(); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis(); 
Each doc is 4080 bytes 
on disk with powerOf2Sizes 
Unrestricted predicate? 
Measuring 
• Time to parse & 
execute query 
• Time to retrieve all 
documents 
But also 
• Cost of shipping ~4MB 
data through network 
stack
Solution: Limit the projection 
BasicDBObject predicate = new BasicDBObject(); 
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); 
BasicDBObject projection = new BasicDBObject(); 
projection.put("_id", 1); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate, projection ); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis(); 
Return fixed range
Solution: Limit the projection 
BasicDBObject predicate = new BasicDBObject(); 
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); 
BasicDBObject projection = new BasicDBObject(); 
projection.put("_id", 1); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate, projection ); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis(); 
Return fixed range 
Only project _id
Solution: Limit the projection 
BasicDBObject predicate = new BasicDBObject(); 
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); 
BasicDBObject projection = new BasicDBObject(); 
projection.put("_id", 1); 
long startTime = System.currentTimeMillis(); 
DBCursor cur = coll.find(predicate, projection ); 
DBObject foundObj; 
while (cur.hasNext()) { 
foundObj = cur.next(); 
} 
long endTime = System.currentTimeMillis(); 
Return fixed range 
Only project _id 
Only 46k transferred 
through network stack
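One way to confirm the restricted benchmark really touches only the intended range is to explain the same cursor outside the timed section. A sketch using the legacy driver (the exact explain output fields vary by server version):

// Build an identical cursor and ask the server for its plan; in 2.x-era
// output, fields such as "cursor" and "nscanned" show whether the _id
// index range was used and how many entries were examined.
DBObject plan = coll.find(predicate, projection).explain();
System.out.println(plan);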
General Principle #2 
Measure only what you 
need to measure
Part Two 
Performance 
measurement & 
diagnosis
"Every experiment destroys some of the 
knowledge of the system which was obtained by 
previous experiments." 
The Physical Principles of the Quantum Theory (1930) 
Werner Heisenberg
Broad categories 
• Micro Benchmarks 
• Workloads
Micro benchmarks: mongo-perf
mongo-perf: goals 
• Measure 
– commands 
• Configure 
– Single mongod, ReplSet size (1 -> n), Sharding 
– Single vs. Multiple DB 
– O/S 
• Characterize 
– Throughput by thread count 
• Compare
What do you get? 
Better
What do you get? 
Measured 
improvement 
between rc0 and 
rc2 
Better
Benchmark source code 
tests.push( { name: "Commands.CountsIntIDRange", 
pre: function( collection ) { 
collection.drop(); 
for ( var i = 0; i < 1000; i++ ) { 
collection.insert( { _id : i } ); 
} 
collection.getDB().getLastError(); 
}, 
ops: [ 
{ op: "command", 
ns : "testdb", 
command : { count : "mycollection", 
query : { _id : { "$gt" : 10, "$lt" : 100 } } } } 
] } );
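For comparison with the Java snippets earlier, the same count could be issued through the legacy Java driver roughly as follows (a sketch; the collection variable and bounds mirror the mongo-perf test above):

// Count documents whose _id lies strictly between 10 and 100, matching
// the query in the Commands.CountsIntIDRange benchmark.
BasicDBObject query = new BasicDBObject("_id",
        new BasicDBObject("$gt", 10).append("$lt", 100));
long matching = coll.count(query);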
Code Change
Workloads 
• "public" workloads 
– YCSB 
– Sysbench 
• "real world" simulations 
– Inbox fan in/out 
– Message Stores 
– Content Management
Example: Bulk Load Performance 
16m Documents 
Better 
55% degradation 
2.6.0-rc1 vs 2.4.10
Ouch… where's the tree in the 
woods? 
• 2.4.10 -> 2.6.0 
– 4495 git commits
git-bisect 
• Bisect between good/bad hashes 
• git-bisect nominates a new githash 
– Build against githash 
– Re-run test 
– Confirm if this githash is good/bad 
• Rinse and repeat
Code Change - Bad Githash
Code Change - Fix
Bulk Load Performance - Fix 
Better 
11% improvement 
2.6.1 vs 2.4.10
The problem with measurement 
• Observability 
– What can you observe on the system? 
• Effect 
– What effects does the observation cause?
mtools
mtools 
• MongoDB log file analysis 
– Filter logs for operations, events 
– Response time, lock durations 
– Plot 
• https://github.com/rueckstiess/mtools
Response Times > 100ms 
Bulk Insert 2.6.0-rc0 
(Chart axes: Ops/Sec vs. Time)
Response Times > 100ms 
Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2 
Floor raised
Code Change – Yielding Policy
Code Change
Response Times 
Bulk Insert 2.6.0 vs 2.6.1 
Ceiling similar, lower floor 
resulting in 40% 
improvement in throughput
Secondary effects of Yield policy change 
Write lock time reduced 
Order of magnitude reduction 
of write lock duration
Unexpected side effects of 
measurement? 
> db.serverStatus() 
Yes – will cause a read lock to be acquired 
> db.serverStatus({recordStats:0}) 
No – lock is not acquired 
> mongostat 
Yes - until SERVER-14008 resolved, uses db.serverStatus()
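If your own monitoring code polls serverStatus through a driver, the same flag applies. A sketch with the legacy Java driver (the shell helper above sends the equivalent command document; mongoClient is assumed to be an existing MongoClient):

// Ask for serverStatus without recordStats so the call does not take a
// read lock on the server.
DB db = mongoClient.getDB("admin");
CommandResult status = db.command(
        new BasicDBObject("serverStatus", 1).append("recordStats", 0));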
CPU sampling 
• Get an impression of 
– Call Graphs 
– CPU time spent on node and called nodes
Setup & building with google-profiler 
> sudo apt-get install google-perftools 
> sudo apt-get install libunwind7-dev 
> scons --use-cpu-profiler mongod
Start the profiling 
> mongod --dbpath <…> 
Note: Do not use --fork 
> mongo 
> use admin 
> db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}}) 
Execute some commands that you want to profile 
> db.runCommand({_cpuProfilerStop: 1})
Sample start vs. end of workload
Sample start vs. end of workload
Code change
Public Benchmarks – Not all forks are 
the same… 
• YCSB 
– https://github.com/achille/YCSB 
• sysbench-mongodb 
– https://github.com/mdcallag/sysbench-mongodb
Part Three 
And next?
"The future sucks. Change it." 
"I'm way cool Beavis, but I cannot change the 
future." 
Beavis & Butthead
What we are working on 
• mongo-perf 
– UI refactor 
– Adding more micro benchmarks (geo, sharding) 
• Workloads 
– Adding external benchmarks 
– Creating benchmarks for common use cases 
• Inbox fan in/out 
• Analytical dashboards 
• Stream / Feeds 
• Customers, Partners & Community
Here's how you can help change the 
future! 
• Got a great workload? Great benchmark? 
• Want to donate it? 
• alvin@mongodb.com
Don't be that benchmark… 
#1 Know what you are measuring 
#2 Measure only what you need to 
measure
#MongoDBDays 
Thank You 
Alvin Richards 
alvin@mongodb.com / @jonnyeight 
Senior Director of Performance Engineering, MongoDB

More Related Content

What's hot

Optimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and CreativityOptimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and Creativity
MongoDB
 
Indexing and Query Optimization
Indexing and Query OptimizationIndexing and Query Optimization
Indexing and Query Optimization
MongoDB
 
Operational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB WebinarOperational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB Webinar
MongoDB
 
Indexing & Query Optimization
Indexing & Query OptimizationIndexing & Query Optimization
Indexing & Query Optimization
MongoDB
 
Morphia: Simplifying Persistence for Java and MongoDB
Morphia:  Simplifying Persistence for Java and MongoDBMorphia:  Simplifying Persistence for Java and MongoDB
Morphia: Simplifying Persistence for Java and MongoDB
Jeff Yemin
 
Java Persistence Frameworks for MongoDB
Java Persistence Frameworks for MongoDBJava Persistence Frameworks for MongoDB
Java Persistence Frameworks for MongoDB
MongoDB
 
MongoDB Indexing Constraints and Creative Schemas
MongoDB Indexing Constraints and Creative SchemasMongoDB Indexing Constraints and Creative Schemas
MongoDB Indexing Constraints and Creative Schemas
MongoDB
 

What's hot (19)

Optimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and CreativityOptimizing Slow Queries with Indexes and Creativity
Optimizing Slow Queries with Indexes and Creativity
 
Indexing
IndexingIndexing
Indexing
 
Indexing and Query Optimization
Indexing and Query OptimizationIndexing and Query Optimization
Indexing and Query Optimization
 
Operational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB WebinarOperational Intelligence with MongoDB Webinar
Operational Intelligence with MongoDB Webinar
 
Getting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJSGetting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJS
 
Indexing & Query Optimization
Indexing & Query OptimizationIndexing & Query Optimization
Indexing & Query Optimization
 
MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know
 
Back to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkBack to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation Framework
 
MongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB PerformanceMongoDB Europe 2016 - Debugging MongoDB Performance
MongoDB Europe 2016 - Debugging MongoDB Performance
 
Map/Confused? A practical approach to Map/Reduce with MongoDB
Map/Confused? A practical approach to Map/Reduce with MongoDBMap/Confused? A practical approach to Map/Reduce with MongoDB
Map/Confused? A practical approach to Map/Reduce with MongoDB
 
Morphia: Simplifying Persistence for Java and MongoDB
Morphia:  Simplifying Persistence for Java and MongoDBMorphia:  Simplifying Persistence for Java and MongoDB
Morphia: Simplifying Persistence for Java and MongoDB
 
Indexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleIndexing Strategies to Help You Scale
Indexing Strategies to Help You Scale
 
Elastic search 검색
Elastic search 검색Elastic search 검색
Elastic search 검색
 
Java Persistence Frameworks for MongoDB
Java Persistence Frameworks for MongoDBJava Persistence Frameworks for MongoDB
Java Persistence Frameworks for MongoDB
 
MySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsMySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of Things
 
MongoDB Indexing Constraints and Creative Schemas
MongoDB Indexing Constraints and Creative SchemasMongoDB Indexing Constraints and Creative Schemas
MongoDB Indexing Constraints and Creative Schemas
 
Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDB
 
Webinaire 2 de la série « Retour aux fondamentaux » : Votre première applicat...
Webinaire 2 de la série « Retour aux fondamentaux » : Votre première applicat...Webinaire 2 de la série « Retour aux fondamentaux » : Votre première applicat...
Webinaire 2 de la série « Retour aux fondamentaux » : Votre première applicat...
 
Choosing a Shard key
Choosing a Shard keyChoosing a Shard key
Choosing a Shard key
 

Similar to Mythbusting: Understanding How We Measure the Performance of MongoDB

Mythbusting: Understanding How We Measure the Performance of MongoDB
Mythbusting: Understanding How We Measure the Performance of MongoDBMythbusting: Understanding How We Measure the Performance of MongoDB
Mythbusting: Understanding How We Measure the Performance of MongoDB
MongoDB
 
Implement a function in c++ which takes in a vector of integers and .pdf
Implement a function in c++ which takes in a vector of integers and .pdfImplement a function in c++ which takes in a vector of integers and .pdf
Implement a function in c++ which takes in a vector of integers and .pdf
feelingspaldi
 
DoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdf
DoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdfDoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdf
DoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdf
aathiauto
 
Questions has 4 parts.1st part Program to implement sorting algor.pdf
Questions has 4 parts.1st part Program to implement sorting algor.pdfQuestions has 4 parts.1st part Program to implement sorting algor.pdf
Questions has 4 parts.1st part Program to implement sorting algor.pdf
apexelectronices01
 
Deep dumpster diving 2010
Deep dumpster diving 2010Deep dumpster diving 2010
Deep dumpster diving 2010
RonnBlack
 

Similar to Mythbusting: Understanding How We Measure the Performance of MongoDB (20)

Mythbusting: Understanding How We Measure the Performance of MongoDB
Mythbusting: Understanding How We Measure the Performance of MongoDBMythbusting: Understanding How We Measure the Performance of MongoDB
Mythbusting: Understanding How We Measure the Performance of MongoDB
 
Mythbusting: Understanding How We Measure the Performance of MongoDB
Mythbusting: Understanding How We Measure the Performance of MongoDBMythbusting: Understanding How We Measure the Performance of MongoDB
Mythbusting: Understanding How We Measure the Performance of MongoDB
 
Mythbusting: Understanding How We Measure Performance at MongoDB
Mythbusting: Understanding How We Measure Performance at MongoDBMythbusting: Understanding How We Measure Performance at MongoDB
Mythbusting: Understanding How We Measure Performance at MongoDB
 
Implement a function in c++ which takes in a vector of integers and .pdf
Implement a function in c++ which takes in a vector of integers and .pdfImplement a function in c++ which takes in a vector of integers and .pdf
Implement a function in c++ which takes in a vector of integers and .pdf
 
Ac2
Ac2Ac2
Ac2
 
DoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdf
DoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdfDoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdf
DoublyList-cpp- #include -DoublyList-h- using namespace std- void Doub.pdf
 
Look Mommy, No GC! (TechDays NL 2017)
Look Mommy, No GC! (TechDays NL 2017)Look Mommy, No GC! (TechDays NL 2017)
Look Mommy, No GC! (TechDays NL 2017)
 
C++ practical
C++ practicalC++ practical
C++ practical
 
Computer Science Practical Science C++ with SQL commands
Computer Science Practical Science C++ with SQL commandsComputer Science Practical Science C++ with SQL commands
Computer Science Practical Science C++ with SQL commands
 
The Ring programming language version 1.10 book - Part 50 of 212
The Ring programming language version 1.10 book - Part 50 of 212The Ring programming language version 1.10 book - Part 50 of 212
The Ring programming language version 1.10 book - Part 50 of 212
 
Google apps script
Google apps scriptGoogle apps script
Google apps script
 
The Ring programming language version 1.4.1 book - Part 13 of 31
The Ring programming language version 1.4.1 book - Part 13 of 31The Ring programming language version 1.4.1 book - Part 13 of 31
The Ring programming language version 1.4.1 book - Part 13 of 31
 
Questions has 4 parts.1st part Program to implement sorting algor.pdf
Questions has 4 parts.1st part Program to implement sorting algor.pdfQuestions has 4 parts.1st part Program to implement sorting algor.pdf
Questions has 4 parts.1st part Program to implement sorting algor.pdf
 
The Ring programming language version 1.7 book - Part 48 of 196
The Ring programming language version 1.7 book - Part 48 of 196The Ring programming language version 1.7 book - Part 48 of 196
The Ring programming language version 1.7 book - Part 48 of 196
 
Cnam azure 2014 mobile services
Cnam azure 2014   mobile servicesCnam azure 2014   mobile services
Cnam azure 2014 mobile services
 
greenDAO
greenDAOgreenDAO
greenDAO
 
The Ring programming language version 1.9 book - Part 53 of 210
The Ring programming language version 1.9 book - Part 53 of 210The Ring programming language version 1.9 book - Part 53 of 210
The Ring programming language version 1.9 book - Part 53 of 210
 
The Ring programming language version 1.9 book - Part 21 of 210
The Ring programming language version 1.9 book - Part 21 of 210The Ring programming language version 1.9 book - Part 21 of 210
The Ring programming language version 1.9 book - Part 21 of 210
 
Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#
 
Deep dumpster diving 2010
Deep dumpster diving 2010Deep dumpster diving 2010
Deep dumpster diving 2010
 

More from MongoDB

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Mythbusting: Understanding How We Measure the Performance of MongoDB

  • 1. #MongoDBDays Mythbusting: Understanding How We Measure the Performance of MongoDB Alvin Richards Senior Director of Performance Engineering, MongoDB
  • 2. Before we start… • We are going to look a lot at – C++ kernel code – Java benchmarks – JavaScript tests • And lots of charts • And its going to be awesome!
  • 4. Benchmarking • Some common traps • Performance measurement & diagnosis • What's next
  • 5. Part One Some Common Traps
  • 6. "We all live in a house on fire, no fire department to call; no way out, just the upstairs window to look out of while the fire burns the house down with us trapped, locked in it." The Milk Train Doesn't Stop Here Anymore Tennessee Williams
  • 7. #1 Time taken to Insert x Documents long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis();
  • 8. #1 Time taken to Insert x Documents long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis();
  • 9. #1 Time taken to Insert x Documents long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis();
  • 10. #1 Time taken to Insert x Documents long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis();
  • 11. #1 Time taken to Insert x Documents long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis();
  • 12. So that looks ok, right? long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis();
  • 13. What are else you measuring? long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); Object creation and GC management?
  • 14. What are else you measuring? long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); Object creation and GC management? Thread contention on nextInt()?
  • 15. What are else you measuring? long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); Object creation and GC management? Thread contention on nextInt()? Time to synthesize data?
  • 16. What are else you measuring? long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); Object creation and GC management? Thread contention on nextInt()? Time to synthesize data? Thread contention on addAndGet()?
  • 17. What are else you measuring? long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); Object creation and GC management? Thread contention on nextInt()? Time to synthesize data? Thread contention on addAndGet()? Clock resolution?
  • 18. Solution: Pre-Create the objects // Pre Create the Object outside the Loop BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert]; for (int i=0; i < documentsPerInsert; i++) { BasicDBObject doc = new BasicDBObject(); String cVal = "…"; doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i] = doc; } Pre-create non varying data outside the timing loop Alternative • Pre-create the data in a file; load from file
  • 19. Solution: Remove contention // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Remove contention nextInt() by making Thread local
  • 20. Solution: Remove contention // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; Remove contention nextInt() by making Thread local for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Remove contention on addAndGet()
  • 21. Solution: Timer resolution long startTime = System.currentTimeMillis(); … long endTime = System.currentTimeMillis(); long startTime = System.nanoTime(); … long endTime = System.nanoTime() - startTime; "granularity of the value depends on the underlying operating system and may be larger" "resolution is at least as good as that of currentTimeMillis()" Source • http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
  • 22. General Principal #1 Know what you are measuring
  • 23. #2 Response time to return all results BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis();
  • 24. #2 Response time to return all results BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis();
  • 25. #2 Response time to return all results BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis();
  • 26. #2 Response time to return all results BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis();
  • 27. So that looks ok, right? BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis();
  • 28. What are else you measuring? BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Each doc is is 4080 bytes on disk with powerOf2Sizes
  • 29. What are else you measuring? BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Each doc is is 4080 bytes on disk with powerOf2Sizes Unrestricted predicate?
  • 30. What are else you measuring? BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Each doc is is 4080 bytes on disk with powerOf2Sizes Unrestricted predicate? Measuring • Time to parse & execute query • Time to retrieve all document But also • Cost of shipping ~4MB data through network stack
  • 31. Solution: Limit the projection BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Return fixed range
  • 32. Solution: Limit the projection BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Return fixed range Only project _id
  • 33. Solution: Limit the projection BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Return fixed range Only project _id Only 46k transferred through network stack
  • 34. General Principal #2 Measure only what you need to measure
  • 35. Part Two Performance measurement & diagnosis
  • 36. "Every experiment destroys some of the knowledge of the system which was obtained by previous experiments." The Physical Principles of the Quantum Theory (1930) Werner Heisenberg
  • 37. Broad categories • Micro Benchmarks • Workloads
  • 39. mongo-perf: goals • Measure – commands • Configure – Single mongod, ReplSet size (1 -> n), Sharding – Single vs. Multiple DB – O/S • Characterize – Throughput by thread count • Compare
  • 40. What do you get? Better
  • 41. What do you get? Measured improvement between rc0 and rc2 Better
  • 42. Benchmark source code tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } );
  • 43. Benchmark source code tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } );
  • 44. Benchmark source code tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } );
  • 45. Benchmark source code tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } );
  • 47. Workloads • "public" workloads – YCSB – Sysbench • "real world" simulations – Inbox fan in/out – Message Stores – Content Management
  • 48. Example: Bulk Load Performance 16m Documents Better 55% degradation 2.6.0-rc1 vs 2.4.10
  • 49. Ouch… where's the tree in the woods? • 2.4.10 -> 2.6.0 – 4495 git commits
  • 50. git-bisect • Bisect between good/bad hashes • git-bisect nominates a new githash – Build against githash – Re-run test – Confirm if this githash is good/bad • Rinse and repeat
  • 51. Code Change - Bad Githash
  • 53. Bulk Load Performance - Fix Better 11% improvement 2.6.1 vs 2.4.10
  • 54. The problem with measurement • Observability – What can you observe on the system? • Effect – What effects does the observation cause?
  • 56. mtools • MongoDB log file analysis – Filter logs for operations, events – Response time, lock durations – Plot • https://github.com/rueckstiess/mtools
  • 57. Response Times > 100ms, Bulk Insert 2.6.0-rc0 (chart axes: Ops/Sec vs. Time)
  • 58. Response Times > 100ms Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2 Floor raised
  • 59. Code Change – Yielding Policy
  • 61. Response Times Bulk Insert 2.6.0 vs 2.6.1 Ceiling similar, lower floor resulting in 40% improvement in throughput
  • 62. Secondary effects of Yield policy change Write lock time reduced Order of magnitude reduction of write lock duration
  • 63. Unexpected side effects of measurement? > db.serverStatus() Yes – will cause a read lock to be acquired > db.serverStatus({recordStats:0}) No – lock is not acquired > mongostat Yes – until SERVER-14008 is resolved, as it uses db.serverStatus()
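  If the monitoring runs inside a benchmark harness, the same precaution applies from the driver. Below is a hedged sketch using the legacy Java driver's DB.command(...); the recordStats field mirrors the shell call on the slide, and the connection details are placeholders.

  import com.mongodb.BasicDBObject;
  import com.mongodb.CommandResult;
  import com.mongodb.DB;
  import com.mongodb.MongoClient;

  public class ServerStatusProbe {
      public static void main(String[] args) throws Exception {
          MongoClient client = new MongoClient("localhost", 27017);  // placeholder host/port
          DB admin = client.getDB("admin");

          // Equivalent of db.serverStatus({recordStats: 0}) in the shell:
          // skipping recordStats avoids acquiring the read lock, so the
          // probe itself does not perturb the benchmark being observed.
          CommandResult status = admin.command(
                  new BasicDBObject("serverStatus", 1).append("recordStats", 0));

          System.out.println(status.get("version"));
          client.close();
      }
  }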
  • 64. CPU sampling • Get an impression of – Call Graphs – CPU time spent on node and called nodes
  • 65. Setup & building with google-profiler > sudo apt-get install google-perftools > sudo apt-get install libunwind7-dev > scons --use-cpu-profiler mongod
  • 66. Start the profiling > mongod --dbpath <…> Note: Do not use --fork > mongo > use admin > db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}}) Execute some commands that you want to profile > db.runCommand({_cpuProfilerStop: 1})
  • 67. Sample start vs. end of workload
  • 68. Sample start vs. end of workload
  • 70. Public Benchmarks – Not all forks are the same… • YCSB – https://github.com/achille/YCSB • sysbench-mongodb – https://github.com/mdcallag/sysbench-mongodb
  • 71. Part Three And next?
  • 72. "The future sucks. Change it." "I'm way cool Beavis, but I cannot change the future." Beavis & Butthead
  • 73. What we are working on • mongo-perf – UI refactor – Adding more micro benchmarks (geo, sharding) • Workloads – Adding external benchmarks – Creating benchmarks for common use cases • Inbox fan in/out • Analytical dashboards • Stream / Feeds • Customers, Partners & Community
  • 74. Here's how you can help change the future! • Got a great workload? Great benchmark? • Want to donate it? • alvin@mongodb.com
  • 75. Don't be that benchmark… #1 Know what you are measuring #2 Measure only what you need to measure
  • 76. #MongoDBDays Thank You Alvin Richards alvin@mongodb.com / @jonnyeight Senior Director of Performance Engineering, MongoDB

Editor's Notes

  1. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/util/Random.html "Instances of java.util.Random are threadsafe. However, the concurrent use of the same java.util.Random instance across threads may encounter contention and consequent poor performance. Consider instead using ThreadLocalRandom in multithreaded designs."
  2. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html "The specifications of these methods enable implementations to employ efficient machine-level atomic instructions that are available on contemporary processors. However on some platforms, support may entail some form of internal locking."
  3. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#currentTimeMillis() "Returns the current time in milliseconds. Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds."
  4. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadLocalRandom.html "A random number generator isolated to the current thread…Use of ThreadLocalRandom is particularly appropriate when multiple tasks (for example, each a ForkJoinTask) use random numbers in parallel in thread pools."
  5. Per Java7 documentation http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime() "This method provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes) - no guarantees are made except that the resolution is at least as good as that of currentTimeMillis()." (A worker sketch combining ThreadLocalRandom and nanoTime follows these notes.)
  6. Githash https://github.com/mongodb/mongo/commit/d1dc7cf2b213d77103658ccd2ea4816b33a27f6a#diff-7ba76fe024c203ca35087f3b93395acc
  7. Githash https://github.com/mongodb/mongo/commit/00f7aeaa25f98de5e66f0759d5b102951a247526#diff-fa99d4a7f4e8efac0787f30c60814eaf
  8. Githash https://github.com/mongodb/mongo/commit/68d42de9a958688acbf659dfb651fb699e9d7394#diff-fa99d4a7f4e8efac0787f30c60814eaf
  9. Githash https://github.com/mongodb/mongo/commit/00f7aeaa25f98de5e66f0759d5b102951a247526#diff-fa99d4a7f4e8efac0787f30c60814eaf
  10. Githash https://github.com/mongodb/mongo/commit/68d42de9a958688acbf659dfb651fb699e9d7394#diff-fa99d4a7f4e8efac0787f30c60814eaf
  11. Githash https://github.com/mongodb/mongo/commit/8d43b5cb9949c16452cb8d949c89d94cab9c8bad#diff-264fb70c85a638c671570970f3752bf3
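  Notes 1–5 above point at two easy wins for the insert benchmark shown at the start of the talk: ThreadLocalRandom instead of a shared java.util.Random, and System.nanoTime() instead of currentTimeMillis(). The sketch below is one possible shape for a per-thread worker under those guidelines; the collection handle, document shape, and loop count are placeholders rather than the original harness code.

  import java.util.concurrent.ThreadLocalRandom;

  import com.mongodb.BasicDBObject;
  import com.mongodb.DBCollection;

  public class TimedInsertWorker implements Runnable {
      private final DBCollection coll;      // placeholder: injected by the harness
      private final int numMaxInserts;      // placeholder: upper bound for the "k" field

      public TimedInsertWorker(DBCollection coll, int numMaxInserts) {
          this.coll = coll;
          this.numMaxInserts = numMaxInserts;
      }

      @Override
      public void run() {
          // ThreadLocalRandom avoids the cross-thread contention that a shared
          // java.util.Random instance can suffer under (notes 1 and 4).
          ThreadLocalRandom rand = ThreadLocalRandom.current();

          // nanoTime offers finer resolution than currentTimeMillis, whose
          // granularity can be tens of milliseconds (notes 3 and 5).
          long startTime = System.nanoTime();
          for (int i = 0; i < 1000; i++) {
              BasicDBObject doc = new BasicDBObject("_id", rand.nextLong())
                      .append("k", rand.nextInt(numMaxInserts) + 1);
              coll.insert(doc);
          }
          long elapsedMs = (System.nanoTime() - startTime) / 1_000_000;
          System.out.println(Thread.currentThread().getName() + " took " + elapsedMs + " ms");
      }
  }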