Advanced 
Benchmarking 
at Parse 
Travis Redman 
Parse + Facebook
Parse? 
• Parse is a backend service for mobile apps 
• Data Storage 
• Server-side code 
• Push Notifications 
• Analytics 
• … all by dropping an SDK into your app
Parse Stats 
• Parse has 400,000 apps 
• Rapidly growing MongoDB deployment with: 
• 500 databases 
• 2.5M collections 
• 8M indexes 
• 50 TB of storage (excluding replication) 
• We have all kinds of workloads!
Variety is Fun 
• We support just about any kind of workload you can imagine 
• Games, social networking, events, travel, music, etc 
• Apps that are read heavy or write heavy 
• Heavy push users (time sensitive notifications) 
• Apps that store large objects 
• Apps that use us for backups 
• Inefficient queries
2.6 - Why Upgrade? 
• General desire to stay current; a precursor to 2.8 and pluggable storage engines 
• Specific features in 2.6 
• Background indexing on secondaries 
• Index intersection 
• Query plan summary logging
Upgrading is Scary 
• In the early days, we just upgraded 
• Put a new version on a secondary 
• ??? 
• Upgrade primaries 
• ??? 
• Fix bugs as we find them - LIVE!
Upgrading 
• We’re too big now to cowboy it up 
• Upgrading blindly is a potential catastrophe 
• In particular, we want to avoid: 
• Significant performance regressions 
• Unexpected bugs that break customer 
apps
Benchmarking 
• We know that: 
• Benchmarking can detect performance 
regressions between versions 
• Tools and sample workloads (sysbench, YCSB, 
…) already exist 
• MongoDB runs its own benchmarks 
• Our workload is complex - we want more 
confidence
A Customized Approach 
• Why not test with production 
workloads? 
• Flashback: https://github.com/ParsePlatform/flashback 
• Record - Python tool to record ops 
• Replay - Go tool to play back ops
Record 
• Record leverages mongo’s profiling and oplog 
• Profiling is enabled on all DBs 
• Inserts are collected from the oplog 
• All other ops taken from profile db 
• Ops are recorded for a specified time period 
(24h) and then merged 
• Produces a JSON file of ops to feed the replay 
tool
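The merge step can be sketched as a timestamp-ordered merge of the two streams. This is a minimal sketch, not Flashback's actual code; the `ts` field name and op shapes are assumptions:

```python
import heapq
import json

def merge_ops(oplog_inserts, profile_ops):
    """Merge two timestamp-sorted op streams into one ordered list.

    Each op is a dict with a "ts" field (epoch seconds); both input
    streams are assumed to already be sorted by "ts".
    """
    return list(heapq.merge(oplog_inserts, profile_ops,
                            key=lambda op: op["ts"]))

def dump_ops(ops, path):
    """Write one JSON document per line, ready to feed the replay tool."""
    with open(path, "w") as f:
        for op in ops:
            f.write(json.dumps(op) + "\n")
```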
Recording
Base Snapshot 
• Need to replay prod ops on prod data 
• It’s best to play back ops on a consistent copy of the data, 
otherwise: 
• inserts are duplicate key errors 
• deletes are no-ops 
• queries don’t return the right data 
• Using EBS snapshots, we grab a copy of the db during the 
recording 
• Discard ops before the snapshot
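Discarding the pre-snapshot ops is a simple timestamp filter (illustrative only; the `ts` field name is an assumption):

```python
def ops_after_snapshot(ops, snapshot_ts):
    """Keep only ops recorded at or after the base snapshot.

    Ops from before the snapshot would replay against data that already
    reflects them (duplicate key errors, no-op deletes), so they are
    discarded. "ts" is assumed to be epoch seconds.
    """
    return [op for op in ops if op["ts"] >= snapshot_ts]
```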
Recording Timeline
Base Snapshot 
• Snapshot is restored to our benchmark server(s) 
• EBS volume has to be “warmed” because snapshot 
blocks are not instantiated 
• Multi TB volumes can take a few hours to warm 
• After warming we create an LVM snapshot 
• We can “rewind” (merge) after each playback, 
iterating faster
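The snapshot/rewind cycle boils down to two LVM commands, sketched here as a dry-run Python wrapper; the volume group, logical volume names, and snapshot size are hypothetical:

```python
import subprocess

VG, LV, SNAP = "vg0", "mongodb", "mongodb_base"  # hypothetical names

def snapshot_cmd():
    # Create a copy-on-write snapshot of the warmed data volume.
    return ["lvcreate", "--size", "200G", "--snapshot",
            "--name", SNAP, f"/dev/{VG}/{LV}"]

def rewind_cmd():
    # Merging the snapshot back discards all writes made during the
    # playback, "rewinding" the volume to its pre-playback state.
    return ["lvconvert", "--merge", f"/dev/{VG}/{SNAP}"]

def run(cmd, dry_run=True):
    print(" ".join(cmd))
    if not dry_run:
        subprocess.check_call(cmd)
```

Note that `lvconvert --merge` consumes the snapshot, so it has to be recreated before the next playback.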
Playback 
1. Freeze the LVM volume 
2. Start the version of mongo being tested 
3. Adjust replay parameters 
• Number of workers 
• Number of ops 
• Timestamp to start at (when the base snapshot was taken) 
4. Go! 
5. Client-side results are logged to file, server-side collected 
from monitoring tools
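Steps 3-5 can be sketched as a fixed worker pool draining an op queue while recording client-side latencies. This is a sketch, not the Replay tool itself; `apply_op` is a stub standing in for issuing the op to the mongod under test:

```python
import queue
import threading
import time

def replay(ops, apply_op, num_workers=10):
    """Replay ops as fast as possible with a fixed worker pool.

    apply_op(op) issues one op against the server under test; here it is
    caller-supplied. Returns per-op client-side latencies in seconds.
    """
    q = queue.Queue()
    for op in ops:
        q.put(op)
    latencies, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                op = q.get_nowait()
            except queue.Empty:
                return
            start = time.monotonic()
            apply_op(op)
            elapsed = time.monotonic() - start
            with lock:
                latencies.append(elapsed)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies
```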
Playback
Our Workload 
• 24h of ops collected 
• 10M ops at a time, as fast as possible 
• 10 workers 
• No warming of the replica set 
• LVM snapshot reset, mongod restarted for 
each version 
• Rinse and repeat for multiple replica sets
Our Results 
2.4.10 
3061.96 ops/sec (avg)
Results 
2.6.3 
2062.69 ops/sec (avg)
Results 
• 33% loss in throughput. 
• A second workload showed a 75% drop 
in throughput 
• 3669.73 ops/sec vs 975.64 ops/sec 
• Ouch! What do we do next?
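The headline 33% figure follows directly from the two averages:

```python
def pct_drop(before, after):
    """Percentage loss in throughput going from `before` to `after`."""
    return 100 * (before - after) / before

# 2.4.10 vs 2.6.3 on the first workload:
print(round(pct_drop(3061.96, 2062.69)))  # → 33
```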
Replay Data 
           2.4.10 P99   2.4.10 MAX   2.6.3 P99   2.6.3 MAX
query      18.45ms      20953ms      19.21ms     60001ms
insert     23.5ms       6290ms       50.29ms     48837ms
update     21.87ms      3835ms       21.79ms     48776ms
FAM        21.99ms      6159ms       24.91ms     49254ms
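The P99 and MAX columns are computed from the client-side latency logs; a nearest-rank sketch:

```python
import math

def p99_and_max(latencies_ms):
    """Nearest-rank 99th percentile and maximum of a latency sample."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.99 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1], ordered[-1]
```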
Replay Data
Bug Hunt! 
• Old fashioned troubleshooting begins 
• Began isolating query patterns and collections 
with high max times 
• Reproduced issue, confirmed slowness in 2.6 
• Lots of documentation and log gathering, 
including extremely verbose QLOG 
• Started an investigation with the MongoDB team 
that ran for several weeks
What we found 
• Basically, the new query planner in 2.6 meets the Parse 
auto-indexer 
• We create lots of indexes automatically 
• More indexes to score and potentially race 
• Increased likelihood of running into query 
planner bugs
Example 1 
Remove op on “Installation” 
{ "installationId": { "$ne": ? }, "appIdentifier": "?", "deviceToken": "?" } 
• 9M documents 
• installationId is UUID, unique value 
• "installationId": {"$ne": ? } matches most documents 
• deviceToken is a unique token identifying the device
{ "installationId": { "$ne": ? }, "appIdentifier": "?", "deviceToken": "?" } 
• Three candidate indexes: 
{installationId: 1, deviceToken: 1} 
{deviceToken: 1, installationId: 1} 
{deviceToken: 1} 
• The second and third indexes are clearly better candidates 
for this query, since the device token is a simple point lookup. 
• Mongo bug where the work required to skip keys was not 
factored into the plan ranking, causing the inefficient plan to 
sometimes tie 
• Since it’s a remove op, held the write lock for the DB 
• Fixed in: https://jira.mongodb.org/browse/SERVER-14311
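A toy model (not MongoDB's actual cost model, and with made-up data) makes the asymmetry concrete: with a point lookup on deviceToken, a deviceToken-leading index seeks straight to one key, while an installationId-leading index must skip nearly every key to satisfy the $ne:

```python
def keys_examined(index_fields, docs, eq, ne_field):
    """Toy count of index keys a plan must examine.

    An equality predicate on the *leading* field lets the scan seek
    straight to the matching keys; a $ne on the leading field forces a
    scan of nearly the whole index. Illustration only.
    """
    lead = index_fields[0]
    if lead in eq:                # point lookup on the leading field
        return sum(1 for d in docs if d[lead] == eq[lead])
    if lead == ne_field:          # $ne bounds cover almost everything
        return len(docs) - 1
    return len(docs)

# The 9M-document workload in miniature: every ID and token unique.
docs = [{"installationId": f"uuid-{i}", "deviceToken": f"tok-{i}"}
        for i in range(1000)]
query_eq = {"deviceToken": "tok-42"}

bad = keys_examined(["installationId", "deviceToken"], docs,
                    query_eq, "installationId")
good = keys_examined(["deviceToken", "installationId"], docs,
                     query_eq, "installationId")
print(bad, good)  # → 999 1
```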
Example 2 
Query on “Activity”: 
{ $or: [ { _p_project: "?" }, { _p_newProject: "?" } ], acl: { $in: [ "a", "b", "c" ] } } 
• 25M documents 
• _p_project and _p_newProject are pointers to unique IDs of other objects 
• acl matches most documents 
• Four candidate indexes for this query 
{ _p_newProject: 1 } 
{ _p_project: 1 } 
{ _p_project: 1, _created_at: 1 } 
{ acl: 1 }
{ $or: [ { _p_project: "?" }, { _p_newProject: "?" } ], acl: { $in: [ "a", "b", "c" ] } } 
• Query Planner would race multiple plans using indexes 
• Due to a bug, one of the raced indexes would do a full 
index scan (acl) 
• Index scan was non-yielding, tying up the lock until it had 
completed 
• Parse query killer job kills non-yielding queries after 45s 
• Query planner would fail to cache plan, and would re-run 
on next query with the same pattern 
• Fixed: https://jira.mongodb.org/browse/SERVER-15152
Example 3 
Query on “Activity”: { $or: [ { _p_project: "?" }, { _p_newProject: "?" } ], acl: { $in: [ "a", "b", "c" ] } } (same as previous example) 
• Usually fast, but occasionally saw high nscanned and query time > 60s 
• Since there were indexes on all fields in the AND condition, this was a 
candidate for index intersection 
• planSummary: IXSCAN { _p_project: 1 }, IXSCAN { _p_newProject: 1 }, IXSCAN { acl: 1.0 } 
• acl was not selective, but _p_project and _p_newProject would 
sometimes match 0 documents during race 
• The intersection-based query plan would get cached, making 
subsequent queries slow 
• Fixed in https://jira.mongodb.org/browse/SERVER-14961
Success? 
2.6.5 
4443.10 ops/sec 
(vs 3061.96 in 2.4.10)
Comparison 
           2.4.10 P99   2.4.10 MAX   2.6.4 P99   2.6.4 MAX   2.6.5 P99   2.6.5 MAX
query      18 ms        20,953 ms    19 ms       60,001 ms   10 ms       4,352 ms
insert     23 ms        6,290 ms     50 ms       48,837 ms   24 ms       2,225 ms
update     22 ms        3,835 ms     21 ms       48,776 ms   23 ms       4,535 ms
FAM        22 ms        6,159 ms     24 ms       49,254 ms   23 ms       4,353 ms
More Results 
                  2.4.10          2.6.5
Ops:10M  W:10     3061 ops/sec    4443 ops/sec
Ops:10M  W:250    10666 ops/sec   12248 ops/sec
Ops:20M  W:1000   11735 ops/sec   14335 ops/sec
What now? 
• 2.6 has a green light on performance 
• Working through functionality testing 
• Unit/integration testing catching 
majority of issues 
• Bonus: Flashback error log helping us to 
identify problems not caught by tests
Wrap Up 
• Benchmarking with something representative of your 
production workload is worth the time 
• Saved us from discovering slowness in production and 
an inevitable, painful rollback 
• Using actual production data is even better 
• Helped us avoid new bugs 
• Learned a lot about our own service (indexing 
algorithms need some work) 
• Initial work can be reused to efficiently test future versions
Questions? 
• Flashback: https://github.com/ParsePlatform/flashback 
• Links to bugs: 
• https://jira.mongodb.org/browse/SERVER-14311 
• https://jira.mongodb.org/browse/SERVER-15152 
• https://jira.mongodb.org/browse/SERVER-14961
