SCALING MONGODB IN THE CLOUD
        Simon Maynard - Bugsnag CTO
               @snmaynard
WHERE HAVE I USED MONGODB?
HEYZAP

• Largest mobile gaming social network
• MongoDB the main datastore
   • Also MySQL & Redis
• High number of reads, fewer writes
BUGSNAG
bugsnag.com

• Exception tracking service for mobile and web
• MongoDB only persistent datastore
   • Redis caching
• Lots of writes, fewer reads
WHAT ARE THE PROS & CONS OF MONGODB?
MONGODB PROS & CONS
    Pros                          Cons

•   Schemaless                •   Schemaless

•   Fire & Forget             •   Fire & Forget

•   Scalable writes / reads   •   No joins

•   Fast!                     •   No transactions

                              •   Database level locking
WHEN SHOULD YOU THINK ABOUT SCALING?
• From the start!

• Monitor

• Anticipate

• React early
WHAT ARE THE KEY RESOURCES?
RAM

•   Heavily reliant on available RAM

•   “Working set” should fit in RAM

•   Indexes & documents
RAM

[Diagram: data in RAM vs. data not in RAM]
I/O
•   When data is not in RAM, MongoDB hits the disk

•   Ensure this happens infrequently

•   When it does, it should be fast

•   EBS throughput sucks
HOW TO KEEP I/O FAST

•   Fast filesystem - 10gen recommends xfs

•   Use RAID - e.g. RAID 10 (stripe of mirrors)

•   Increase file descriptor limits

•   Turn off atime and diratime

•   Tweak read-ahead settings

•   http://www.mongodb.org/display/DOCS/Production+Notes
HOW CAN YOU ARCHITECT MONGODB TO SCALE?
VERTICAL SCALING

•   Buy more resources on single machine

    •   RAM

    •   I/O
HORIZONTAL SCALING

  •   Buy more machines

      •   Replica sets

      •   Sharding
REPLICA SETS

•   Scales reads well

•   One primary, many secondaries

•   Read from all members

•   Write to primary only

•   Inconsistent reads from secondaries
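
One way to act on the read/write split above is through the connection string. The following is a minimal sketch, not the speaker's setup: the host names, database name and set name `rs0` are hypothetical. `replicaSet` and `readPreference` are standard MongoDB connection-string options, and `secondaryPreferred` accepts the possibly stale reads from secondaries noted above.

```javascript
// Build a MongoDB connection string for a replica set.
// All host names, the database name and the set name are hypothetical.
function replicaSetUri(hosts, dbName, setName, readPreference) {
  const query = [
    "replicaSet=" + encodeURIComponent(setName),
    "readPreference=" + encodeURIComponent(readPreference)
  ].join("&");
  return "mongodb://" + hosts.join(",") + "/" + dbName + "?" + query;
}

const replicaUri = replicaSetUri(
  ["db1.example.com:27017", "db2.example.com:27017", "db3.example.com:27017"],
  "bugsnag",
  "rs0",
  "secondaryPreferred" // reads may lag behind the primary
);
console.log(replicaUri);
```

Writes still go to the primary regardless of the read preference; only reads are spread across members.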
SHARDING


•   Many primaries, many secondaries

•   Scales writes and reads

•   Harder to set up well
WHAT OTHER TECHNIQUES TO SCALE?
STANDARD RULES
•   Standard DB scaling rules apply to MongoDB

    •   Use skip() and limit()

    •   Return subsets of fields

    •   Index all your queries

    •   Run explain() on new/slow queries
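
The first two rules above can be sketched as a small helper that turns a page number into `skip`/`limit` values and a field projection. This is an illustration only: the collection name `posts` and the field names are assumptions, and the helper itself is hypothetical.

```javascript
// Translate a 1-based page number into skip/limit values plus a
// projection that returns only the requested fields.
function pageQuery(page, pageSize, fields) {
  const projection = {};
  fields.forEach(function (field) { projection[field] = 1; });
  return {
    skip: (page - 1) * pageSize,
    limit: pageSize,
    projection: projection
  };
}

const pageThree = pageQuery(3, 10, ["author", "post"]);
console.log(pageThree);

// In the mongo shell this could be applied as (hypothetical collection):
//   db.posts.find({ author: "Simon Maynard" }, pageThree.projection)
//           .skip(pageThree.skip).limit(pageThree.limit)
// Appending .explain() to the query shows whether an index is used.
```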
SCHEMA DESIGN
•   De-normalize
                   {
                       "_id" : ObjectId("505bd6a6c6b6b99254000003"),
                       "author" : "Simon Maynard",
                       "post" : "Hey everyone!",
                       "comments" : [
                       {
                         "author" : "anonymous",
                         "text" : "Hey!",
                       },{
                         "author" : "James Smith",
                         "text" : "Hey Simon!",
                       }
                       ]
                   }
SCHEMA DESIGN
•   Indexes should be minimized in size and number
        {                                            {
            "name" : "Angry Birds",                      "name" : "Angry Birds",
            "android" : true,                            "platform" : 3
            "iphone" : true                          }
        }
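
The compact form above (`"platform" : 3`) reads like a bitmask that folds several boolean fields into one number, so one indexed field can stand in for several. A minimal sketch under that assumption follows; the bit assignments (android = 1, iphone = 2) are guesses, since the slide only shows the combined value 3.

```javascript
// Assumed bit assignments; the slide only shows the combined value 3.
const PLATFORM_BITS = { android: 1, iphone: 2 };

// Fold boolean platform flags into a single integer.
function encodePlatforms(flags) {
  let mask = 0;
  Object.keys(PLATFORM_BITS).forEach(function (name) {
    if (flags[name]) mask |= PLATFORM_BITS[name];
  });
  return mask;
}

// Check whether a given platform bit is set in the mask.
function hasPlatform(mask, name) {
  return (mask & PLATFORM_BITS[name]) !== 0;
}

const game = {
  name: "Angry Birds",
  platform: encodePlatforms({ android: true, iphone: true })
};
console.log(game.platform); // 3
```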
SCHEMA DESIGN
•   Minimize key lengths on small documents

•   Can reduce storage requirements and improve performance
{                                                {
    "_id":"AHAHSPGPGSAVKLPAPHSVGKSALR",          "_id":"AHAHSPGPGSAVKLPAPHSVGKSALR",
    "game_id":"8122",                              "g":"8122",
    "user_id":"1854",                              "u":"1854",
    "session_start":"51067007",                    "s":"51067007",
    "session_end":"51067085"                       "e":"51067085"
}                                                }

                92 bytes                                       58 bytes

                              About 1/3 memory saved!
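
The saving comes entirely from the key names, because every document stores its own keys. The sketch below counts the bytes spent on keys in both versions of the document; it assumes ASCII key names (one byte per character) and does not model the full on-disk encoding.

```javascript
// Sum the length of every top-level key in a document.
// Assumes ASCII keys, so one character equals one byte.
function keyBytes(doc) {
  return Object.keys(doc).reduce(function (sum, key) {
    return sum + key.length;
  }, 0);
}

const longKeys = {
  _id: "AHAHSPGPGSAVKLPAPHSVGKSALR",
  game_id: "8122",
  user_id: "1854",
  session_start: "51067007",
  session_end: "51067085"
};

const shortKeys = {
  _id: "AHAHSPGPGSAVKLPAPHSVGKSALR",
  g: "8122",
  u: "1854",
  s: "51067007",
  e: "51067085"
};

const savedPerDoc = keyBytes(longKeys) - keyBytes(shortKeys);
console.log(savedPerDoc); // 34, the same difference as 92 - 58
```

The saving is fixed per document, so it matters most when documents are small and numerous.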
PROFILER
•   MongoDB has a built in profiler

•   Use the profiler all the time

•   db.setProfilingLevel(1, 100)

•   ‘show profile’ shows recent profiles

•   Stored in db.system.profile
PROFILER OUTPUT
"ts" : ISODate("2012-09-24T23:24:28.908Z"),
"op" : "query",
"ns" : "bugsnag.errors",
"query" : {
 "query" : {
   "errorHash" : "2ff33b4f86543972577cdee34f60e4b2",
   "project_id" : "4ff24b7e2511bb1a70000004"
 },
 "orderby" : {
   "_id" : 1
 }
},
"ntoreturn" : 1,
"ntoskip" : 0,
"nscanned" : 1,
"scanAndOrder" : true,
"numYield" : 0,
"lockStats" : {
 "timeLockedMicros" : { },
 "timeAcquiringMicros" : {
   "r" : NumberLong(2),
   "w" : NumberLong(3)
 }
},
"nreturned" : 1,
"responseLength" : 5240,
"millis" : 0,
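
A profile entry like the one above can be screened automatically. The sketch below flags three things worth a look: slow operations, sorts that could not use an index (`scanAndOrder`), and queries that scan far more documents than they return. The function name and both thresholds are illustrative assumptions, not MongoDB defaults.

```javascript
// Return human-readable warnings for one system.profile entry.
// slowMs and maxScanRatio are illustrative thresholds.
function profileWarnings(entry, slowMs, maxScanRatio) {
  const warnings = [];
  if (entry.millis >= slowMs) {
    warnings.push("slow: " + entry.millis + "ms");
  }
  if (entry.scanAndOrder) {
    // The sort could not follow index order and was done in memory.
    warnings.push("in-memory sort (scanAndOrder)");
  }
  if (entry.nreturned > 0 && entry.nscanned / entry.nreturned > maxScanRatio) {
    warnings.push("scanned " + entry.nscanned + " to return " + entry.nreturned);
  }
  return warnings;
}

// Fields copied from the profiler output on the slide.
const profileSample = {
  op: "query",
  ns: "bugsnag.errors",
  nscanned: 1,
  scanAndOrder: true,
  nreturned: 1,
  millis: 0
};
console.log(profileWarnings(profileSample, 100, 10));
```

For the sample entry only the `scanAndOrder` warning fires, which suggests an index covering the sort on `_id` together with the query fields may help.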
WHAT SHOULD I MONITOR?
MONITORS
•   Chart the index size

•   Chart the number of current ops

•   Monitor index misses

•   Monitor replication lag

•   Monitor I/O performance (iostat)

•   Monitor disk space
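
Replication lag, one of the items above, can be derived from the `members` array that `rs.status()` returns, using each member's `stateStr` and `optimeDate`. A minimal sketch follows; the member names and timestamps are hypothetical sample data.

```javascript
// Compute per-secondary replication lag in seconds from the "members"
// array of an rs.status()-style document.
function replicationLag(members) {
  const primary = members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  })[0];
  if (!primary) return null; // no primary, so lag cannot be computed
  return members
    .filter(function (m) { return m.stateStr === "SECONDARY"; })
    .map(function (m) {
      return {
        name: m.name,
        lagSecs: (primary.optimeDate - m.optimeDate) / 1000
      };
    });
}

// Hypothetical sample data.
const rsMembers = [
  { name: "db1.example.com:27017", stateStr: "PRIMARY",
    optimeDate: new Date("2012-09-24T23:24:28Z") },
  { name: "db2.example.com:27017", stateStr: "SECONDARY",
    optimeDate: new Date("2012-09-24T23:24:26Z") },
  { name: "db3.example.com:27017", stateStr: "SECONDARY",
    optimeDate: new Date("2012-09-24T23:23:58Z") }
];
const lagReport = replicationLag(rsMembers);
console.log(lagReport);
```

Charting `lagSecs` over time makes a slowly falling-behind secondary visible before reads from it become badly stale.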
HOW CAN I MONITOR MONGODB?
db.currentOp()

{
    "opid" : 783608,
    "active" : true,
    "secs_running" : 149,
    "op" : "query",
    "ns" : "bugsnag.accounts",
    "query" : {
         "_id" : ObjectId("505bd6a6c6b6b99254000003"),
    },
    "waitingForLock" : false,
    "numYields" : 349,
}
db.serverStatus()
"locks" : {
    "bugsnag" : {
        "timeLockedMicros" : {
            "r" : NumberLong(1639187950),
            "w" : NumberLong(1313312267)
        },
        "timeAcquiringMicros" : {
            "r" : NumberLong(1041368094),
            "w" : NumberLong(630905947)
        }
    },
},
"indexCounters" : {
    "btree" : {
        "accesses" : 610645909,
        "hits" : 610645909,
        "misses" : 0,
        "resets" : 0,
        "missRatio" : 0
    }
},
"opcounters" : {
    "insert" : 13674147,
    "query" : 5261723,
    "update" : 2576757,
    "delete" : 22324,
    "getmore" : 4459,
    "command" : 4382007
},
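
Two of the numbers worth charting can be derived from this output: the write/read mix from `opcounters` and the index miss ratio from `indexCounters.btree`. The sketch below uses the field names shown on the slide; other server versions may report different fields, and the helper name is hypothetical. The counters are cumulative since server start, so rates need the difference between two samples.

```javascript
// Derive a write/read ratio and an index miss ratio from a
// serverStatus()-style document (field names as shown on the slide).
function summarizeServerStatus(status) {
  const ops = status.opcounters;
  const btree = status.indexCounters.btree;
  const writes = ops.insert + ops.update + ops["delete"];
  const reads = ops.query + ops.getmore;
  return {
    writes: writes,
    reads: reads,
    writeReadRatio: writes / reads,
    indexMissRatio: btree.accesses > 0 ? btree.misses / btree.accesses : 0
  };
}

// Values copied from the slide.
const statusSummary = summarizeServerStatus({
  opcounters: {
    insert: 13674147, query: 5261723, update: 2576757,
    "delete": 22324, getmore: 4459, command: 4382007
  },
  indexCounters: {
    btree: { accesses: 610645909, hits: 610645909, misses: 0 }
  }
});
console.log(statusSummary);
```

With the slide's values there are roughly three writes per read and no index misses, consistent with the write-heavy workload described for Bugsnag.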
db.stats()
{
    "db" : "bugsnag",
    "collections" : 14,
    "objects" : 68081951,
    "avgObjSize" : 10147.85350585104,
    "dataSize" : 690885147618,
    "storageSize" : 1290028235245,
    "numExtents" : 67,
    "indexes" : 28,
    "indexSize" : 21240430449,
    "fileSize" : 1925185536051,
    "nsSizeMB" : 16,
    "ok" : 1
}
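
The `indexSize` and `dataSize` figures are easier to chart and compare against available RAM once converted to GiB. A minimal sketch follows; the 64 GiB RAM figure is an assumption made for illustration, not a number from the talk.

```javascript
const GIB = 1024 * 1024 * 1024;

// Summarize a db.stats()-style document against a given amount of RAM.
function summarizeDbStats(stats, ramBytes) {
  return {
    indexGiB: stats.indexSize / GIB,
    dataGiB: stats.dataSize / GIB,
    indexesFitInRam: stats.indexSize < ramBytes,
    // Ratio of allocated storage to actual data.
    storageOverhead: stats.storageSize / stats.dataSize
  };
}

// Sizes copied from the slide; the RAM size is a hypothetical value.
const statsSummary = summarizeDbStats(
  { dataSize: 690885147618, storageSize: 1290028235245, indexSize: 21240430449 },
  64 * GIB
);
console.log(statsSummary.indexGiB.toFixed(2) + " GiB of indexes");
```

Here the indexes come to just under 20 GiB while the data is several hundred GiB, so the indexes plus a hot subset of documents would be the part that has to fit in memory.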
MONGOTOP

         ns               total   read   write

   bugsnag.events          80ms   12ms   68ms

   bugsnag.projects        2ms    2ms    0ms

    bugsnag.users          1ms    1ms    0ms

bugsnag.system.indexes     4ms    4ms    0ms
MONGOSTAT

            insert  query  update  delete  getmore  command  flushes  faults  locked db  idx miss %

localhost      147    210      51      13        4      215        0       0        14%           0
MONGO MONITORING SERVICE


•   MMS is 10gen's hosted MongoDB monitoring service

•   Available as web app (https://mms.10gen.com)

•   Android client also available from Google Play
KIBANA & LOGSTASH




•   Logstash is an open-source log parser - http://logstash.net/

•   Kibana is an alternative UI for Logstash - http://kibana.org/

•   Cool trend analysis for mongo logs
•   Questions?

•   Check out www.bugsnag.com

•   Follow me on twitter @snmaynard

  • 5. WHAT ARE THE PROS & CONS OF MONGODB?
  • 6. MONGODB PROS & CONS Pros • Schemaless • Fire & Forget • Scalable writes / reads • Fast!
  • 7. MONGODB PROS & CONS Pros: • Schemaless • Fire & Forget • Scalable writes / reads • Fast! Cons: • Schemaless • Fire & Forget • No joins • No transactions • Database level locking
  • 8. WHEN SHOULD YOU THINK ABOUT SCALING?
  • 9. • From the start! • Monitor • Anticipate • React early
  • 10. WHAT ARE THE KEY RESOURCES?
  • 11. RAM • Heavily reliant on available RAM • “Working set” should fit in RAM • Indexes & documents
  • 12. RAM [Figure; only its labels survive: "in RAM", "not in RAM"]
  • 13. I/O • When data is not in RAM, MongoDB hits the disk • Ensure this happens infrequently • When it does, it should be fast • EBS throughput sucks
  • 14. HOW TO KEEP I/O FAST • Fast filesystem - 10gen recommends xfs • Use RAID - e.g. RAID 10 (stripe of mirrors) • Increase file descriptor limits • Turn off atime and diratime • Tweak read-ahead settings • http://www.mongodb.org/display/DOCS/Production+Notes
  • 15. HOW CAN YOU ARCHITECT MONGODB TO SCALE?
  • 16. VERTICAL SCALING • Buy more resources on single machine • RAM • I/O
  • 17. HORIZONTAL SCALING • Buy more machines • Replica sets • Sharding
  • 18. REPLICA SETS • Scales reads well • One primary, many secondaries • Read from all members • Write to primary only • Inconsistent reads from secondaries
  • 19. SHARDING • Many primaries, many secondaries • Scales writes and reads • Harder to set up well
  • 21. STANDARD RULES • Standard DB scaling rules apply to MongoDB • Use skip() and limit() • Return subsets of fields • Index all your queries • Run explain() on new/slow queries
  • 22. SCHEMA DESIGN • De-normalize
        {
          "_id" : ObjectId("505bd6a6c6b6b99254000003"),
          "author" : "Simon Maynard",
          "post" : "Hey everyone!",
          "comments" : [
            { "author" : "anonymous", "text" : "Hey!" },
            { "author" : "James Smith", "text" : "Hey Simon!" }
          ]
        }
  • 23. SCHEMA DESIGN • Indexes should be minimized in size and number
        Separate flags:    { "name" : "Angry Birds", "android" : true, "iphone" : true }
        Combined bitfield: { "name" : "Angry Birds", "platform" : 3 }
  • 24. SCHEMA DESIGN • Minimize key lengths on small documents • Can reduce storage requirements and increase performance
        { "_id":"AHAHSPGPGSAVKLPAPHSVGKSALR", "game_id":"8122", "user_id":"1854", "session_start":"51067007", "session_end":"51067085" }
        92 bytes
  • 25. SCHEMA DESIGN • Minimize key lengths on small documents • Can reduce storage requirements and increase performance
        Long keys (92 bytes):
        { "_id":"AHAHSPGPGSAVKLPAPHSVGKSALR", "game_id":"8122", "user_id":"1854", "session_start":"51067007", "session_end":"51067085" }
        Short keys (58 bytes):
        { "_id":"AHAHSPGPGSAVKLPAPHSVGKSALR", "g":"8122", "u":"1854", "s":"51067007", "e":"51067085" }
        About 1/3 memory saved!
  • 26. PROFILER • MongoDB has a built in profiler • Use the profiler all the time • db.setProfilingLevel(1, 100) • ‘show profile’ shows recent profiles • Stored in db.system.profile
  • 27. PROFILER OUTPUT
        "ts" : ISODate("2012-09-24T23:24:28.908Z"),
        "op" : "query",
        "ns" : "bugsnag.errors",
        "query" : {
          "query" : {
            "errorHash" : "2ff33b4f86543972577cdee34f60e4b2",
            "project_id" : "4ff24b7e2511bb1a70000004"
          },
          "orderby" : {
            "_id" : 1
          }
        },
        "ntoreturn" : 1,
        "ntoskip" : 0,
        "nscanned" : 1,
        "scanAndOrder" : true,
        "numYield" : 0,
        "lockStats" : {
          "timeLockedMicros" : { },
          "timeAcquiringMicros" : {
            "r" : NumberLong(2),
            "w" : NumberLong(3)
          }
        },
        "nreturned" : 1,
        "responseLength" : 5240,
        "millis" : 0,
  • 28. PROFILER OUTPUT (the same output as slide 27)
  • 29. WHAT SHOULD I MONITOR?
  • 30. MONITORS • Chart the index size • Chart the number of current ops • Monitor index misses • Monitor replication lag • Monitor I/O performance (iostat) • Monitor disk space
  • 31. HOW CAN I MONITOR MONGODB?
  • 32. db.currentOp() { "opid" : 783608, "active" : true, "secs_running" : 149, "op" : "query", "ns" : "bugsnag.accounts", "query" : { "_id" : ObjectId("505bd6a6c6b6b99254000003"), }, "waitingForLock" : false, "numYields" : 349, }
  • 33. db.serverStatus()
        "locks" : {
          "bugsnag" : {
            "timeLockedMicros" : {
              "r" : NumberLong(1639187950),
              "w" : NumberLong(1313312267)
            },
            "timeAcquiringMicros" : {
              "r" : NumberLong(1041368094),
              "w" : NumberLong(630905947)
            }
          },
        },
        "indexCounters" : {
          "btree" : {
            "accesses" : 610645909,
            "hits" : 610645909,
            "misses" : 0,
            "resets" : 0,
            "missRatio" : 0
          }
        },
        "opcounters" : {
          "insert" : 13674147,
          "query" : 5261723,
          "update" : 2576757,
          "delete" : 22324,
          "getmore" : 4459,
          "command" : 4382007
        },
  • 34. db.stats()
        {
          "db" : "bugsnag",
          "collections" : 14,
          "objects" : 68081951,
          "avgObjSize" : 10147.85350585104,
          "dataSize" : 690885147618,
          "storageSize" : 1290028235245,
          "numExtents" : 67,
          "indexes" : 28,
          "indexSize" : 21240430449,
          "fileSize" : 1925185536051,
          "nsSizeMB" : 16,
          "ok" : 1
        }
  • 35. MONGOTOP
        ns                        total   read   write
        bugsnag.events             80ms   12ms    68ms
        bugsnag.projects            2ms    2ms     0ms
        bugsnag.users               1ms    1ms     0ms
        bugsnag.system.indexes      4ms    4ms     0ms
  • 36. MONGOSTAT
                    insert  query  update  delete  getmore  command  flushes  faults  locked db  idx miss %
        localhost      147    210      51      13        4      215        0       0        14%           0
  • 37. MONGO MONITORING SERVICE • MMS is 10gen hosted Mongo monitoring • Available as web app (https://mms.10gen.com) • Android client also available from Google Play
  • 38. KIBANA & LOGSTASH • Logstash is open-source log parser - http://logstash.net/ • Kibana is an alternative UI for Logstash - http://kibana.org/ • Cool trend analysis for mongo logs
  • 39. Questions? • Check out www.bugsnag.com • Follow me on twitter @snmaynard
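
The schema-design ideas on slides 23 to 25 can be sketched in plain JavaScript, the language of the mongo shell. This is a minimal illustration, not the speaker's code: the bit assignments in `PLATFORM_BITS`, the helper names, and the `KEY_MAP` table are assumptions, chosen only so that they reproduce the slide examples (`"platform" : 3` and the short keys `g`, `u`, `s`, `e`).

```javascript
// Hypothetical bit assignments; the slides only show that android + iphone => 3.
const PLATFORM_BITS = { android: 1, iphone: 2 };

// Pack boolean platform flags into one integer so one index can cover them all.
function encodePlatform(flags) {
  let bits = 0;
  for (const [name, bit] of Object.entries(PLATFORM_BITS)) {
    if (flags[name]) bits |= bit;
  }
  return bits;
}

// Test whether a packed value includes a given platform.
function hasPlatform(bits, name) {
  return (bits & PLATFORM_BITS[name]) !== 0;
}

// Assumed abbreviations, taken from the slide 25 example document.
const KEY_MAP = { game_id: "g", user_id: "u", session_start: "s", session_end: "e" };

// Return a copy of doc with long keys replaced by their short forms.
function shortenKeys(doc, keyMap = KEY_MAP) {
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    out[keyMap[key] || key] = value;
  }
  return out;
}

// Total characters spent on key names (key names are stored in every document).
function keyChars(doc) {
  return Object.keys(doc).reduce((sum, key) => sum + key.length, 0);
}

// The session document from slide 24.
const session = {
  _id: "AHAHSPGPGSAVKLPAPHSVGKSALR",
  game_id: "8122",
  user_id: "1854",
  session_start: "51067007",
  session_end: "51067085",
};
const shortSession = shortenKeys(session);
const savedChars = keyChars(session) - keyChars(shortSession);

console.log(encodePlatform({ android: true, iphone: true }), savedChars); // prints: 3 34
```

The 34 characters saved per document equal the difference between the 92 and 58 bytes quoted on slide 25. Whether the renaming is worth the loss of readability depends, as the notes say, on how small the documents are.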

Editor's Notes

  3. All user activity is stored in Mongo: check-ins, game usernames, etc. The Heyzap SDK is in many top-tier titles, so there are lots of events, including analytics for the millions of game sessions involving the Heyzap SDK. Geospatial queries find where people checked in. Mongo is supplemented with MySQL (which allows joins etc.), and with Redis as a caching layer.
  4. High burst writes: people deploy bad code and we get all their exceptions. Bugsnag uses Mongo and Redis alone, with Redis as a caching layer on top of Mongo.
  6. Schemaless: no migrations. Migrating SQL caused a lot of downtime for Heyzap. Fire & Forget: by default Mongo doesn't wait for the write to complete before returning to the app.
  7. Many pros are also cons. Know what you are getting into. Schemaless means the app has to cope with bad data, migrations, bad states, etc. Fire & Forget: you can use the safe keyword, but that affects speed. No joins: you can only pull data from one collection at a time. There is a single write lock across a database. That is not great for a high proportion of writes, but writes yield. Mitigate with a db per collection in 2.2. 2.4 will have collection locks.
  9. You should design with performance in mind. Think future-proof. Work out where your pain points will be. Begin to scale before you hit 95% capacity. You need spare capacity to scale.
  11. Working set = often-used data. In a logging app it would be the last n days of logs; 99% of queries would be on that. Indexes and documents should be in RAM for best results. The bare minimum is indexes!
  12. When RAM gets full! This is no exaggeration. Mongo’s performance drops massively.
  13. For Heyzap, I/O is the single biggest headache on EC2: EBS has random spikes. Heyzap moved to provisioned IOPS when it was released, to smooth the spikes rather than to get better throughput.
  14. xfs supports I/O suspend and write-cache flushing, which are essential for AWS snapshots. Increase file descriptors to allow more open files. atime updates access times for files, which turns reads into writes = bad. Read-ahead means the system will read extra blocks from disk when doing a read: good for sequential access, bad for random (Mongo) access.
  16. Bigger machine. It is hard to get more on one machine, especially in the cloud. Can be viable in the short term. You can do this with no downtime; Heyzap & Bugsnag do.
  18. If you use replica sets, monitor the replication lag. This should be close to zero; otherwise users can write something but can't read it back. You can send a “Write Concern” to say replicate to slaves. That can screw you if slaves are behind. The whole working set is still in memory on each member, so this just scales the volume of reads, not data size.
  19. Mongo supports automatic sharding. Carefully pick your shard key to correctly distribute the load across shards. Sharding distributes the working set across all shards for big working sets, and also distributes writes. Heyzap did manual sharding by collection.
  21. Only returning what you need will be faster. I advise ensuring (on large datasets) that pretty much every query is indexed. Cron jobs running unindexed queries have caused Heyzap downtime. On smaller datasets it is fine. Run explain on a new query you are about to deploy. Saves a lot of downtime! Verify it uses an index.
  22. Means we don't have to read as many documents, which means we don't need to seek as much on disk. Not always applicable: sometimes the same doc will be in too many different places, which would make updates too hard.
  23. If we wanted to index here on android and iphone separately, that would be 2 indexes. We can combine them into one “bitfield”, halving our index size. Heyzap had a very similar issue with schema. Means we can use less RAM. #1 rule in Mongo: use less RAM.
  25. Whether it's worth it depends on how small your values/documents are. Can reduce your working set, because commonly accessed documents are smaller. No effect on indexes.
  26. The small performance hit from using the profiler is worth it. You need to know how fast your db is running. In mongo (command line) run db.setProfilingLevel(1,100). This logs all queries that took more than 100ms. The profile is a capped collection. It may need resizing depending on your throughput.
  27. Sample output of profiler.
  28. ts = when it ran. Tie that to your other logs. nscanned = number of index entries or documents scanned. scanAndOrder = when Mongo can't use the index to sort. numYield = how many times it yielded, an indication of page faults etc. millis = total duration.
  30. Graphing index size will allow you to predict scaling needs. Heyzap could accurately predict to within about a day. Spikes in current ops show you when to look at the profiler. Indexes should rarely miss. Replication lag leads to a bunk user experience on reads, and to hard app code (read from primary).
  32. opid = operation ID. Pass this to db.killOp() to stop it. ns = namespace = database.collection. Can show you why everything has suddenly gone slow, but you can miss the guilty query; the profiler is better.
  33. Locks show the time in microseconds spent locked and waiting for locks. Index counters say how many index hits we had. A miss means the index is not in RAM = bad.
  34. Useful stats. Index size: keep it in RAM. Graph index size. These metrics can help you predict the need for scaling. You can also call db.collection.stats() to get something similar.
  35. Can use --locks to show you lock statistics if you prefer that view. Good to check if you aren’t sure which collections are heavily used.
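
The monitoring advice in the notes for slides 30 to 34 can be turned into small checks. The sketch below is one possible reading, not the speaker's tooling: the helper names, the 80% `headroom` default, and the replica set sample are assumptions, and the functions take plain objects shaped like the slide output so they run without a database. In the shell the inputs would come from `db.serverStatus()`, `db.stats()`, `db.currentOp().inprog`, and `rs.status()`; field names follow the slides and may differ in other MongoDB versions.

```javascript
const GIB = 1024 ** 3;

// Samples copied from slides 32 to 34 (only the fields the helpers read).
const serverStatus = { indexCounters: { btree: { accesses: 610645909, hits: 610645909, misses: 0 } } };
const dbStats = { indexSize: 21240430449 };
const inprog = [{ opid: 783608, active: true, secs_running: 149, ns: "bugsnag.accounts" }];

// Hypothetical replica set status, shaped like rs.status(); not from the slides.
const rsStatus = {
  members: [
    { name: "db1:27017", stateStr: "PRIMARY", optimeDate: new Date("2012-09-24T23:24:28Z") },
    { name: "db2:27017", stateStr: "SECONDARY", optimeDate: new Date("2012-09-24T23:24:25Z") },
  ],
};

// Fraction of index accesses that missed ("indexes should rarely miss").
function indexMissRatio(status) {
  const btree = status.indexCounters.btree;
  return btree.accesses === 0 ? 0 : btree.misses / btree.accesses;
}

// True if the indexes fit in RAM with some spare capacity (headroom is an assumed default).
function indexFitsInRam(stats, ramBytes, headroom = 0.8) {
  return stats.indexSize <= ramBytes * headroom;
}

// opids of active operations running at least thresholdSecs; candidates for db.killOp().
function longRunningOps(ops, thresholdSecs) {
  return ops
    .filter((op) => op.active && op.secs_running >= thresholdSecs)
    .map((op) => op.opid);
}

// Seconds each secondary is behind the primary ("should be close to zero").
function replicationLagSecs(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return [];
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({ name: m.name, lagSecs: (primary.optimeDate - m.optimeDate) / 1000 }));
}

console.log(
  indexMissRatio(serverStatus),
  indexFitsInRam(dbStats, 32 * GIB),
  longRunningOps(inprog, 60),
  replicationLagSecs(rsStatus)
);
```

Charting these values over time, rather than checking them once, is what lets you anticipate the need to scale, as the notes for slide 30 suggest.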