SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Downloaden Sie, um offline zu lesen
MongoDB Health Tips
It’s a little different,
but not entirely new.
Keep it in RAM. Obviously.




www.flickr.com/photos/comedynose/4388430444/
How do you know?

                   >   db.stats()
                   {
                   !    "collections" : 3,
                   !    "objects" : 379970142,
                   !    "avgObjSize" : 146.4554114991488,
                   !    "dataSize" : 55648683504,           51GB
                   !    "storageSize" : 61795435008,
                   !    "numExtents" : 64,
                   !    "indexes" : 1,
                   !    "indexSize" : 21354514128,          19GB
                   !    "fileSize" : 100816388096,
                   !    "ok" : 1
                   }


http://www.flickr.com/photos/comedynose/4388430444/
Where should it go?


                                                     Should it be in
                            What?
                                                       memory?


                            Indexes                      Always


                               Data                     If you can



http://www.flickr.com/photos/comedynose/4388430444/
How you’ll know

1) Slow queries

                 Thu Oct 14 17:01:11 [conn7410] update sd.apiLog
                query: { c: "android/setDeviceToken", a: 1466, u:
                 "blah", ua: "Server Density Android" } 51926ms




www.flickr.com/photos/tonivc/2283676770/
How you’ll know

2) Timeouts

    cursor timed out (20000 ms)
How you’ll know

3) Disk i/o spikes




www.flickr.com/photos/daddo83/3406962115/
Watch your storage

1) Pre-alloc
Watch your storage

2) Sharding maxSize
Watch your storage

3) Logging
              --quiet


    db.runCommand("logRotate");


      killall -SIGUSR1 mongod
Watch your storage

4) Journaling
    david@rs2b   ~: ls -alh /mongodbdata/journal/
    total 538M
    drwxrwxr-x   2   david   david   29 Mar 20   16:50   .
    drwx------   4   david   david 4.0K Mar 13   09:50   ..
    -rw-------   1   david   david 538M Mar 20   17:00   j._862
    -rw-------   1   david   david   88 Mar 20   17:00   lsn
db.serverStatus()
db.serverStatus()

1) Used connections




www.flickr.com/photos/armchaircaver/2061231069/
db.serverStatus()

2) Available connections
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2268] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", "rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters:
[ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 }
MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2d:27018 socket exception
Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2c:27018 socket exception
Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2a:27018 socket exception
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1a") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2d:27018
connPoolStats
>   db.runCommand("connPoolStats")
{
!   "hosts" : {
!   ! "config1:27019" : {
!   ! ! "available" : 2,
!   ! ! "created" : 6
!   ! },
!   ! "set1/rs1a:27018,rs1b:27018" : {
!   ! ! "available" : 1,
!   ! ! "created" : 249
!   ! },
        ...
!   },
!   "totalAvailable" : 5,
!   "totalCreated" : 1002,
!   "numDBClientConnection" : 3490,
!   "numAScopedConnection" : 3,
}
db.serverStatus()

3) Index counters
     "indexCounters" : {
     ! ! "btree" : {
     ! ! ! "accesses" : 15180175,
     ! ! ! "hits" : 15178725,
     ! ! ! "misses" : 1450,
     ! ! ! "resets" : 0,
     ! ! ! "missRatio" : 0.00009551932
     ! ! }
     ! },
db.serverStatus()

4) Op counters




www.flickr.com/photos/cosmic_bandita/2395369614/
db.serverStatus()

5) Background flushing




Picture is unrelated! Mmm, ice cream.
db.serverStatus()

6) Dur
rs.status()

       {
       !     "_id" : 1,
       !     "name" : "rs3b:27018",
       !     "health" : 1,
       !     "state" : 2,
       !     "stateStr" : "SECONDARY",
       !     "uptime" : 1886098,
       !     "optime" : {
       !     ! "t" : 1291252178000,
       !     ! "i" : 13
       !     },
       !     "optimeDate" : ISODate("2010-12-02T01:09:38Z"),
             "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z")
       },


www.ex-astris-scientia.org/inconsistencies/ent_vs_tng.htm (yes it’s a replicator from Star Trek)
rs.status()

1) myState
                               Value           Meaning
                                 0   Starting up (phase 1)
                                 1   Primary
                                 2   Secondary
                                 3   Recovering
                                 4   Fatal error
                                 5   Starting up (phase 2)
                                 6   Unknown state
                                 7   Arbiter
                                 8   Down
en.wikipedia.org/wiki/State_of_matter
rs.status()

2) Optime

         "optimeDate" : ISODate("2010-12-02T01:09:38Z")




www.flickr.com/photos/robbie73/4244846566/
rs.status()

3) Heartbeat

         "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z")




www.flickr.com/photos/drawblindfaith/3400981091/
mongostat
mongostat

 1) faults




Picture is unrelated! Snowmobile in Norway.
mongostat

2) locked




www.flickr.com/photos/bbusschots/4541573665/
mongostat

3) index miss




www.flickr.com/photos/gareandkitty/276471187/
mongostat

4) queues
mongostat

5) Diagnostics
Current operations
    db.currentOp();
    {
    ! ! ! "opid" : "shard1:299939199",
    ! ! ! "active" : true,
    ! ! ! "lockType" : "write",
    ! ! ! "waitingForLock" : false,
    ! ! ! "secs_running" : 15419,
    ! ! ! "op" : "remove",
    ! ! ! "ns" : "sd.metrics",
    ! ! ! "query" : {
    ! ! ! ! "accId" : 1391,
    ! ! ! ! "tA" : {
    ! ! ! ! ! "$lte" : ISODate("2010-11-24T19:53:00Z")
    ! ! ! ! }
    ! ! ! },
    ! ! ! "client" : "10.121.12.228:44426",
    ! ! ! "desc" : "conn"
    ! ! },
www.flickr.com/photos/jeffhester/2784666811/
Monitoring tools

Run yourself




    Ganglia
Monitoring tools

Server Density
Monitoring tools




www.mongomonitor.com
Recap
Recap

Keep it in RAM
Recap

Keep it in RAM

Watch your storage
Recap

Keep it in RAM

Watch your storage

db.serverStatus()

rs.status()
David Mytton

 @davidmytton

david@boxedice.com

www.mongomonitor.com

Woop Japan!

Weitere ähnliche Inhalte

Was ist angesagt?

Gemtalk Systems Product Roadmap
Gemtalk Systems Product RoadmapGemtalk Systems Product Roadmap
Gemtalk Systems Product RoadmapESUG
 
Putting a Fork in Fork (Linux Process and Memory Management)
Putting a Fork in Fork (Linux Process and Memory Management)Putting a Fork in Fork (Linux Process and Memory Management)
Putting a Fork in Fork (Linux Process and Memory Management)David Evans
 
Using the Power to Prove
Using the Power to ProveUsing the Power to Prove
Using the Power to ProveKazuho Oku
 
Kernel-Level Programming: Entering Ring Naught
Kernel-Level Programming: Entering Ring NaughtKernel-Level Programming: Entering Ring Naught
Kernel-Level Programming: Entering Ring NaughtDavid Evans
 
PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013
PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013
PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013Puppet
 
What is suid, sgid and sticky bit
What is suid, sgid and sticky bit  What is suid, sgid and sticky bit
What is suid, sgid and sticky bit Meenu Chopra
 
Unix Programming with Perl
Unix Programming with PerlUnix Programming with Perl
Unix Programming with PerlKazuho Oku
 
Sparkcamp stratasingapore
Sparkcamp stratasingaporeSparkcamp stratasingapore
Sparkcamp stratasingaporeCheng Feng
 
Real Time Web with Node
Real Time Web with NodeReal Time Web with Node
Real Time Web with NodeTim Caswell
 
Version Control and Git - GitHub Workshop
Version Control and Git - GitHub WorkshopVersion Control and Git - GitHub Workshop
Version Control and Git - GitHub WorkshopAll Things Open
 
Node Powered Mobile
Node Powered MobileNode Powered Mobile
Node Powered MobileTim Caswell
 
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataKernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataAnne Nicolas
 
Redis — memcached on steroids
Redis — memcached on steroidsRedis — memcached on steroids
Redis — memcached on steroidsRobert Lehmann
 
TDOH x 台科 pwn課程
TDOH x 台科 pwn課程TDOH x 台科 pwn課程
TDOH x 台科 pwn課程Weber Tsai
 
Cisco IOS shellcode: All-in-one
Cisco IOS shellcode: All-in-oneCisco IOS shellcode: All-in-one
Cisco IOS shellcode: All-in-oneDefconRussia
 

Was ist angesagt? (19)

Gemtalk Systems Product Roadmap
Gemtalk Systems Product RoadmapGemtalk Systems Product Roadmap
Gemtalk Systems Product Roadmap
 
Putting a Fork in Fork (Linux Process and Memory Management)
Putting a Fork in Fork (Linux Process and Memory Management)Putting a Fork in Fork (Linux Process and Memory Management)
Putting a Fork in Fork (Linux Process and Memory Management)
 
Using the Power to Prove
Using the Power to ProveUsing the Power to Prove
Using the Power to Prove
 
Kernel-Level Programming: Entering Ring Naught
Kernel-Level Programming: Entering Ring NaughtKernel-Level Programming: Entering Ring Naught
Kernel-Level Programming: Entering Ring Naught
 
Scheduling
SchedulingScheduling
Scheduling
 
PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013
PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013
PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013
 
Mscit notes
Mscit notesMscit notes
Mscit notes
 
What is suid, sgid and sticky bit
What is suid, sgid and sticky bit  What is suid, sgid and sticky bit
What is suid, sgid and sticky bit
 
Unix Programming with Perl
Unix Programming with PerlUnix Programming with Perl
Unix Programming with Perl
 
2003 December
2003 December2003 December
2003 December
 
Sparkcamp stratasingapore
Sparkcamp stratasingaporeSparkcamp stratasingapore
Sparkcamp stratasingapore
 
Real Time Web with Node
Real Time Web with NodeReal Time Web with Node
Real Time Web with Node
 
Version Control and Git - GitHub Workshop
Version Control and Git - GitHub WorkshopVersion Control and Git - GitHub Workshop
Version Control and Git - GitHub Workshop
 
Node Powered Mobile
Node Powered MobileNode Powered Mobile
Node Powered Mobile
 
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataKernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary data
 
Redis — memcached on steroids
Redis — memcached on steroidsRedis — memcached on steroids
Redis — memcached on steroids
 
TDOH x 台科 pwn課程
TDOH x 台科 pwn課程TDOH x 台科 pwn課程
TDOH x 台科 pwn課程
 
iCloud keychain
iCloud keychainiCloud keychain
iCloud keychain
 
Cisco IOS shellcode: All-in-one
Cisco IOS shellcode: All-in-oneCisco IOS shellcode: All-in-one
Cisco IOS shellcode: All-in-one
 

Ähnlich wie Monitoring MongoDB (MongoUK)

MongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingMongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingBoxed Ice
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...MongoDB
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsServer Density
 
Parallel Computing in R
Parallel Computing in RParallel Computing in R
Parallel Computing in Rmickey24
 
Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009
Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009
Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009Tsukasa Oi
 
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDB
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDBBuilding a Scalable Distributed Stats Infrastructure with Storm and KairosDB
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDBCody Ray
 
Diagnostics and Debugging
Diagnostics and DebuggingDiagnostics and Debugging
Diagnostics and DebuggingMongoDB
 
NSC #2 - Challenge Solution
NSC #2 - Challenge SolutionNSC #2 - Challenge Solution
NSC #2 - Challenge SolutionNoSuchCon
 
Thu bernstein key_warp_speed
Thu bernstein key_warp_speedThu bernstein key_warp_speed
Thu bernstein key_warp_speedeswcsummerschool
 
Building OpenDNS Stats
Building OpenDNS StatsBuilding OpenDNS Stats
Building OpenDNS StatsGeorge Ang
 
Diagnostics & Debugging webinar
Diagnostics & Debugging webinarDiagnostics & Debugging webinar
Diagnostics & Debugging webinarMongoDB
 
What is in All of Those SSTable Files Not Just the Data One but All the Rest ...
What is in All of Those SSTable Files Not Just the Data One but All the Rest ...What is in All of Those SSTable Files Not Just the Data One but All the Rest ...
What is in All of Those SSTable Files Not Just the Data One but All the Rest ...DataStax
 
Introduction to Debuggers
Introduction to DebuggersIntroduction to Debuggers
Introduction to DebuggersSaumil Shah
 
Scaling Twitter
Scaling TwitterScaling Twitter
Scaling TwitterBlaine
 
はじめてのMongoDB
はじめてのMongoDBはじめてのMongoDB
はじめてのMongoDBTakahiro Inoue
 
Scaling Twitter 12758
Scaling Twitter 12758Scaling Twitter 12758
Scaling Twitter 12758davidblum
 
Spark Streaming Tips for Devs and Ops by Fran perez y federico fernández
Spark Streaming Tips for Devs and Ops by Fran perez y federico fernándezSpark Streaming Tips for Devs and Ops by Fran perez y federico fernández
Spark Streaming Tips for Devs and Ops by Fran perez y federico fernándezJ On The Beach
 

Ähnlich wie Monitoring MongoDB (MongoUK) (20)

MongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingMongoDB - Monitoring and queueing
MongoDB - Monitoring and queueing
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
Parallel Computing in R
Parallel Computing in RParallel Computing in R
Parallel Computing in R
 
Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009
Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009
Stealthy Rootkit : How bad guy fools live memory forensics? - PacSec 2009
 
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDB
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDBBuilding a Scalable Distributed Stats Infrastructure with Storm and KairosDB
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDB
 
Diagnostics and Debugging
Diagnostics and DebuggingDiagnostics and Debugging
Diagnostics and Debugging
 
NSC #2 - Challenge Solution
NSC #2 - Challenge SolutionNSC #2 - Challenge Solution
NSC #2 - Challenge Solution
 
Backups
BackupsBackups
Backups
 
Thu bernstein key_warp_speed
Thu bernstein key_warp_speedThu bernstein key_warp_speed
Thu bernstein key_warp_speed
 
Building OpenDNS Stats
Building OpenDNS StatsBuilding OpenDNS Stats
Building OpenDNS Stats
 
Diagnostics & Debugging webinar
Diagnostics & Debugging webinarDiagnostics & Debugging webinar
Diagnostics & Debugging webinar
 
What is in All of Those SSTable Files Not Just the Data One but All the Rest ...
What is in All of Those SSTable Files Not Just the Data One but All the Rest ...What is in All of Those SSTable Files Not Just the Data One but All the Rest ...
What is in All of Those SSTable Files Not Just the Data One but All the Rest ...
 
Stop Monkeys Fall
Stop Monkeys FallStop Monkeys Fall
Stop Monkeys Fall
 
Introduction to Debuggers
Introduction to DebuggersIntroduction to Debuggers
Introduction to Debuggers
 
Scaling Twitter
Scaling TwitterScaling Twitter
Scaling Twitter
 
はじめてのMongoDB
はじめてのMongoDBはじめてのMongoDB
はじめてのMongoDB
 
Scaling Twitter 12758
Scaling Twitter 12758Scaling Twitter 12758
Scaling Twitter 12758
 
Spark Streaming Tips for Devs and Ops
Spark Streaming Tips for Devs and OpsSpark Streaming Tips for Devs and Ops
Spark Streaming Tips for Devs and Ops
 
Spark Streaming Tips for Devs and Ops by Fran perez y federico fernández
Spark Streaming Tips for Devs and Ops by Fran perez y federico fernándezSpark Streaming Tips for Devs and Ops by Fran perez y federico fernández
Spark Streaming Tips for Devs and Ops by Fran perez y federico fernández
 

Kürzlich hochgeladen

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 

Kürzlich hochgeladen (20)

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 

Monitoring MongoDB (MongoUK)

  • 2. It’s a little different, but not entirely new.
  • 3. Keep it in RAM. Obviously. www.flickr.com/photos/comedynose/4388430444/
  • 4. How do you know? > db.stats() { ! "collections" : 3, ! "objects" : 379970142, ! "avgObjSize" : 146.4554114991488, ! "dataSize" : 55648683504, 51GB ! "storageSize" : 61795435008, ! "numExtents" : 64, ! "indexes" : 1, ! "indexSize" : 21354514128, 19GB ! "fileSize" : 100816388096, ! "ok" : 1 } http://www.flickr.com/photos/comedynose/4388430444/
  • 5. Where should it go? Should it be in What? memory? Indexes Always Data If you can http://www.flickr.com/photos/comedynose/4388430444/
  • 6. How you’ll know 1) Slow queries Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", a: 1466, u: "blah", ua: "Server Density Android" } 51926ms www.flickr.com/photos/tonivc/2283676770/
  • 7. How you’ll know 2) Timeouts cursor timed out (20000 ms)
  • 8. How you’ll know 3) Disk i/o spikes www.flickr.com/photos/daddo83/3406962115/
  • 10. Watch your storage 2) Sharding maxSize
  • 11. Watch your storage 3) Logging --quiet db.runCommand("logRotate"); killall -SIGUSR1 mongod
  • 12. Watch your storage 4) Journaling david@rs2b ~: ls -alh /mongodbdata/journal/ total 538M drwxrwxr-x 2 david david 29 Mar 20 16:50 . drwx------ 4 david david 4.0K Mar 13 09:50 .. -rw------- 1 david david 538M Mar 20 17:00 j._862 -rw------- 1 david david 88 Mar 20 17:00 lsn
  • 16. Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2268] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", "rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters: [ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 } MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2d:27018 socket exception Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2c:27018 socket exception Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2a:27018 socket exception Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1a") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2d:27018
  • 17. connPoolStats > db.runCommand("connPoolStats") { ! "hosts" : { ! ! "config1:27019" : { ! ! ! "available" : 2, ! ! ! "created" : 6 ! ! }, ! ! "set1/rs1a:27018,rs1b:27018" : { ! ! ! "available" : 1, ! ! ! "created" : 249 ! ! }, ... ! }, ! "totalAvailable" : 5, ! "totalCreated" : 1002, ! "numDBClientConnection" : 3490, ! "numAScopedConnection" : 3, }
  • 18. db.serverStatus() 3) Index counters "indexCounters" : { ! ! "btree" : { ! ! ! "accesses" : 15180175, ! ! ! "hits" : 15178725, ! ! ! "misses" : 1450, ! ! ! "resets" : 0, ! ! ! "missRatio" : 0.00009551932 ! ! } ! },
  • 20. db.serverStatus() 5) Background flushing Picture is unrelated! Mmm, ice cream.
  • 22. rs.status() { ! "_id" : 1, ! "name" : "rs3b:27018", ! "health" : 1, ! "state" : 2, ! "stateStr" : "SECONDARY", ! "uptime" : 1886098, ! "optime" : { ! ! "t" : 1291252178000, ! ! "i" : 13 ! }, ! "optimeDate" : ISODate("2010-12-02T01:09:38Z"), "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z") }, www.ex-astris-scientia.org/inconsistencies/ent_vs_tng.htm (yes it’s a replicator from Star Trek)
  • 23. rs.status() 1) myState Value Meaning 0 Starting up (phase 1) 1 Primary 2 Secondary 3 Recovering 4 Fatal error 5 Starting up (phase 2) 6 Unknown state 7 Arbiter 8 Down en.wikipedia.org/wiki/State_of_matter
  • 24. rs.status() 2) Optime "optimeDate" : ISODate("2010-12-02T01:09:38Z") www.flickr.com/photos/robbie73/4244846566/
  • 25. rs.status() 3) Heartbeat "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z") www.flickr.com/photos/drawblindfaith/3400981091/
  • 27. mongostat 1) faults Picture is unrelated! Snowmobile in Norway.
  • 32. Current operations db.currentOp(); { ! ! ! "opid" : "shard1:299939199", ! ! ! "active" : true, ! ! ! "lockType" : "write", ! ! ! "waitingForLock" : false, ! ! ! "secs_running" : 15419, ! ! ! "op" : "remove", ! ! ! "ns" : "sd.metrics", ! ! ! "query" : { ! ! ! ! "accId" : 1391, ! ! ! ! "tA" : { ! ! ! ! ! "$lte" : ISODate("2010-11-24T19:53:00Z") ! ! ! ! } ! ! ! }, ! ! ! "client" : "10.121.12.228:44426", ! ! ! "desc" : "conn" ! ! }, www.flickr.com/photos/jeffhester/2784666811/
  • 35.
  • 36.
  • 37.
  • 39. Recap
  • 41. Recap Keep it in RAM Watch your storage
  • 42. Recap Keep it in RAM Watch your storage db.serverStatus() rs.status()