MongoDB Queueing & Monitoring
• Server Density
• 26 nodes
• 6 replica sets
• Primary datastore = 15 nodes
• +7TB / month
• +1bn docs / month
• 2-5k inserts/s @ 3ms

We use MongoDB as our primary data store but also as a queuing system. So I’m going to
talk first about how we built the queuing functionality into Mongo and then more generally
about what you need to keep an eye on when monitoring MongoDB in production.
Queuing: Uses


• Background processing
• Sending notifications
• Event streaming

www.flickr.com/photos/triplexpresso/496995086/
Asynchronous
Queuing: Features

• Consumers
• Atomic
• Speed
• GC
Queuing: Features

• Consumers

                        MongoDB                RabbitMQ
                        Mongo Wire Protocol    AMQP

If you’re building a queue, your consumers connect over the system’s native protocol: AMQP
for RabbitMQ, the Mongo Wire Protocol for MongoDB.
Queuing: Features

• Atomic

                        MongoDB          RabbitMQ
                        findAndModify    consume/ack

en.wikipedia.org/wiki/State_of_matter
Queuing: Features


•Speed
Queuing: Features

• GC

                        MongoDB    RabbitMQ
                        ☚          consume/ack
Implementation

• Consumers




There are two things we need to implement: consumers and GC.
Implementation

• Consumers
        db.runCommand(
        { findAndModify : <collection>, <options> } )

The findAndModify command takes two parameters: the collection and the options.
Implementation

• Consumers
         db.runCommand(
         { findAndModify : <collection>, <options> } )

         query: filter (WHERE)

         { query: { inProg: false } }

Specify the query just like any normal query against Mongo. The very first document that
matches this will be returned. Since we’re building a queuing system, we’re using a field
called inProg, so we’re asking it to give us documents where this is false, i.e. the processing
of that document isn’t in progress.
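Implementation

• Consumers
        db.runCommand(
        { findAndModify : <collection>, <options> } )

        update: modifier object

        { update: { $set: {inProg: true, start: new Date()} } }

The update is atomic: the document is matched and modified in one operation, so two
consumers cannot claim the same document.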
Implementation

• Consumers
         db.runCommand(
         { findAndModify : <collection>, <options> } )

         sort: selects the first one on multi-match

         { sort: { added: -1 } }

We can also sort, e.g. on a timestamp. Note the direction: a descending sort such as the
{ added: -1 } shown here returns the newest document first, while an ascending sort
({ added: 1 }) returns the oldest first. You could also build a priority system to return more
important documents first.
Implementation

• Consumers
   db.runCommand(
   { findAndModify : <collection>, <options> } )

   remove: true = deletes on return
   new: true = returns modified object
   fields: return specific fields
   upsert: true = create object if !exists()
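
Putting the options above together, a consumer's claim step might look roughly like the Python below. This is a minimal sketch, not the code from the talk: the helper name build_claim_command and the collection name "queue" are hypothetical, and the ascending sort on added is an assumption chosen so that the oldest job is claimed first. The sketch only builds the command document, so it runs without a server.

```python
import datetime


def build_claim_command(collection, now=None):
    """Build a findAndModify command that atomically claims one queued job."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    # The command name must be the first key; dicts keep insertion order.
    return {
        "findAndModify": collection,
        "query": {"inProg": False},       # only jobs nobody is working on
        "sort": {"added": 1},             # assumption: oldest job first
        "update": {"$set": {"inProg": True, "start": now}},
        "new": True,                      # return the document after the update
    }


cmd = build_claim_command("queue")
# With a live pymongo connection (assumed, not shown here):
#   job = db.command(cmd)["value"]
```

Recording start at claim time is what later lets a cleanup pass tell a job that is being processed from one whose consumer has died.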
Implementation

• GC
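  import datetime

  # Use UTC: MongoDB stores dates as UTC, matching the shell's new Date().
  now = datetime.datetime.now(datetime.timezone.utc)
  # Jobs claimed more than 10 seconds ago are treated as stalled.
  difference = datetime.timedelta(seconds=10)
  timeout = now - difference

  # queue is the pymongo collection holding the jobs
  queue.find({'inProg': True, 'start': {'$lte': timeout}})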
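
The query above only finds stalled jobs; the slide does not say what happens to them next. One possible follow-up, sketched here as an assumption, is to put them back on the queue. The names stale_filter, is_stale and RELEASE_UPDATE are hypothetical, and the sample jobs are made up so the logic can be run without a database.

```python
import datetime


def stale_filter(now, timeout_seconds=10):
    """Filter matching jobs claimed more than timeout_seconds before now."""
    cutoff = now - datetime.timedelta(seconds=timeout_seconds)
    return {"inProg": True, "start": {"$lte": cutoff}}


def is_stale(job, now, timeout_seconds=10):
    """Pure-Python equivalent of stale_filter, for a single job document."""
    cutoff = now - datetime.timedelta(seconds=timeout_seconds)
    return job.get("inProg") is True and job["start"] <= cutoff


# Update that puts a stalled job back on the queue (an assumed policy).
RELEASE_UPDATE = {"$set": {"inProg": False}, "$unset": {"start": ""}}

now = datetime.datetime(2011, 3, 20, 17, 0, 0)
jobs = [
    {"_id": 1, "inProg": True, "start": now - datetime.timedelta(seconds=30)},
    {"_id": 2, "inProg": True, "start": now - datetime.timedelta(seconds=2)},
    {"_id": 3, "inProg": False},
]
stalled = [job["_id"] for job in jobs if is_stale(job, now)]
print(stalled)  # [1]
# With pymongo (assumed): queue.update_many(stale_filter(now), RELEASE_UPDATE)
```

Whether a released job should be retried or moved to a failure list depends on whether the work is safe to run twice.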
Stick with RabbitMQ?

QoS

AMQP

Throttling
It’s a little different, but not entirely new.




The problem is that MongoDB is fairly new and whilst it’s still just another database running
on a server, there are things that are new and unusual. This means that some old
assumptions are still valid, but others aren’t. You don’t have to approach it as a completely
new thing, but it is a little different. There are disadvantages to this but one advantage is you
can use it for novel tasks, like queuing.
Keep it in RAM. Obviously.




www.flickr.com/photos/comedynose/4388430444/
The rst and most obvious thing to note is that keeping everything in RAM is faster. But what
does that actually mean and how do you know when something is in RAM?
How do you know?

                   > db.stats()
                   {
                       "collections" : 3,
                       "objects" : 379970142,
                       "avgObjSize" : 146.4554114991488,
                       "dataSize" : 55648683504,        // 51GB
                       "storageSize" : 61795435008,
                       "numExtents" : 64,
                       "indexes" : 1,
                       "indexSize" : 21354514128,       // 19GB
                       "fileSize" : 100816388096,
                       "ok" : 1
                   }


http://www.flickr.com/photos/comedynose/4388430444/
The easiest way is to check the database size. The MongoDB console provides an easy way to
look at the data and index sizes, and the output is provided in bytes.
Where should it go?

                            What?        Should it be in memory?
                            Indexes      Always
                            Data         If you can



http://www.flickr.com/photos/comedynose/4388430444/
In every case, having something in memory is going to be faster than not. However, that’s not
always feasible if you have massive data sets. Instead, you want to make sure you always
have enough RAM to store all the indexes, which is what the db.stats() output is for. And if
you can, have space for data too. MongoDB is smart about its memory management so it will
keep commonly accessed data in RAM where possible.
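
As a rough illustration of that check, the sketch below compares the dataSize and indexSize figures from the db.stats() output shown earlier against a RAM size. The 32 GiB figure and the helper name memory_fit are made up for the example, and the comparison ignores memory needed by the operating system and other processes.

```python
GIB = 1024 ** 3


def memory_fit(stats, ram_bytes):
    """Report whether indexes, and indexes plus data, fit in ram_bytes."""
    index_size = stats["indexSize"]
    data_size = stats["dataSize"]
    return {
        "index_gib": round(index_size / GIB, 1),
        "data_gib": round(data_size / GIB, 1),
        "indexes_fit": index_size <= ram_bytes,
        "everything_fits": index_size + data_size <= ram_bytes,
    }


# Values copied from the db.stats() output shown earlier.
stats = {"dataSize": 55648683504, "indexSize": 21354514128}
report = memory_fit(stats, ram_bytes=32 * GIB)  # 32 GiB is a made-up figure
print(report)
```

With these numbers the indexes fit but the data does not, which is the "Always / If you can" situation from the slide.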
How you’ll know

1) Slow queries

                 Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", a: 1466, u: "blah", ua: "Server Density Android" } 51926ms




www.flickr.com/photos/tonivc/2283676770/
Although it’s not the only possible cause, a slow query can indicate insufficient memory. It
might be that you don’t have the right indexes for the query, but if indexes are being used and
it’s still slow, the cause could be a disk i/o bottleneck because the data isn’t in RAM.
Running explain on the query will show you which indexes it is using.
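
Slow operations like the one in the log line above can be pulled out of the log automatically. The following is a minimal sketch: the regular expression is an assumption based only on the single line shown on the slide (log formats differ between MongoDB versions), and the function name slow_ops and the 100ms threshold are arbitrary choices.

```python
import re

# Matches lines such as: "... [conn7410] update sd.apiLog query: {...} 51926ms"
SLOW_OP = re.compile(r"\[(?P<conn>\w+)\] (?P<op>\w+) (?P<ns>\S+) .* (?P<ms>\d+)ms$")


def slow_ops(lines, threshold_ms=100):
    """Yield (namespace, operation, milliseconds) for operations over the threshold."""
    for line in lines:
        match = SLOW_OP.search(line.strip())
        if match and int(match.group("ms")) >= threshold_ms:
            yield match.group("ns"), match.group("op"), int(match.group("ms"))


log = [
    'Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", '
    'a: 1466, u: "blah", ua: "Server Density Android" } 51926ms',
    "Thu Oct 14 17:01:12 [conn7411] end connection",
]
print(list(slow_ops(log)))  # [('sd.apiLog', 'update', 51926)]
```

Grouping the results by namespace is a quick way to see which collections would benefit most from an explain.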
How you’ll know

2) Timeouts

               cursor timed out (20000 ms)




These slow queries will obviously cause a slowdown in your app but they may also cause
timeouts. In the PHP driver a cursor will time out after 20,000ms by default, although this is
configurable.
How you’ll know

3) Disk i/o spikes




www.flickr.com/photos/daddo83/3406962115/
You’ll see write spikes because MongoDB syncs data to disk periodically, but if you’re seeing
read spikes then that can indicate MongoDB is having to read the data files rather than
accessing data from memory. Be careful though because this won’t distinguish between data
and indexes, or even other server activity. Read spikes can also occur even if you have little
or no read activity if the mongod is part of a cluster where the slaves are reading from the
oplog.
Watch your storage

1) Pre-alloc




It sounds obvious but our statistics show that people run out of disk space suddenly, even
though there is a predictable increase over time. Remember that MongoDB pre-allocates files
before the space is used, so you’ll see your storage being used up in 2GB increments (once
you go past the smaller initial data file sizes).
Watch your storage

2) Sharding maxSize




When adding a new shard you can specify the maximum amount of data you want to store on
that shard. This isn’t a hard limit and is instead used as a guide. MongoDB will try to keep the
data balanced across all your shards so that it meets this setting but it may not. MongoDB
doesn’t currently look at actual disk levels and assumes available capacity is the same across
all nodes. As such, it’s advisable that you set this to around 70% of the total available disk
space.
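
The 70% guideline above is simple arithmetic; a sketch of it follows. It assumes that the maxSize option of the addShard command is given in megabytes; the helper names, the 500 GiB disk size and the shard address are illustrative only, and the command document is built but not sent to a server.

```python
def shard_max_size_mb(disk_bytes, percent=70):
    """Return a maxSize value in megabytes as a percentage of the disk size."""
    return disk_bytes * percent // 100 // (1024 * 1024)


def build_add_shard(host, disk_bytes):
    """Build an addShard command document with maxSize set to 70% of the disk."""
    return {"addShard": host, "maxSize": shard_max_size_mb(disk_bytes)}


# A hypothetical shard with a 500 GiB data volume.
cmd = build_add_shard("set1/rs1a:27018,rs1b:27018", 500 * 1024 ** 3)
print(cmd["maxSize"])  # 358400
# addShard is an admin command, so it would be run against the admin database.
```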
Watch your storage

3) Logging

               --quiet
               db.runCommand("logRotate");
               killall -SIGUSR1 mongod

Logging is verbose by default, so you’ll want to use the quiet option to ensure only important
things are output. And assuming you’re logging to a log file, you will want to periodically
rotate it via the MongoDB console so that it doesn’t get too big. You can also do a killall
SIGUSR1 on all your mongod processes from the shell which will cause a log rotation
(because of the SIGUSR1 flag). This is useful if you want to script log rotation or put it into a
cron job.
Watch your storage

4) Journaling
            david@rs2b ~: ls -alh /mongodbdata/journal/
            total 538M
            drwxrwxr-x 2 david david   29 Mar 20 16:50 .
            drwx------ 4 david david 4.0K Mar 13 09:50 ..
            -rw------- 1 david david 538M Mar 20 17:00 j._862
            -rw------- 1 david david   88 Mar 20 17:00 lsn




Mongo should rotate the journal files often but you need to remember that they will take up
some space too, and as new files are allocated and old ones deleted, you may see your disk
usage spiking up and down.
db.serverStatus()




The server status command provides a lot of different statistics that can help you, like this
map of traffic in central Tokyo.
db.serverStatus()

1) Used connections




www.flickr.com/photos/armchaircaver/2061231069/
Every connection to the database has an overhead. You want to reduce this number by using
persistent connections through the drivers.
db.serverStatus()

2) Available connections




Every server has its limits. If you run out of available connections then you’ll have a problem,
which will look like this in the logs.
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:32 [conn2268] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", "rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters:
[ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 }
MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2d:27018 socket exception
Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2c:27018 socket exception
Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2a:27018 socket exception
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1a") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1b") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1d") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2b") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2d:27018
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2d") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] reconnect rs2d:27018 failed
Fri Nov 19 17:24:34 [conn2343] MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2c:27018
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2c") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] reconnect rs2c:27018 failed
Fri Nov 19 17:24:34 [conn2343] MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2a:27018
Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2a") failed: No address associated with hostname
Fri Nov 19 17:24:34 [conn2343] reconnect rs2a:27018 failed
Fri Nov 19 17:24:34 [conn2343] MessagingPort say send() errno:9 Bad file descriptor (NONE)
Fri Nov 19 17:24:35 [conn2343] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", "rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters:
[ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 }
MessagingPort say send() errno:9 Bad file descriptor (NONE)
We’ve recently had this problem and it manifests itself by the logs filling up all available disk
space instantly, and in some cases completely crashing the server.
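
To catch this before the log fills the disk, the used and available connection counts can be turned into a single ratio to alert on. This is a sketch under stated assumptions: it reads the current and available fields of the connections section of serverStatus, the function names and the 80% threshold are hypothetical, and the sample numbers are made up.

```python
def connection_usage(server_status):
    """Return the fraction of the connection limit currently in use."""
    conns = server_status["connections"]
    total = conns["current"] + conns["available"]
    return conns["current"] / total if total else 0.0


def needs_attention(server_status, threshold=0.8):
    """True when more than `threshold` of the connections are in use."""
    return connection_usage(server_status) > threshold


# Made-up sample shaped like the "connections" section of serverStatus.
sample = {"connections": {"current": 750, "available": 69}}
print(round(connection_usage(sample), 2))  # 0.92
print(needs_attention(sample))  # True
```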
connPoolStats
                  > db.runCommand("connPoolStats")
                  {
                      "hosts" : {
                          "config1:27019" : {
                              "available" : 2,
                              "created" : 6
                          },
                          "set1/rs1a:27018,rs1b:27018" : {
                              "available" : 1,
                              "created" : 249
                          },
                          ...
                      },
                      "totalAvailable" : 5,
                      "totalCreated" : 1002,
                      "numDBClientConnection" : 3490,
                      "numAScopedConnection" : 3,
                  }




connPoolStats allows you to see the connection pools that have been set up by a mongos to
connect to different members of the replica set shards. This is useful to correlate against
open file descriptors so you can see if there are suddenly a large number of connections, or if
there is a low number of available connections across your entire cluster.
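
A small sketch of how that output could be scanned for pools that are running low. The function name low_pools and the min_available threshold are hypothetical; the sample is a trimmed copy of the connPoolStats output shown above.

```python
def low_pools(pool_stats, min_available=2):
    """Return hosts whose pool has fewer than min_available idle connections."""
    return sorted(
        host
        for host, pool in pool_stats["hosts"].items()
        if pool["available"] < min_available
    )


# Trimmed copy of the connPoolStats output shown above.
pool_stats = {
    "hosts": {
        "config1:27019": {"available": 2, "created": 6},
        "set1/rs1a:27018,rs1b:27018": {"available": 1, "created": 249},
    },
    "totalAvailable": 5,
    "totalCreated": 1002,
}
print(low_pools(pool_stats))  # ['set1/rs1a:27018,rs1b:27018']
```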
db.serverStatus()

3) Index counters
                "indexCounters" : {
                    "btree" : {
                        "accesses" : 15180175,
                        "hits" : 15178725,
                        "misses" : 1450,
                        "resets" : 0,
                        "missRatio" : 0.00009551932
                    }
                },


The miss ratio is what you’re looking at here. If you’re seeing a lot of index misses then you
need to look at your queries to see if they’re making optimal use of the indexes you’ve
created. You should consider adding new indexes and seeing if your queries run faster as a
result. You can use the explain syntax to see which indexes queries are hitting, and the total
execution time so you can benchmark them before and after.
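
The missRatio in the output above is simply misses divided by accesses. Because the counters are cumulative, a recent problem can be hidden by a long healthy history, so one option is to compute the ratio over the interval between two samples. This is a sketch with hypothetical function names; it does not handle the counters being reset between samples.

```python
def miss_ratio(btree):
    """Index miss ratio: misses divided by accesses (0.0 when idle)."""
    accesses = btree["accesses"]
    return btree["misses"] / accesses if accesses else 0.0


def interval_miss_ratio(before, after):
    """Miss ratio over the interval between two samples of the counters."""
    delta = {key: after[key] - before[key] for key in ("accesses", "misses")}
    return miss_ratio(delta)


# Counters copied from the indexCounters output above.
btree = {"accesses": 15180175, "hits": 15178725, "misses": 1450}
print(f"{miss_ratio(btree):.8f}")  # 0.00009552
```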
db.serverStatus()

4) Op counters




www.flickr.com/photos/cosmic_bandita/2395369614/
The op counters - inserts, updates, deletes and queries - are fun to look at, especially if the
numbers are high, but be careful that they don’t become just vanity metrics. There are
some things you can use them for, though. If you have a high number of inserts and updates,
i.e. writes, then you may want to look at your fsync time setting. By default this will flush to
disk every 60 seconds but if you’re doing thousands of writes per second you might want to
do this sooner for durability. Of course you can also ensure the write happens from within
the driver. Queries can show whether you need to offload reads to your slaves, which can be
done through the drivers, so that you’re spreading the load across your servers and only
writing to the master. Deletes can also cause concurrency problems if you’re doing a large
number of them and the database keeps having to yield.
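
The op counters are cumulative totals, so to get a per-second rate you need the difference between two samples. A minimal sketch, with a hypothetical function name and made-up sample values shaped like the opcounters section of serverStatus:

```python
def op_rates(before, after, seconds):
    """Per-second rate for each op counter between two serverStatus samples."""
    return {op: (after[op] - before[op]) / seconds for op in before}


# Two made-up opcounters samples taken 10 seconds apart.
first = {"insert": 1000000, "query": 500000, "update": 200000, "delete": 1000}
second = {"insert": 1030000, "query": 504000, "update": 201000, "delete": 1000}
rates = op_rates(first, second, seconds=10)
print(rates)  # {'insert': 3000.0, 'query': 400.0, 'update': 100.0, 'delete': 0.0}
```

A sustained write rate in the thousands per second is the kind of signal that would prompt a look at the flush interval discussed above.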
db.serverStatus()

5) Background flushing




Picture is unrelated! Mmm, ice cream.
The server status output allows you to see the last time data was flushed to disk, and how
long that took. This is useful to see if you’re causing high disk load but also so you can
monitor how often data is being written. Remember that whilst it isn’t synced to disk, you
could experience data loss in the event of a crash or power outage.
db.serverStatus()

6) Dur




If you have journaling enabled then serverStatus will also show some stats such as how many
commits have occurred, the amount of data written and how long various operations have
taken. This can be useful for seeing how much overhead durability adds to servers. We’ve
found no noticeable difference when enabling journaling and that’s on servers processing
billions of operations.
rs.status()

       {
           "_id" : 1,
           "name" : "rs3b:27018",
           "health" : 1,
           "state" : 2,
           "stateStr" : "SECONDARY",
           "uptime" : 1886098,
           "optime" : {
               "t" : 1291252178000,
               "i" : 13
           },
           "optimeDate" : ISODate("2010-12-02T01:09:38Z"),
           "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z")
       },


www.ex-astris-scientia.org/inconsistencies/ent_vs_tng.htm (yes it’s a replicator from Star Trek)
If you’re running a replica set then you can use the rs.status() command to get information
about the whole replica set, on any set member. This gives you a few stats about the current
member as well as a full list of every member in the set.
rs.status()

1) myState

                               Value   Meaning
                                 0     Starting up (phase 1)
                                 1     Primary
                                 2     Secondary
                                 3     Recovering
                                 4     Fatal error
                                 5     Starting up (phase 2)
                                 6     Unknown state
                                 7     Arbiter
                                 8     Down
en.wikipedia.org/wiki/State_of_matter

The rst value is myState which shows you the status of the server you executed the
command on. However, it’s also used in the list of members the command also provides so
you can see the state of any member in the replica set, as that member sees it. This is useful
to understand why members might be down because other members can’t see them.
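
The state table above translates directly into a lookup that a monitoring script could use. This is a sketch: the function name unhealthy_members, the choice of which states count as healthy, and the sample status are assumptions made for illustration.

```python
REPLICA_STATES = {
    0: "Starting up (phase 1)",
    1: "Primary",
    2: "Secondary",
    3: "Recovering",
    4: "Fatal error",
    5: "Starting up (phase 2)",
    6: "Unknown state",
    7: "Arbiter",
    8: "Down",
}


def unhealthy_members(rs_status):
    """Return (name, meaning) for members that are not primary, secondary or arbiter."""
    healthy = {1, 2, 7}
    return [
        (m["name"], REPLICA_STATES.get(m["state"], "Unknown state"))
        for m in rs_status["members"]
        if m["state"] not in healthy
    ]


# Made-up rs.status() excerpt; host names follow the naming used in this talk.
status = {
    "myState": 1,
    "members": [
        {"name": "rs3a:27018", "state": 1},
        {"name": "rs3b:27018", "state": 2},
        {"name": "rs3c:27018", "state": 8},
    ],
}
print(unhealthy_members(status))  # [('rs3c:27018', 'Down')]
```

Running the same check from more than one member helps distinguish a node that is really down from one that a single member simply cannot reach.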
rs.status()

2) Optime

         "optimeDate" : ISODate("2010-12-02T01:09:38Z")




www.flickr.com/photos/robbie73/4244846566/
Replica set members who are not master will be secondary, which means they’ll act as a slave
staying up to date with the master. The optimeDate allows you to see whether a member is
behind on the replication sync. The timestamp is the last applied log item so if it’s up to date,
it’ll be very close to the current actual time on the server.
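
One way to turn optimeDate into a lag figure is sketched below. Rather than comparing against the wall clock as described above, it compares each secondary with the primary's optimeDate, an assumed design choice that avoids reporting lag when the primary itself has had no writes. The function name and the sample data are made up.

```python
import datetime


def replication_lag(rs_status):
    """Seconds each secondary's optimeDate trails the primary's."""
    members = rs_status["members"]
    primary = next(m for m in members if m["state"] == 1)
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["state"] == 2
    }


t = datetime.datetime(2010, 12, 2, 1, 9, 38)
status = {
    "members": [
        {"name": "rs3a:27018", "state": 1, "optimeDate": t},
        {"name": "rs3b:27018", "state": 2, "optimeDate": t},
        {"name": "rs3c:27018", "state": 2,
         "optimeDate": t - datetime.timedelta(seconds=45)},
    ]
}
print(replication_lag(status))  # {'rs3b:27018': 0.0, 'rs3c:27018': 45.0}
```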
rs.status()

3) Heartbeat

         "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z")




www.flickr.com/photos/drawblindfaith/3400981091/
The whole idea behind replica sets is that they automate failover in the event of failure
somewhere. This is done by a regular heartbeat that all members send out to all other
members. The status output shows you the last time that particular member was contacted
from the current member. In the event of a network partition it may be that some members
can’t communicate with each other, and when there is an error you’ll see it in this section too.
mongostat




The mongostat tool is included as part of the standard MongoDB download and gives you a
quick, real time snapshot of the current state of your servers.
mongostat

 1) faults




Picture is unrelated! Snowmobile in Norway.
The faults column shows you the number of Linux page faults per second. This is when
Mongo accesses something that is mapped to the virtual address space but not in physical
memory, i.e. it results in a read from disk. High values here indicate you may not have
enough RAM to store all necessary data and disk accesses may start to become the
bottleneck.
mongostat

2) locked




www.flickr.com/photos/bbusschots/4541573665/
The next column is locked, which shows the % of time in a global write lock. When this is
happening no other queries will complete until the lock is given up, or the lock owner yields.
This is indicative of a large, global operation like a remove() or dropping a collection and can
result in slow performance.
mongostat

3) index miss




www.flickr.com/photos/gareandkitty/276471187/
Index miss is like we saw in the server status output except instead of an aggregate total,
you can see queries hitting (or missing) the index in real time. This is useful if you’re
debugging specific queries in development or need to track down a server that is performing
badly.
mongostat

4) queues




When MongoDB gets too many queries to handle in real time, it queues them up. This is
represented in mongostat by the read and write queue columns. When this starts to increase
you will see slowdowns in executing queries as they have to wait to run through the queue. You
can alleviate this by stopping any more queries until the queue has dissipated. Queues will
tend to spike if you’re doing a lot of write operations alongside other write heavy ops, such
as large ranged removes. The second column is the active reads and writes.
mongostat

5) Diagnostics




The last three columns show the total number of connections per server, the replica set they
belong to and the status of that server. This is useful if you need to quickly see which server
is a master in a replica set.
Current operations
    db.currentOp();
    {
        "opid" : "shard1:299939199",
        "active" : true,
        "lockType" : "write",
        "waitingForLock" : false,
        "secs_running" : 15419,
        "op" : "remove",
        "ns" : "sd.metrics",
        "query" : {
            "accId" : 1391,
            "tA" : {
                "$lte" : ISODate("2010-11-24T19:53:00Z")
            }
        },
        "client" : "10.121.12.228:44426",
        "desc" : "conn"
    },
www.flickr.com/photos/jeffhester/2784666811/
The db.currentOp() function will give you a full list of every operation currently in progress. In
this case there’s a long-running remove which has been active for over 4 hours. You can see
that it’s targeted at shard 1 and the query is based on an account ID and a timestamp. It’s
part of our retention scripts to remove older metrics data. This is useful because you can
track down long-running queries which might be hurting performance, and kill them off using
the opid.
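
Finding those long-running operations can be scripted. The sketch below assumes the currentOp result is a document with an inprog list, as returned by the shell; the function name long_running and the 60 second threshold are hypothetical, and only the first sample entry comes from the output above.

```python
def long_running(current_op, min_seconds=60):
    """Return (opid, op, ns, secs_running) for active ops running at least min_seconds."""
    return [
        (op["opid"], op["op"], op["ns"], op["secs_running"])
        for op in current_op["inprog"]
        if op.get("active") and op.get("secs_running", 0) >= min_seconds
    ]


# The first entry is taken from the output above; the second is made up.
current_op = {
    "inprog": [
        {"opid": "shard1:299939199", "active": True, "secs_running": 15419,
         "op": "remove", "ns": "sd.metrics"},
        {"opid": "shard1:299939200", "active": True, "secs_running": 0,
         "op": "query", "ns": "sd.metrics"},
    ]
}
for opid, op, ns, secs in long_running(current_op):
    print(f"{opid} {op} on {ns}: {secs / 3600:.1f} hours")
# shard1:299939199 remove on sd.metrics: 4.3 hours
```

The opid it reports is the value you would pass to db.killOp() in the shell if the operation has to be stopped.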
Monitoring tools

Server Density
Monitoring tools




www.mongomonitor.com
Recap

Keep it in RAM

Watch your storage

db.serverStatus()

rs.status()
David Mytton

 @davidmytton

david@boxedice.com

www.mongomonitor.com

Woop Japan!

Weitere ähnliche Inhalte

Was ist angesagt?

スローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudy
スローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudyスローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudy
スローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudyYusuke Yamamoto
 
Prometheus Storage
Prometheus StoragePrometheus Storage
Prometheus StorageFabian Reinartz
 
GitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedGitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedAlexey Lesovsky
 
glance replicator
glance replicatorglance replicator
glance replicatoririx_jp
 
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).Alexey Lesovsky
 
Don't dump thread dumps
Don't dump thread dumpsDon't dump thread dumps
Don't dump thread dumpsTier1app
 
Troubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterTroubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterAlexey Lesovsky
 
Apache con na_2013_updated_2016
Apache con na_2013_updated_2016Apache con na_2013_updated_2016
Apache con na_2013_updated_2016muellerc
 
Jsr107 come, code, cache, compute!
Jsr107 come, code, cache, compute!Jsr107 come, code, cache, compute!
Jsr107 come, code, cache, compute!C2B2 Consulting
 
C*ollege Credit: Creating Your First App in Java with Cassandra
C*ollege Credit: Creating Your First App in Java with CassandraC*ollege Credit: Creating Your First App in Java with Cassandra
C*ollege Credit: Creating Your First App in Java with CassandraDataStax
 
Building a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDBBuilding a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDBMongoDB
 
Use Ruby GC in full..
Use Ruby GC in full..Use Ruby GC in full..
Use Ruby GC in full..Alex Mercer
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
Go破壊
Go破壊Go破壊
Go破壊Hattori Hideo
 
MongoDB - Sharded Cluster Tutorial
MongoDB - Sharded Cluster TutorialMongoDB - Sharded Cluster Tutorial
MongoDB - Sharded Cluster TutorialJason Terpko
 
ToroDB: scaling PostgreSQL like MongoDB / Álvaro Hernåndez Tortosa (8Kdata)
ToroDB: scaling PostgreSQL like MongoDB / Álvaro Hernåndez Tortosa (8Kdata)ToroDB: scaling PostgreSQL like MongoDB / Álvaro Hernåndez Tortosa (8Kdata)
ToroDB: scaling PostgreSQL like MongoDB / Álvaro Hernåndez Tortosa (8Kdata)Ontico
 
Java 어플리케이션 성능튜닝 Part1
Java 어플리케이션 성능튜닝 Part1Java 어플리케이션 성능튜닝 Part1
Java 어플리케이션 성능튜닝 Part1상욱 송
 
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종NAVER D2
 

Was ist angesagt? (20)

スローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudy
スローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudyスローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudy
スローダウン、ハングを一発解決 スレッドダンプはトラブルシューティングの味方 #wlstudy
 
Prometheus Storage
Prometheus StoragePrometheus Storage
MongoDB Tokyo - Monitoring and Queueing

  • 1. MongoDB Queueing & Monitoring
  • 3. Server Density: 26 nodes; 6 replica sets; primary datastore = 15 nodes
  • 4. Server Density: +7TB / month; +1bn docs / month; 2-5k inserts/s @ 3ms. We use MongoDB as our primary data store but also as a queueing system. So I’m going to talk first about how we built the queuing functionality into Mongo and then more generally about what you need to keep an eye on when monitoring MongoDB in production.
  • 6. Queuing: Uses • Background processing www.flickr.com/photos/triplexpresso/496995086/
  • 7. Queuing: Uses • Background processing • Sending notifications www.flickr.com/photos/triplexpresso/496995086/
  • 8. Queuing: Uses • Background processing • Sending notifications • Event streaming www.flickr.com/photos/triplexpresso/496995086/ Asynchronous
  • 13. Queuing: Features • Consumers • Atomic • Speed • GC
  • 15. Queuing: Features • Consumers: MongoDB uses the Mongo Wire Protocol; RabbitMQ uses AMQP. If you’re building a queue, consumers connect via AMQP with RabbitMQ and via the Mongo Wire Protocol with MongoDB.
  • 17. Queuing: Features • Atomic: MongoDB uses findAndModify; RabbitMQ uses consume/ack. en.wikipedia.org/wiki/State_of_matter
  • 20. Queuing: Features •GC MongoDB RabbitMQ ☚ consume/ack
  • 21. Implementation • Consumers. There are two things we need to implement: consumers and GC.
  • 22. Implementation • Consumers db.runCommand( { findAndModify : <collection>, <options> } ) The findAndModify command takes two parameters: the collection and the options.
  • 23. Implementation • Consumers db.runCommand( { findAndModify : <collection>, <options> } ) query: filter (WHERE) { query: { inProg: false } } Specify the query just like any normal query against Mongo. The very first document that matches will be returned. Since we’re building a queuing system, we’re using a field called inProg, so we’re asking for documents where this is false, i.e. the processing of that document isn’t in progress.
  • 24. Implementation • Consumers db.runCommand( { findAndModify : <collection>, <options> } ) update: modifier object { update: { $set: {inProg: true, start: new Date()} } } The update is applied atomically to the document that the query matched.
  • 25. Implementation • Consumers db.runCommand( { findAndModify : <collection>, <options> } ) sort: selects the first one on multi-match { sort: { added: -1 } } We can also sort, e.g. on a timestamp so that the oldest documents are returned first, or you could build a priority system to return more important documents first. Note that -1 sorts in descending order, so on a timestamp field it returns the newest document first; use 1 to get the oldest first.
  • 26. Implementation • Consumers db.runCommand( { findAndModify : <collection>, <options> } ) remove: true = deletes on return; new: true = returns modified object; fields: return specific fields; upsert: true = create object if !exists()
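The consumer options described in slides 22 to 26 can be combined into a single command document. The following is a minimal sketch in Python, not the speaker's actual code: the function name `build_claim_command` is hypothetical, and the choice of an ascending sort on `added` (oldest first) and `new: True` are assumptions made for illustration. The field names `inProg`, `start` and `added` come from the slides.

```python
import datetime


def build_claim_command(collection, now):
    """Build a findAndModify command document that claims one queued job.

    `collection` is the queue collection name and `now` is the claim time.
    """
    return {
        # The command name has to be the first key; dicts keep insertion order.
        "findAndModify": collection,
        # Only consider jobs that nobody is currently processing.
        "query": {"inProg": False},
        # Mark the job as in progress and record when processing started.
        "update": {"$set": {"inProg": True, "start": now}},
        # Ascending sort: the oldest job is claimed first (assumption).
        "sort": {"added": 1},
        # Return the document as it looks after the update (assumption).
        "new": True,
    }


command = build_claim_command("queue", datetime.datetime(2012, 1, 1, 12, 0, 0))
print(command["query"], command["sort"])
```

With a driver such as pymongo, a document like this could be passed to the database's generic command method. That step is left out here so the sketch stays runnable without a server.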
  • 28. Implementation • GC now = datetime.datetime.now(); difference = datetime.timedelta(seconds=10); timeout = now - difference; queue.find({'inProg' : True, 'start' : {'$lte' : timeout} })
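The GC query on slide 28 finds jobs that were claimed more than 10 seconds ago and never finished. The slide does not say what happens to those jobs next, so the sketch below assumes they are released back to the queue by resetting `inProg`; the helper names `stale_job_filter`, `release_update` and `is_stale` are hypothetical.

```python
import datetime


def stale_job_filter(now, timeout_seconds=10):
    """Filter for jobs claimed at or before `now - timeout_seconds`."""
    cutoff = now - datetime.timedelta(seconds=timeout_seconds)
    return {"inProg": True, "start": {"$lte": cutoff}}


def release_update():
    """Update that puts a stale job back on the queue (an assumption)."""
    return {"$set": {"inProg": False}}


def is_stale(job, now, timeout_seconds=10):
    """Pure-Python version of the same predicate, useful for unit tests."""
    cutoff = now - datetime.timedelta(seconds=timeout_seconds)
    return job.get("inProg") is True and job["start"] <= cutoff


now = datetime.datetime(2012, 1, 1, 12, 0, 30)
print(stale_job_filter(now))
```

The filter and update documents could then be handed to a driver's multi-document update call. The right timeout depends on how long a job normally takes, since anything shorter would release jobs that are still being processed.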
  • 33. It’s a little different, but not entirely new. The problem is that MongoDB is fairly new and whilst it’s still just another database running on a server, there are things that are new and unusual. This means that some old assumptions are still valid, but others aren’t. You don’t have to approach it as a completely new thing, but it is a little different. There are disadvantages to this but one advantage is you can use it for novel tasks, like queuing.
  • 34. Keep it in RAM. Obviously. www.flickr.com/photos/comedynose/4388430444/ The first and most obvious thing to note is that keeping everything in RAM is faster. But what does that actually mean and how do you know when something is in RAM?
  • 35. How do you know? > db.stats() { "collections" : 3, "objects" : 379970142, "avgObjSize" : 146.4554114991488, "dataSize" : 55648683504 (51GB), "storageSize" : 61795435008, "numExtents" : 64, "indexes" : 1, "indexSize" : 21354514128 (19GB), "fileSize" : 100816388096, "ok" : 1 } http://www.flickr.com/photos/comedynose/4388430444/ The easiest way is to check the database size. The MongoDB console provides an easy way to look at the data and index sizes, and the output is provided in bytes.
  • 36. Where should it go? What? / Should it be in memory? Indexes: always. Data: if you can. http://www.flickr.com/photos/comedynose/4388430444/ In every case, having something in memory is going to be faster than not. However, that’s not always feasible if you have massive data sets. Instead, you want to make sure you always have enough RAM to store all the indexes, which is what the db.stats() output is for. And if you can, have space for data too. MongoDB is smart about its memory management so it will keep commonly accessed data in RAM where possible.
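The rule of thumb above (indexes always in RAM, data if you can) can be checked against the byte counts that db.stats() reports. This is a rough sketch: the function name `memory_fit` and the 32 GiB of RAM are hypothetical, and it ignores memory used by the operating system and other processes. The sizes are the ones from the slide's db.stats() output.

```python
GIB = 1024 ** 3


def memory_fit(stats, ram_bytes):
    """Classify what fits in RAM, given a db.stats()-style dict in bytes."""
    index_size = stats["indexSize"]
    total = index_size + stats["dataSize"]
    if total <= ram_bytes:
        return "indexes and data"
    if index_size <= ram_bytes:
        return "indexes only"
    return "neither"


# Values taken from the db.stats() output on the slide.
stats = {"dataSize": 55648683504, "indexSize": 21354514128}
print(memory_fit(stats, 32 * GIB))
```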
  • 37. How you’ll know 1) Slow queries Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", a: 1466, u: "blah", ua: "Server Density Android" } 51926ms www.flickr.com/photos/tonivc/2283676770/ Although it is not the only possible cause, a slow query can indicate insufficient memory. You may not have the optimal indexes for the query, but if indexes are being used and it’s still slow, it could be because of a disk i/o bottleneck because the data isn’t in RAM. Doing an explain on the query will show you which indexes it is using.
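Slow queries like the one above can be picked out of the mongod log by reading the duration at the end of each line. The sketch below assumes the log format shown on the slide, where the line ends in a millisecond count such as `51926ms`; the function names and the 100 ms threshold are hypothetical.

```python
import re

# Matches a trailing duration such as "51926ms" at the end of a log line.
DURATION_RE = re.compile(r"(\d+)ms\s*$")


def query_duration_ms(log_line):
    """Return the trailing duration in milliseconds, or None if absent."""
    match = DURATION_RE.search(log_line)
    return int(match.group(1)) if match else None


def slow_lines(lines, threshold_ms=100):
    """Keep only the lines whose duration is at or above the threshold."""
    return [line for line in lines
            if (query_duration_ms(line) or 0) >= threshold_ms]


line = ('Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: '
        '{ c: "android/setDeviceToken", a: 1466, u: "blah", '
        'ua: "Server Density Android" } 51926ms')
print(query_duration_ms(line))
```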
  • 38. How you’ll know 2) Timeouts cursor timed out (20000 ms) These slow queries will obviously cause a slowdown in your app but they may also cause timeouts. In the PHP driver a cursor will time out after 20,000ms by default, although this is configurable.
  • 39. How you’ll know 3) Disk i/o spikes www.flickr.com/photos/daddo83/3406962115/ You’ll see write spikes because MongoDB syncs data to disk periodically, but if you’re seeing read spikes then that can indicate MongoDB is having to read the data files rather than accessing data from memory. Be careful though because this won’t distinguish between data and indexes, or even other server activity. Read spikes can also occur even if you have little or no read activity if the mongod is part of a cluster where the slaves are reading from the oplog.
  • 40. Watch your storage 1) Pre-alloc It sounds obvious but our statistics show that people run out of disk space suddenly, even though there is a predictable increase over time. Remember that MongoDB pre-allocates files before the space is used, so you’ll see your storage being used up in 2GB increments (once you go past the smaller initial data file sizes).
  • 41. Watch your storage 2) Sharding maxSize When adding a new shard you can specify the maximum amount of data you want to store on that shard. This isn’t a hard limit and is instead used as a guide. MongoDB will try to keep the data balanced across all your shards so that it meets this setting but it may not. MongoDB doesn’t currently look at actual disk levels and assumes available capacity is the same across all nodes. As such, it’s advisable that you set this to around 70% of the total available disk space.
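The 70% guideline above is simple arithmetic on the disk size. The sketch below assumes that the addShard command of the MongoDB versions covered by this talk takes maxSize in megabytes; check this against the documentation for your version. The function names, the shard host string and the 500 GiB disk are hypothetical.

```python
MIB = 1024 * 1024


def suggested_max_size_mb(disk_bytes, percent=70):
    """Return `percent` of the disk size, expressed in whole megabytes."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return disk_bytes * percent // 100 // MIB


def build_add_shard_command(host, disk_bytes):
    """Build an addShard command document with a maxSize guide value."""
    return {"addShard": host, "maxSize": suggested_max_size_mb(disk_bytes)}


# Hypothetical shard with a 500 GiB data disk.
print(build_add_shard_command("set3/rs3a:27018", 500 * 1024 ** 3))
```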
  • 42. Watch your storage 3) Logging --quiet db.runCommand("logRotate"); killall -SIGUSR1 mongod Logging is verbose by default, so you’ll want to use the quiet option to ensure only important things are output. Assuming you’re logging to a log file, you will want to periodically rotate it via the MongoDB console so that it doesn’t get too big. You can also do a killall -SIGUSR1 on all your mongod processes from the shell, which will cause a log rotation (SIGUSR1 is the signal mongod treats as a rotate request). This is useful if you want to script log rotation or put it into a cron job.
  • 43. Watch your storage 4) Journaling david@rs2b ~: ls -alh /mongodbdata/journal/ total 538M drwxrwxr-x 2 david david 29 Mar 20 16:50 . drwx------ 4 david david 4.0K Mar 13 09:50 .. -rw------- 1 david david 538M Mar 20 17:00 j._862 -rw------- 1 david david 88 Mar 20 17:00 lsn Mongo should rotate the journal files often, but you need to remember that they will take up some space too, and as new files are allocated and old ones deleted, you may see your disk usage spiking up and down.
  • 44. db.serverStatus() The server status command provides a lot of different statistics that can help you, like this map of traffic in central Tokyo.
  • 45. db.serverStatus() 1) Used connections www.flickr.com/photos/armchaircaver/2061231069/ Every connection to the database has an overhead. You want to reduce this number by using persistent connections through the drivers.
  • 46. db.serverStatus() 2) Available connections Every server has its limits. If you run out of available connections then you’ll have a problem, which will look like this in the logs.
  • 47. Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:32 [conn2268] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", 
"rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters: [ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 } MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2d:27018 socket exception Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2c:27018 socket exception Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2a:27018 socket exception Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1a") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:33 
[conn2343] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1b") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1d") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2b") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2d:27018 Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2d") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] reconnect rs2d:27018 failed Fri Nov 19 17:24:34 [conn2343] MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2c:27018 Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2c") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] reconnect rs2c:27018 failed Fri Nov 19 17:24:34 [conn2343] MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2a:27018 Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2a") failed: No address associated with hostname Fri Nov 19 17:24:34 [conn2343] reconnect rs2a:27018 failed Fri Nov 19 17:24:34 [conn2343] MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:35 [conn2343] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", "rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters: [ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 } MessagingPort say send() errno:9 Bad file descriptor (NONE)
We’ve recently had this problem and it manifests itself by the logs filling up all available disk space instantly, and in some cases completely crashing the server.
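One way to catch this before the log fills up is to alert on connection headroom. The sketch below assumes an object shaped like the `connections` section of db.serverStatus(), i.e. `{ current, available }`; the function name and the 80% warning threshold are illustrative choices, not values from the talk.

```javascript
// Assumption: `connections` looks like db.serverStatus().connections,
// with `current` (in use) and `available` (still free) counts.
function connectionUsage(connections, warnPercent = 80) {
  const limit = connections.current + connections.available;
  // Multiply before dividing so whole percentages stay exact.
  const usedPercent = limit === 0 ? 0 : (connections.current * 100) / limit;
  return { limit, usedPercent, warn: usedPercent >= warnPercent };
}

// In the mongo shell this could be called as:
//   connectionUsage(db.serverStatus().connections)
```

Because the "Too many open files" errors come from the operating system, it is worth checking the open file limit for the mongod and mongos processes alongside this number.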
  • 48. connPoolStats > db.runCommand("connPoolStats") { "hosts" : { "config1:27019" : { "available" : 2, "created" : 6 }, "set1/rs1a:27018,rs1b:27018" : { "available" : 1, "created" : 249 }, ... }, "totalAvailable" : 5, "totalCreated" : 1002, "numDBClientConnection" : 3490, "numAScopedConnection" : 3, } connPoolStats allows you to see the connection pools that have been set up by a mongos to connect to different members of the replica set shards. This is useful to correlate against open file descriptors so you can see if there is suddenly a large number of connections, or a low number of available connections across your entire cluster.
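The per-host figures are easier to watch when reduced to a few numbers. A rough sketch, assuming the output has the `hosts` map shown above with `available` and `created` per pool; `summarisePools` is a hypothetical helper.

```javascript
// Assumption: `stats` has the shape of the connPoolStats output on the
// slide, with a `hosts` map of { available, created } per pool.
function summarisePools(stats) {
  let available = 0;
  let created = 0;
  const exhausted = []; // pools with nothing left to hand out
  for (const [host, pool] of Object.entries(stats.hosts)) {
    available += pool.available;
    created += pool.created;
    if (pool.available === 0) exhausted.push(host);
  }
  return { available, created, exhausted };
}
```

Sampling this periodically and comparing `created` between samples shows whether connections are being opened in bursts, which is the pattern to correlate with file descriptor usage.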
  • 49. db.serverStatus() 3) Index counters "indexCounters" : { "btree" : { "accesses" : 15180175, "hits" : 15178725, "misses" : 1450, "resets" : 0, "missRatio" : 0.00009551932 } }, The miss ratio is what you’re looking at here. If you’re seeing a lot of index misses then you need to look at your queries to see if they’re making optimal use of the indexes you’ve created. You should consider adding new indexes and seeing if your queries run faster as a result. You can use the explain syntax to see which indexes queries are hitting, and the total execution time so you can benchmark them before and after.
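The missRatio in the output above is misses divided by accesses (1450 / 15180175). Since the counters look cumulative, a ratio over a recent interval is often more useful than the lifetime figure. A sketch under the assumption that two samples of the `btree` section are available; both function names are hypothetical.

```javascript
// Lifetime miss ratio, matching the missRatio field in the output above.
function indexMissRatio(btree) {
  return btree.accesses === 0 ? 0 : btree.misses / btree.accesses;
}

// Miss ratio between two samples of the same counters (assumes the
// counters only increase between samples, i.e. no reset or restart).
function indexMissRatioBetween(previous, current) {
  const accesses = current.accesses - previous.accesses;
  const misses = current.misses - previous.misses;
  return accesses <= 0 ? 0 : misses / accesses;
}
```

Comparing the interval ratio before and after adding an index gives a quick check on whether the change helped.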
  • 50. db.serverStatus() 4) Op counters www.flickr.com/photos/cosmic_bandita/2395369614/ The op counters - inserts, updates, deletes and queries - are fun to look at, especially if the numbers are high. But you have to be careful that they don’t become just vanity metrics. There are some things you can use them for, though. If you have a high number of inserts and updates, i.e. writes, then you may want to look at your fsync interval (the syncdelay setting). By default this will flush to disk every 60 seconds, but if you’re doing thousands of writes per second you might want to do this sooner for durability. Of course you can also ensure the write happens from within the driver. Queries can show whether you need to offload reads to your slaves, which can be done through the drivers, so that you’re spreading the load across your servers and only writing to the master. Deletes can also cause concurrency problems if you’re doing a large number of them and the database keeps having to yield.
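The counters are running totals, so they only become useful as rates. A minimal sketch that turns two samples into operations per second, assuming objects shaped like the `opcounters` section of db.serverStatus() (fields such as insert, query, update, delete); `opRates` is a hypothetical name.

```javascript
// Assumption: `previous` and `current` are two samples of
// db.serverStatus().opcounters taken `elapsedSeconds` apart.
function opRates(previous, current, elapsedSeconds) {
  const rates = {};
  for (const op of Object.keys(current)) {
    const delta = current[op] - (previous[op] || 0);
    // A negative delta means the counters were reset, e.g. by a restart.
    rates[op] = delta < 0 ? null : delta / elapsedSeconds;
  }
  return rates;
}
```

A sustained write rate in the thousands per second is the situation where the flush interval and driver-level write guarantees discussed above start to matter.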
  • 51. db.serverStatus() 5) Background flushing Picture is unrelated! Mmm, ice cream. The server status output allows you to see the last time data was flushed to disk, and how long that took. This is useful to see if you’re causing high disk load but also so you can monitor how often data is being written. Remember that until data has been synced to disk, you could lose it in the event of a crash or power outage.
  • 52. db.serverStatus() 6) Dur If you have journaling enabled then serverStatus will also show some stats such as how many commits have occurred, the amount of data written and how long various operations have taken. This can be useful for seeing how much overhead durability adds to servers. We’ve found no noticeable difference when enabling journaling, and that’s on servers processing billions of operations.
  • 53. rs.status() { "_id" : 1, "name" : "rs3b:27018", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 1886098, "optime" : { "t" : 1291252178000, "i" : 13 }, "optimeDate" : ISODate("2010-12-02T01:09:38Z"), "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z") }, www.ex-astris-scientia.org/inconsistencies/ent_vs_tng.htm (yes it’s a replicator from Star Trek) If you’re running a replica set then you can use the rs.status() command to get information about the whole replica set, on any set member. This gives you a few stats about the current member as well as a full list of every member in the set.
  • 54. rs.status() 1) myState Value: Meaning. 0: Starting up (phase 1); 1: Primary; 2: Secondary; 3: Recovering; 4: Fatal error; 5: Starting up (phase 2); 6: Unknown state; 7: Arbiter; 8: Down. en.wikipedia.org/wiki/State_of_matter The first value is myState, which shows you the state of the server you executed the command on. The same state values are used in the list of members the command provides, so you can see the state of any member in the replica set as seen by the member you ran the command on. This is useful for understanding why members might be reported as down: it may be that other members can’t see them.
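The state table on this slide translates directly into a lookup. The sketch below uses the state values listed on the slide and assumes each entry in the `members` list has `name` and `state` fields, as in the rs.status() output shown earlier; the helper names are hypothetical, and treating only primary, secondary and arbiter as healthy is a simplifying assumption.

```javascript
// State values as listed on the slide.
const REPLICA_STATES = {
  0: "Starting up (phase 1)",
  1: "Primary",
  2: "Secondary",
  3: "Recovering",
  4: "Fatal error",
  5: "Starting up (phase 2)",
  6: "Unknown state",
  7: "Arbiter",
  8: "Down",
};

function describeState(code) {
  return REPLICA_STATES[code] || "Unrecognised state " + code;
}

// Names of members that are not primary, secondary or arbiter.
function membersNeedingAttention(members) {
  const healthy = [1, 2, 7];
  return members
    .filter((member) => !healthy.includes(member.state))
    .map((member) => member.name + ": " + describeState(member.state));
}
```

Running `membersNeedingAttention(rs.status().members)` from more than one member is a quick way to see whether a node is really down or just unreachable from one side.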
  • 55. rs.status() 2) Optime "optimeDate" : ISODate("2010-12-02T01:09:38Z") www.flickr.com/photos/robbie73/4244846566/ Replica set members who are not master will be secondary, which means they’ll act as a slave staying up to date with the master. The optimeDate allows you to see whether a member is behind on the replication sync. The timestamp is the last applied log item so if it’s up to date, it’ll be very close to the current actual time on the server.
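Replication lag can be estimated by comparing optimeDate with the current time, as described above. A minimal sketch, assuming optimeDate arrives as a JavaScript Date (which is what ISODate produces in the mongo shell) and that the clocks involved are reasonably in sync; `replicationLagSeconds` is a hypothetical helper.

```javascript
// Assumption: `optimeDate` is a Date taken from a member entry in
// rs.status(), and `now` comes from a clock in sync with the server.
function replicationLagSeconds(optimeDate, now = new Date()) {
  const lagMs = now.getTime() - optimeDate.getTime();
  // Clamp at zero so small clock skew does not produce negative lag.
  return Math.max(0, Math.round(lagMs / 1000));
}
```

On a set with few writes the last applied operation may simply be old, so comparing a secondary's optimeDate against the primary's is a stricter check than comparing against wall-clock time.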
  • 56. rs.status() 3) Heartbeat "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z") www.flickr.com/photos/drawblindfaith/3400981091/ The whole idea behind replica sets is that they automate failover in the event of failure somewhere. This is done by a regular heartbeat that all members send out to all other members. The status output shows you the last time that particular member was contacted from the current member. In the event of a network partition it may be that some members can’t communicate with each other, and when there is an error you’ll see it in this section too.
  • 57. mongostat The mongostat tool is included as part of the standard MongoDB download and gives you a quick, real time snapshot of the current state of your servers.
  • 58. mongostat 1) faults Picture is unrelated! Snowmobile in Norway. The faults column shows you the number of Linux page faults per second. This is when Mongo accesses something that is mapped to the virtual address space but not in physical memory, i.e. it results in a read from disk. High values here indicate you may not have enough RAM to store all necessary data, and disk accesses may start to become the bottleneck.
  • 59. mongostat 2) locked www.flickr.com/photos/bbusschots/4541573665/ The next column is locked, which shows the % of time in a global write lock. When this is happening no other queries will complete until the lock is given up, or the lock owner yields. This is indicative of a large, global operation like a remove() or dropping a collection and can result in slow performance.
  • 60. mongostat 3) index miss www.flickr.com/photos/gareandkitty/276471187/ Index miss is like we saw in the server status output except instead of an aggregate total, you can see queries hitting (or missing) the index in real time. This is useful if you’re debugging specific queries in development or need to track down a server that is performing badly.
  • 61. mongostat 4) queues When MongoDB gets too many queries to handle in real time, it queues them up. This is represented in mongostat by the read and write queue columns. When this starts to increase you will see slowdowns in executing queries as they have to wait to run through the queue. You can alleviate this by stopping any more queries until the queue has dissipated. Queues will tend to spike if you’re doing a lot of write operations alongside other write-heavy ops, such as large ranged removes. The second column is the active reads and writes.
  • 62. mongostat 5) Diagnostics The last three columns show the total number of connections per server, the replica set they belong to and the status of that server. This is useful if you need to quickly see which server is a master in a replica set.
  • 63. Current operations db.currentOp(); { "opid" : "shard1:299939199", "active" : true, "lockType" : "write", "waitingForLock" : false, "secs_running" : 15419, "op" : "remove", "ns" : "sd.metrics", "query" : { "accId" : 1391, "tA" : { "$lte" : ISODate("2010-11-24T19:53:00Z") } }, "client" : "10.121.12.228:44426", "desc" : "conn" }, www.flickr.com/photos/jeffhester/2784666811/ The db.currentOp() function will give you a full list of every operation currently in progress. In this case there’s a long-running remove which has been active for over 4 hours. You can see that it’s targeted at shard 1 and the query is based on an account ID and a timestamp. It’s part of our retention scripts to remove older metrics data. This is useful because you can track down long-running queries which might be hurting performance, and kill them off using the opid.
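Picking out long-running operations from this output is a simple filter. The sketch below assumes a list of entries shaped like the one on the slide (with `active`, `secs_running`, `opid`, `op` and `ns`), which in the mongo shell would come from `db.currentOp().inprog`; `longRunningOps` is a hypothetical helper.

```javascript
// Assumption: `inprog` is a list of operations shaped like the
// db.currentOp() entry shown on the slide.
function longRunningOps(inprog, minSeconds) {
  return inprog
    .filter((op) => op.active && (op.secs_running || 0) >= minSeconds)
    .map((op) => ({
      opid: op.opid,
      op: op.op,
      ns: op.ns,
      // Hours to one decimal place, for readability.
      hoursRunning: Math.round(op.secs_running / 360) / 10,
    }));
}

// In the mongo shell, after reviewing the list, an operation could be
// stopped with db.killOp(<opid>).
```

It is worth reviewing the list by hand before killing anything, since replication and other internal operations can also appear as long-running.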
  • 72. Recap Keep it in RAM Watch your storage db.serverStatus() rs.status()