MongoDB - visualisation of slow operations

1
MongoDB
visualisation of slow operations
Kay Agahd
4 June 2013

2
idealo and MongoDB
●
idealo = Europe's leading price comparison web site
●
Germany, Austria, United Kingdom, France, Italy, Poland and Spain
●
250 million offers online (May 2013)
●
fast growing
●
different types of databases (MySQL, Oracle, MongoDB)
●
MongoDB in production since v1.6
●
sharding in production since MongoDB v1.8
●
MongoDB stores offers for back-end usage
●
30 MongoDB servers for offerStore + 3 servers for offerHistory
●
15 MongoDB servers for other purposes
●
nearly 15 TB of data all together

3
Review profiling
●
MongoDB supports profiling of “slow” operations
●
“slow” is a threshold to be set when turning profiling on (default 100 ms)
●
profiling per-database or per-instance on a running mongod
●
profiler writes collected data to a capped collection “system.profile”
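
A minimal sketch of the points above, assuming a mongo shell style `db` handle: `db.setProfilingLevel(level, slowms)` switches the profiler on for the current database, and the newest entries can be read back from `system.profile`. The helper names `enableSlowOpProfiling` and `latestSlowOps` are hypothetical and only serve the illustration.

```javascript
// Hypothetical helpers; `db` is assumed to be a mongo shell database handle.

// Level 1 profiles only operations slower than slowMs milliseconds
// (level 2 would profile every operation, level 0 turns profiling off).
function enableSlowOpProfiling(db, slowMs) {
  return db.setProfilingLevel(1, slowMs);
}

// Read the newest profiled operations from the capped collection.
function latestSlowOps(db, limit) {
  return db.getCollection("system.profile")
    .find()
    .sort({ ts: -1 })
    .limit(limit)
    .toArray();
}
```

In the shell this could be used as `enableSlowOpProfiling(db, 100)` followed by `latestSlowOps(db, 10)`.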

4
Example of slow op entry (1/2)
{
  "ts" : ISODate("2013-04-05T01:41:31.710Z"),
  "op" : "getmore",
  "ns" : "offerStore.offer",
  "query" : {
    "query" : {
      "shopId" : 123,
      "onlineProductIds" : {"$ne" : null},
      "smallPicture" : {"$ne" : null},
      "_id" : {"$gt" : 1555008076},
      "lastChange" : {"$gt" : ISODate("2013-04-02T22:00:00Z")}
    },
    "orderby" : {
      "_id" : 1
    }
  },
  "cursorid" : NumberLong("5773493375904448215"),
  "ntoreturn" : 500,
  ...

5
Example of slow op entry (2/2)
  ...
  "keyUpdates" : 0,
  "numYield" : 2350,
  "lockStats" : {
    "timeLockedMicros" : {
      "r" : NumberLong(8724165),
      "w" : NumberLong(0)
    },
    "timeAcquiringMicros" : {
      "r" : NumberLong(5321722),
      "w" : NumberLong(7)
    }
  },
  "nreturned" : 500,
  "responseLength" : 94656,
  "millis" : 5322,
  "client" : "172.16.65.202",
  "user" : "pl_parser"
}

6
Inconveniences
●
each mongod needs to be handled separately
●
replSet: connect to master and every slave
●
sharding: incomplete view through router, thus replSet * n shards
●
gives only a view on a limited time span due to capped collection
●
different formats of the “query” field make querying more difficult
●
bug: ops through mongos omit the user (JIRA: SERVER-7538)

7
Example of different formats/schemata
- “query” as flat document:
{ "query" : { "shopId" : 123,
              "onlineProductIds" : { "$ne" : null } },
  "user" : "pl_parser" }
- “query” embedded:
{ "query" : { "query" : { "shopId" : 123 },
              "orderby" : { "_id" : NumberLong(1) } },
  "user" : "pl_parser" }
- “query” embedded as $query:
{ "query" : { "$query" : { "shopId" : 123 },
              "$orderby" : { "_id" : NumberLong(1) },
              "$comment" : "profiling comment" },
  "user" : "pl_parser" }
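
One way to cope with the three layouts is to unwrap them into a single shape before storing or querying them. The sketch below is an assumption about how this could be done, not the collector's actual code; `unwrapQuery` and `isPlainObject` are hypothetical names.

```javascript
// Hypothetical helper: true for plain documents, false for arrays, null and scalars.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Hypothetical helper: bring the three "query" layouts into one shape,
// { filter: <queried document>, sort: <orderby document> }.
function unwrapQuery(entry) {
  var q = entry.query || {};
  if (isPlainObject(q.$query)) {
    // layout 3: { $query: {...}, $orderby: {...}, $comment: "..." }
    return { filter: q.$query, sort: q.$orderby || {} };
  }
  if (isPlainObject(q.query)) {
    // layout 2: { query: {...}, orderby: {...} }
    // Caveat: a flat query on a sub-document field that is itself named
    // "query" would be mistaken for this layout.
    return { filter: q.query, sort: q.orderby || {} };
  }
  // layout 1: flat document, no sort information
  return { filter: q, sort: {} };
}
```

For all three examples above the unwrapped filter then starts with the `shopId` condition, whatever the original nesting was.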

8
idealo requirements
●
quick overview of types of slow-ops and their quantity within a time period
(“types” means op type, user, server, queried and sorted fields)
●
historical view to see how slow ops evolve and to extrapolate the trend
●
discovering spikes in time or in slow-op types
●
filtering by slow-op types and/or time range to drill down

9
Goals
●
faster queries
●
better adapted indexes
●
better adapted data schema
●
higher throughput by smarter workflow

10
Steps to go
●
two global steps:
●
1) collect and aggregate slow ops from all mongod's into one global
collection
●
2) GUI to query and show results

11
Step 1 of 2
●
global collection:
●
allows easy and fast querying of the whole MongoDB (sharded) system
●
keeps historical data (no capped collection)
●
located on another replSet to avoid interfering with profiled mongod's
●
collector:
●
guarantee that only one instance is running at a time (or add logic to avoid
duplicate entries)
●
use tailable cursors to collect data from profiled mongod's
●
in case of failure: reconnect before unread data gets overwritten in the capped collection, but throttle retries to avoid a DoS
●
monitor it (nagios etc.)
●
profiled entries:
●
reduce size by keeping only interesting fields
●
make them easier to query (i.e. only 1 schema)
●
reduce “query” and “orderby” to the names of the fields they contain (drop the values)
●
choose short field names
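
Two of the collector points above, avoiding duplicate entries and reconnecting without flooding a failing mongod, can be sketched as small pure functions. This is an illustration under assumptions, not the actual collector: `resumeFilter` and `reconnectDelayMs` are hypothetical names, and the doubling schedule is just one common choice.

```javascript
// Hypothetical helpers for a collector loop; names and schedule are assumptions.

// Filter for (re)opening the cursor on system.profile: only entries newer
// than the last one already copied, so a restart does not store duplicates.
// Note: entries sharing the exact timestamp of lastTs would be skipped.
function resumeFilter(lastTs) {
  return lastTs ? { ts: { $gt: lastTs } } : {};
}

// Delay before reconnect attempt number `attempt` (0-based). It doubles on
// every attempt and is capped at maxMs, so retries stay frequent enough to
// catch up before the capped collection wraps, without hammering the server.
function reconnectDelayMs(attempt, baseMs, maxMs) {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

With `baseMs = 500` and `maxMs = 30000` the delays would be 500, 1000, 2000, 4000 ms and so on, never exceeding 30 seconds.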

12
slow-op example
●
the slow-op example above becomes:
{
  "_id" : ObjectId("512e43099bbcf52b9aff3602"),
  "ts" : ISODate("2013-04-05T01:41:31.710Z"),
  "adr" : "s233.ipx",
  "op" : "getmore",
  "fields" :
    ["shopId","onlineProductIds","smallPicture","_id","lastChange"],
  "sort" : ["_id"],
  "nret" : 500,
  "reslen" : 94656,
  "millis" : 5322,
  "user" : "pl_parser"
}
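
The reduction from a profiler entry to this compact document could look roughly like the following sketch. It assumes the queried document and the orderby document have already been extracted from the entry and are passed in as `filter` and `sort`; `compactSlowOp` is a hypothetical name, and only top-level field names are considered (operators such as `$or` are not descended into).

```javascript
// Hypothetical helper: reduce a profiler entry to the compact layout.
// `adr` is the address of the profiled mongod, `filter` and `sort` are the
// queried document and the orderby document taken from the entry.
function compactSlowOp(entry, adr, filter, sort) {
  return {
    ts: entry.ts,
    adr: adr,
    op: entry.op,
    fields: Object.keys(filter || {}), // field names only, values dropped
    sort: Object.keys(sort || {}),
    nret: entry.nreturned,
    reslen: entry.responseLength,
    millis: entry.millis,
    user: entry.user
  };
}
```

The `_id` is left out here on the assumption that it is generated when the document is inserted into the global collection.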

13
Step 2 of 2
●
GUI:
●
x-axis = execution time
●
y-axis = duration of slow op
●
size of point = quantity of slow-op type
●
zoomable in x or y axis

14
How to query slow ops
●
group by time component allows resolution by year, month, week etc.
●
group by server address, user, operation, queried fields and sorted fields
allows defining different slow-op types
●
filter allows focusing on a time period and specific slow ops
●
use slavePreferred option
●
error handling, e.g. when the result exceeds the maximum of 16 MB

15
Query example
// filter
{$match:{ts : {$gt : #, $lt : # },
   fields : {$all : ["_id","shopId","bokey"]}}},
{$group:{_id : {
   // slow-op type
   op : "$op",
   user : "$user",
   fields : "$fields",
   // resolution
   year : { $year : "$ts" },
   month : { $month : "$ts" },
   dayOfMonth : { $dayOfMonth : "$ts" },
   hour : { $hour : "$ts" }},
   // data
   count : { $sum : 1 },
   millis : { $sum : "$millis" },
   avgMs : { $avg : "$millis" },
   minMs : { $min : "$millis" },
   maxMs : { $max : "$millis" },
   firstts : { $first : "$ts" }}},
{ $sort:{ firstts : 1 }}
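
Since resolution and filter vary per request, the pipeline can be assembled from parameters. The sketch below is one possible way to do this and not necessarily how the GUI builds its query; `buildSlowOpPipeline` and `TIME_PARTS` are hypothetical names, while `$year`, `$month`, `$dayOfMonth`, `$hour` and `$minute` are standard aggregation date operators.

```javascript
// Hypothetical helper: build the aggregation pipeline for a time resolution.
var TIME_PARTS = ["year", "month", "dayOfMonth", "hour", "minute"];

function buildSlowOpPipeline(from, to, resolution, fields) {
  // filter: time range, optionally restricted to ops using all given fields
  var match = { ts: { $gt: from, $lt: to } };
  if (fields && fields.length > 0) {
    match.fields = { $all: fields };
  }
  var last = TIME_PARTS.indexOf(resolution);
  if (last < 0) {
    throw new Error("unknown resolution: " + resolution);
  }
  // slow-op type
  var id = { op: "$op", user: "$user", fields: "$fields" };
  // resolution: every time part down to the requested one
  for (var i = 0; i <= last; i++) {
    var part = {};
    part["$" + TIME_PARTS[i]] = "$ts";
    id[TIME_PARTS[i]] = part;
  }
  return [
    { $match: match },
    { $group: {
        _id: id,
        count: { $sum: 1 },
        millis: { $sum: "$millis" },
        avgMs: { $avg: "$millis" },
        minMs: { $min: "$millis" },
        maxMs: { $max: "$millis" },
        firstts: { $first: "$ts" }
    } },
    { $sort: { firstts: 1 } }
  ];
}
```

Calling it with the resolution `"hour"` and the fields `["_id","shopId","bokey"]` yields the same stages as the example above; `"minute"` adds one more grouping key.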

18
Resolution by minute & filter

19
dygraph.js
●
general syntax:
<script type="text/javascript">
   g = new Dygraph(document.getElementById("graph"),
    "xname,   graph1name,   graph2name,   ..., graphNname\n" +
    "xvalue1, graph1value1, graph2value1, ..., graphNvalue1\n" +
    "xvalue2, graph1value2, graph2value2, ..., graphNvalue2\n" +
    ...
    "xvalueN, graph1valueN, graph2valueN, ..., graphNvalueN\n"
   );
</script>
●
example for 2 slow-op types:
<script type="text/javascript">
   g = new Dygraph(document.getElementById("graph"),
    "Date,op=query;fields=[_id;shopId],n,min,max,op=query;fields=[_id],n,min,max\n" +
    "2013/03/17,  5.4, 10, 3.2,  7.8,            10.4, 123, 3.1, 20.2\n" +
    "2013/03/18, 12.4, 23, 3.4, 55.8,               0,   0,   0,    0\n" +
    "2013/03/19,    0,  0,   0,    0,            33.5,  66, 3.1, 89.3\n"
   );
</script>
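
The CSV string passed to Dygraph can be generated from the aggregated results. The following sketch assumes an intermediate structure that is not part of the slides: `labels` lists the slow-op types, and each row carries a `date` plus a `values` map from label to `{ avg, n, min, max }`. `toDygraphCsv` is a hypothetical name.

```javascript
// Hypothetical helper: turn aggregated slow-op data into Dygraph CSV.
// Each slow-op type contributes four columns: average, n, min, max.
// Labels must not contain commas, which is why the field list inside a
// label is separated by ";".
function toDygraphCsv(labels, rows) {
  var header = ["Date"];
  labels.forEach(function (label) {
    header.push(label, "n", "min", "max");
  });
  var lines = [header.join(",")];
  rows.forEach(function (row) {
    var cells = [row.date];
    labels.forEach(function (label) {
      // a type without slow ops in this time slot is filled with zeros
      var v = row.values[label] || { avg: 0, n: 0, min: 0, max: 0 };
      cells.push(v.avg, v.n, v.min, v.max);
    });
    lines.push(cells.join(","));
  });
  return lines.join("\n") + "\n";
}
```

The result can be passed as the second argument to `new Dygraph(...)`.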

20
dygraph.js Options 1/3
●
hide legend values from being drawn as graph:
  "Date,op=query;fields=[_id;shopId],n,min,max,op=query;fields=[_id],n,min,max\n" +
  "2013/03/17, 5.4, 10, 3.2, 7.8, 10.4, 123, 3.1, 20.2\n" +
  "2013/03/18, 12.4, 23, 3.4, 55.8, 0, 0, 0, 0\n" +
  "2013/03/19, 0, 0, 0, 0, 33.5, 66, 3.1, 89.3\n",
  {//options:
    visibility:[true, false, false, false, true, false, false, false],
    showLabelsOnHighlight:false,
    hideOverlayOnMouseOut:false,
    labelsSeparateLines: true,
    drawPoints: true,
    legend: "always",
    xlabel: "Date",
    ylabel: "seconds",
    // ... more options ...
  }
);
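
The `visibility` array follows the column layout: per slow-op type, the first column (the average) is drawn and the three helper columns n, min and max are hidden. A small sketch of how the array could be generated for any number of types; `visibilityFor` is a hypothetical name.

```javascript
// Hypothetical helper: one visible series (average) followed by three
// hidden helper series (n, min, max) for every slow-op type.
function visibilityFor(typeCount) {
  var visibility = [];
  for (var i = 0; i < typeCount; i++) {
    visibility.push(true, false, false, false);
  }
  return visibility;
}
```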

21
dygraph.js Options 2/3
●
show custom legend on mouse over:
highlightCallback: function(e, x, pts, row) {
  var text = "";
  var legend = [];
  var rangeY = g.yAxisRange();
  for (var i = 0; i < pts.length; i++) {
    // skip series whose value lies outside the visible y range
    if (pts[i].yval >= rangeY[0] && pts[i].yval <= rangeY[1]) {
      var seriesProps = g.getPropertiesForSeries(pts[i].name);
      // the three columns after the average hold n, min and max
      var count = g.getValue(row, seriesProps.column + 1);
      var minSec = g.getValue(row, seriesProps.column + 2);
      var maxSec = g.getValue(row, seriesProps.column + 3);
      if (pts[i].yval != 0 && count != 0) {
        legend.push([seriesProps.color, pts[i], count, minSec, maxSec]);
      }
    }
  }//end for
  legend.sort(function(a, b) { return b[1].yval - a[1].yval; });//sort by y values, descending
  for (var i = 0; i < legend.length; i++) {
    text += "<span style='color: " + legend[i][0] + ";'> " + legend[i][1].name +
      "</span><br/><span>" + Dygraph.dateString_(legend[i][1].xval) + " count:" +
      legend[i][2] + " minSec:" + legend[i][3] + " maxSec:" + legend[i][4] + " avgSec:" +
      legend[i][1].yval + " </span><br/>";
  }
  document.getElementById("status").innerHTML = text;
}, // ... more options ...

22
dygraph.js Options 3/3
●
draw circles with surface of count:
  drawPointCallback : function(g, seriesName, ctx, cx, cy, color, pSize){
    // lastSeries and currentRow are assumed to be declared outside the callback
    if(lastSeries != seriesName || isNaN(currentRow) ){
      lastSeries = seriesName;
      currentRow = g.getLeftBoundary_() - 1;
    }
    currentRow++;
    var col = g.indexFromSetName(seriesName);
    var count = g.getValue(currentRow, col+1);
    ctx.strokeStyle = color;
    ctx.lineWidth = 0.8;
    ctx.beginPath();
    // radius chosen so that the circle's area equals count
    ctx.arc(cx, cy, Math.sqrt(count/Math.PI), 0, 2 * Math.PI, false);
    ctx.closePath();
    ctx.stroke();
  }
}//end options
);//end dygraph
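
The radius used in the callback follows from the requirement that the circle's surface equals the count: with area = π·r² = count, the radius is r = √(count/π). A tiny sketch of that relation, with `radiusForCount` as a hypothetical helper name:

```javascript
// Hypothetical helper: radius of a circle whose area equals `count`.
// Scaling the area rather than the radius keeps a type with 4x the slow ops
// looking 4x as large instead of 16x.
function radiusForCount(count) {
  return Math.sqrt(count / Math.PI);
}
```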

24
Collector read/write status
