8. Dataset
Goal: aggregate the number of times each player was killed.
{
  "_id" : ObjectId("50fc77ee364c74eba1afe1e3"),
  "fragdate" : ISODate("2012-12-24T00:00:19.901Z"),
  "gameId" : 1221,
  "gameName" : "Christmas Blitz",
  "kill" : {
    "_id" : ObjectId("50acfd45712e8bc7832ea7cb"),
    "username" : "player002",
    "avatar" : "avatar.com/player002.png",
    "displayname" : "Sniper the Clown",
    "rank" : "Sniper",
    "motto" : "If you run, you'll just die tired."
  },
  "player" : {
    "userid" : 1,
    "username" : "ArmyD00d1221",
    "avatar" : "avatar.com/armyd00d1221.png",
    "displayname" : "Army Grunt"
  },
  "server" : "app01.fragzilla.com"
}
9. Report Details
Only aggregate kills on these three players:
• Sniper the Clown
• Kurious Killer
• My L1ttl3 P0wn13
Only on Dec 23, 2012, between 2pm and 10pm.
{
  "_id" : ObjectId("50fc77ee364c74eba1afe1e3"),
  "fragdate" : ISODate("2012-12-24T00:00:19.901Z"),
  "gameId" : 1221,
  "gameName" : "Christmas Blitz",
  "kill" : {
    "_id" : ObjectId("50acfd45712e8bc7832ea7cb"),
    "username" : "player002",
    "avatar" : "avatar.com/player002.png",
    "displayname" : "Sniper the Clown",
    "rank" : "Sniper",
    "motto" : "If you run, you'll just die tired."
  },
  "player" : {
    "userid" : 1,
    "username" : "ArmyD00d1221",
    "avatar" : "avatar.com/armyd00d1221.png",
    "displayname" : "Army Grunt"
  },
  "server" : "app01.fragzilla.com"
}
10. Relational DB
Kills: Id, fragDate, gameID, gameName, server, fkKilled, fkPlayer
Killed: Id, username, avatar, displayName, rank, motto
Player: Id, username, avatar, displayName, rank, motto
(Killed and Player could be the same table.)
11. Relational DB
Kills: Id, fragDate, gameID, gameName, server, fkKilled, fkPlayer
Killed: Id, username, avatar, displayName, rank, motto
Player: Id, username, avatar, displayName, rank, motto
(Killed and Player could be the same table.)

SELECT tk.fragDate, k.id, count(k.id)
FROM test.kills tk
JOIN players p ON tk.fkPlayer = p.id
JOIN killed k ON tk.fkKilled = k.id
WHERE k.id IN (1,2,3)
GROUP BY fragDate, k.id;
12. Sidenote: Exploration
• Software Engineering tends to have more
clearly defined goals
• Report Engineering tends to have more clearly
defined questions
17. Aggregation: Big Picture
Complexity: Mongo Queries → Aggregation Framework → Map/Reduce Implementations
• Somewhere between Mongo Queries and Map/Reduce implementations
• Best suited for totaling and averaging functions
• Similar functionality to the SQL GROUP BY clause
18. Anatomy of Aggregation Framework
db.collection.aggregate( Aggregate command
[ {do something},
{do something else}, Pipeline
Operators
{do even more stuff}
]
)
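As a concrete sketch of that shape, each stage is just a document in an array. The collection name (frags) and the stage contents below are illustrative assumptions, not taken from the talk:

```javascript
// Minimal sketch of the aggregate command shape shown above.
// The collection name ("frags") and the stage contents are
// illustrative assumptions, not from the slides.
const pipeline = [
  { $match: { server: "app01.fragzilla.com" } },        // do something
  { $group: { _id: "$gameName", frags: { $sum: 1 } } }, // do something else
  { $sort: { frags: -1 } }                              // do even more stuff
];
// In the mongo shell you would run: db.frags.aggregate(pipeline)
```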
19. Pipeline Operators
• Pipelines: transform documents from the
collection as they pass through
– grep e server.log | less
• Expressions: produce output documents
based on calculations performed on input
documents
24. Our Aggregation Query
$match:
Provides a query-like interface to filter
documents out of the aggregation
pipeline. The $match drops
documents that do not match the
condition from the aggregation
pipeline, and it passes documents that
match along the pipeline unaltered.
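A hedged sketch of such a $match stage for this report (the three kill _id values are placeholders for the real ObjectId values, and new Date stands in for the shell's ISODate helper):

```javascript
// Sketch of a $match stage for the report: only three killed players,
// only Dec 23, 2012, between 2pm and 10pm. The _id values below are
// placeholders, not the real ObjectIds.
const matchStage = {
  $match: {
    "kill._id": { $in: ["id-clown", "id-killer", "id-p0wn13"] },
    fragdate: {
      $gte: new Date("2012-12-23T14:00:00Z"), // 2pm
      $lt:  new Date("2012-12-23T22:00:00Z")  // 10pm
    }
  }
};
```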
25. Our Aggregation Query
$project: Reshapes a document
stream by renaming, adding, or
removing fields. Also use $project to
create computed values or sub-
objects
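A sketch of the reshaping described above, keeping only the killed player's displayname and the hour the frag occurred (field names assume the dataset shown earlier):

```javascript
// Sketch of a $project stage: keep the killed player's displayname
// and the hour of the frag; drop everything else (including _id).
const projectStage = {
  $project: {
    _id: 0,
    displayname: "$kill.displayname",
    eventhour: { $hour: "$fragdate" } // hour (0-23) the event occurred
  }
};
```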
26. Our Aggregation Query
$group
Groups documents together for the
purpose of calculating aggregate
values based on a collection of
documents. Practically, group often
supports tasks such as average page
views for each page in a website on a
daily basis.
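For this report, the grouping could be sketched like this: count kills per player per hour, with a compound _id subdocument and a { $sum: 1 } counter (field names assume a prior $project stage has produced displayname and eventhour):

```javascript
// Sketch of a $group stage: count kills per player per hour.
// Field names assume a prior $project stage.
const groupStage = {
  $group: {
    _id: { displayname: "$displayname", eventhour: "$eventhour" },
    numKills: { $sum: 1 } // add 1 for every document in the group
  }
};
```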
28. Our Aggregation Query
$sort
The $sort pipeline operator sorts all
input documents and returns them to
the pipeline in sorted order.
{ $sort : { <sort-key> } }
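Filling in the sort key for this report might look like the following sketch, where -1 is descending and 1 is ascending (field names assume the grouped output described earlier):

```javascript
// Sketch of a $sort stage: highest kill counts first (-1 = descending),
// then by hour (1 = ascending). Field names are assumptions based on
// the grouped documents described in the talk.
const sortStage = { $sort: { numKills: -1, "_id.eventhour": 1 } };
```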
29. Aggregation Output
Produces a document with two fields: result and ok.
{
  "result" : [
    {
      "_id" : {
        "displayname" : "My L1ttl3 P0wn13",
        "eventhour" : 21
      },
      "numKills" : 133
    },
    {
      "_id" : {
        "displayname" : "Kurious Killer",
        "eventhour" : 21
      },
      "numKills" : 130
    },
    // ******* Omitted for brevity *******
    {
      "_id" : {
        "displayname" : "Sniper the Clown",
        "eventhour" : 2
      },
      "numKills" : 6
    }
  ],
  "ok" : 1
}
37. Q/A/Comments
Will Button
willb@mylist.com
@wfbutton
Editor's notes
BREAK: Break-out and run the code for demo purposes.
Point out why you would want to do this: it's quick and easy for exploratory purposes, and also a quick-and-dirty way to validate your finished code for accurate numbers. Some customers prefer this format.
Think of $match as being similar to the query operators of a find query. The purpose of $match is to identify the documents that are relevant to your aggregation and shed the rest. In our example here, we're using two operators to match documents where the kill._id is one of three specified AND the fragdate is within the specified time window. Any documents not matching both of these criteria are discarded and not passed down to the next stage of the pipeline.
The $project operator gives you the opportunity to reformat and refine your data prior to passing it on. We're doing a couple of things: first, we're removing all the data from the document except the id and displayname from the "kill" subdocument. We're also bringing along the fragdate, but only the hour that the event occurred, since that is what we'll be aggregating on. All other data from the document is not passed on.
$group can be thought of as the main workhorse of the aggregation framework. This is where you'll calculate your aggregate values based on the documents in the pipeline. $group must specify an _id field, and it has to be called _id. This can be a dotted field path reference, a subdocument with multiple fields, or a constant value. If need be, a $project operator can rename the _id further down the pipeline. In our example, we're using a subdocument containing the displayname of the player who was killed and the eventhour for the hour when the event took place. Our aggregated value is numKills, meaning we want to track the number of times this player was killed in the documents being considered. To achieve that, we're using the $sum operator and specifying a value of 1 for a new field we're creating called "numKills". This has the effect of incrementing the value of "numKills" by 1 each time a matching document is found in the collection.
Had numKills been an existing numeric value, we could have summed those values by specifying the $sum operator with a value of "$numKills", which causes the aggregation framework to read the number found inside the numKills field. Similarly, we could have used $min, $max, $first, $last, or $avg in place of $sum to achieve different aggregation results. One thing to note: the $group operator currently performs its grouping in memory, so very large aggregations may be impacted as a result.
Last, but certainly not least: the $sort operator sorts our data in the order we specify. As parameters, it accepts an object specifying fields and 1 or -1 as the sort order (ascending or descending, respectively). This works just like the sort operator on a standard MongoDB query.
The output of the aggregation query is a document with two fields: result and ok. ok returns 1 if the query completed successfully, or an error code if it did not. The result field contains an array of documents returned by the pipeline.
If we take a closer look at the result array, we see documents with an _id field showing the displayname of the player and the hour being represented. A second field, numKills, shows the aggregate value indicating the number of times this player was killed during the match.
To recap, the aggregation query accepts a series of pipeline operators to modify and aggregate a collection. On the surface, it is that simple. In practice, its pipeline approach can produce a wide array of results.