Application Development Series
Back to Basics
Reporting & Analytics
Daniel Roberts
@dmroberts
#MongoDBBasics
2
• Recap from last session
• Reporting / Analytics options
• Map Reduce
• Aggregation Framework introduction
– Aggregation explain
• mycms application reports
• Geospatial with Aggregation Framework
• Text Search with Aggregation Framework
Agenda
3
• Virtual Genius Bar
– Use the chat to post questions
– EMEA Solution Architecture / Support team are on hand
– Make use of them during the sessions!!!
Q & A
Recap from last time….
5
Indexing
• Indexes
• Multikey, compound, 'dot.notation'
• Covered, sorting
• Text, GeoSpatial
• Btrees
> db.articles.ensureIndex( { author : 1, tags : 1 } )
> db.user.find( { user : "danr" }, { _id : 0, password : 1 } )
> db.articles.ensureIndex( { location : "2dsphere" } )
> db.articles.ensureIndex( { "$**" : "text" }, { name : "TextIndex" } )
options db.col.ensureIndex({ key : type})
6
Index performance / efficiency
• Examine index plans
• Identify slow queries
• n / nscanned ratio
• Which index used.
operators .explain() , db profiler
> db.articles.find( { author : 'Dan Roberts' } ).sort( { date : -1 } ).explain()
> db.setProfilingLevel(1, 100)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
> db.system.profile.find().pretty()
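The n / nscanned ratio from the recap can be read straight off an explain document. The sketch below is a minimal illustration, assuming the legacy explain format with top-level `n` and `nscanned` fields; the helper name and the two sample documents are hypothetical and not taken from the deck.

```python
def index_efficiency(explain_doc):
    """Return n / nscanned for a legacy explain() document.

    A value close to 1.0 means almost every scanned entry was returned;
    a value close to 0 suggests a missing or poorly matching index.
    """
    n = explain_doc.get("n", 0)
    nscanned = explain_doc.get("nscanned", 0)
    if nscanned == 0:
        # Nothing was scanned, so there is no wasted work to report
        return 1.0
    return n / nscanned


# Hypothetical explain() excerpts for illustration only
indexed = {"cursor": "BtreeCursor author_1_tags_1", "n": 6, "nscanned": 6}
unindexed = {"cursor": "BasicCursor", "n": 6, "nscanned": 1200}

print(index_efficiency(indexed))
print(index_efficiency(unindexed))
```

A ratio well below 1.0 on a frequent query is usually the cue to revisit the index definitions above.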
Reporting / Analytics options
8
• Query Language
– Leverage pre-aggregated documents
• Aggregation Framework
– Calculate new values from the data that we have
– For instance: average views, comment counts
• MapReduce
– Internal JavaScript-based implementation
– External Hadoop, using the MongoDB connector
• A combination of the above
Access data for reporting, options
9
• Immediate results
– Simple from a query perspective.
– Interactions collection
Pre Aggregated Reports
{
  '_id' : ObjectId(..),
  'article_id' : ObjectId(..),
  'section' : 'schema',
  'date' : ISODate(..),
  'daily' : { 'view' : 45, 'comments' : 150 },
  'hourly' : {
    '0' : { 'view' : 10 },
    '1' : { 'view' : 2 },
    …
    '23' : { 'view' : 14, 'comments' : 10 }
  }
}
> db.interactions.find(
    { "article_id" : ObjectId("…..") },
    { _id : 0, hourly : 1 }
  )
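A pre-aggregated document like this is typically kept current by incrementing counters on every interaction. The sketch below only builds the filter and update documents; it assumes the `daily.view` and `hourly.<hour>.view` field names used by the queries in this deck, UTC timestamps, and a hypothetical helper name and article id.

```python
from datetime import datetime, timezone


def interaction_update(article_id, when, kind="view"):
    """Build (filter, update) documents that bump the daily and hourly counters.

    `when` is assumed to be a timezone-aware UTC datetime.
    """
    # One document per article per day, keyed by midnight of that day
    day = datetime(when.year, when.month, when.day, tzinfo=timezone.utc)
    query = {"article_id": article_id, "date": day}
    update = {
        "$inc": {
            f"daily.{kind}": 1,
            f"hourly.{when.hour}.{kind}": 1,
        }
    }
    return query, update


# "article-123" stands in for a real ObjectId
query, update = interaction_update(
    "article-123", datetime(2014, 2, 20, 14, 35, tzinfo=timezone.utc)
)
print(query)
print(update)
# With PyMongo the pair could be applied as an upsert, for example:
# db.interactions.update_one(query, update, upsert=True)
```

Because the counters are maintained at write time, the report query stays a simple find on one document.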
10
• Use query result to display directly in application
– Create new REST API
– D3.js library or similar in UI
Pre Aggregated Reports
{
"hourly" : {
"0" : {
"view" : 1
},
"1" : {
"view" : 1
},
……
"22" : {
"view" : 5
},
"23" : {
"view" : 3
}
}
}
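Before handing the result to a charting library, a REST endpoint would probably flatten the nested `hourly` object into an ordered series. A minimal sketch, assuming the result shape shown above; the function name and the output keys `hour` and `count` are illustrative choices.

```python
def hourly_series(doc, field="view"):
    """Turn {"hourly": {"0": {"view": 1}, ...}} into a 24-entry list ordered by hour."""
    hourly = doc.get("hourly", {})
    return [
        # Hours with no recorded interactions are reported as zero
        {"hour": hour, "count": hourly.get(str(hour), {}).get(field, 0)}
        for hour in range(24)
    ]


sample = {"hourly": {"0": {"view": 1}, "1": {"view": 1}, "22": {"view": 5}, "23": {"view": 3}}}
series = hourly_series(sample)
print(series[22])
```

Filling the missing hours with zeros keeps the x-axis of the chart continuous.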
Map Reduce
12
• Map Reduce
– MongoDB – JavaScript
• Incremental Map Reduce
Map Reduce
// Map Reduce example
> db.articles.mapReduce(
    function() { emit(this.author, this.comment_count); },
    function(key, values) { return Array.sum(values); },
    {
      query : {},
      out : { merge : "comment_count" }
    }
  )
Output
{ "_id" : "Dan Roberts", "value" : 6 }
{ "_id" : "Jim Duffy", "value" : 1 }
{ "_id" : "Kunal Taneja", "value" : 2 }
{ "_id" : "Paul Done", "value" : 2 }
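To see what the map and reduce functions contribute, the same computation can be emulated in plain Python. This is a simplified sketch: the sample articles are invented, and unlike the server it calls the reducer exactly once per key.

```python
from collections import defaultdict


def map_article(article):
    """Map step: emit (author, comment_count) for one article."""
    yield article["author"], article["comment_count"]


def reduce_counts(key, values):
    """Reduce step: sum all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    """Group emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return [
        {"_id": key, "value": reducer(key, values)}
        for key, values in sorted(grouped.items())
    ]


# Invented sample data for illustration only
articles = [
    {"author": "Dan Roberts", "comment_count": 4},
    {"author": "Jim Duffy", "comment_count": 1},
    {"author": "Dan Roberts", "comment_count": 2},
]
results = map_reduce(articles, map_article, reduce_counts)
for row in results:
    print(row)
```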
13
MongoDB – Hadoop Connector
Hadoop Integration
[Diagram: Application instances and Dash Boards / Reporting, MongoS routers, four shards each with a Primary and two Secondaries, HDFS and MapReduce nodes. Label: 1) Data Flow, Input / Output via Application Tier]
Aggregation Framework
15
• Multi-stage pipeline
– Like a Unix pipe: "ps -ef | grep mongod"
– Aggregate data, transform documents
– Implemented in the core server
Aggregation Framework
//Find out which are the most popular tags…
db.articles.aggregate([
{ $unwind : "$tags" },
{ $group : { _id : "$tags" , number : { $sum : 1 } } },
{ $sort : { number : -1 } }
])
Output
{ "_id" : "mongodb", "number" : 6 }
{ "_id" : "nosql", "number" : 3 }
{ "_id" : "database", "number" : 1 }
{ "_id" : "aggregation", "number" : 1 }
{ "_id" : "node", "number" : 1 }
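The three stages map onto familiar operations, which a plain-Python emulation makes visible. This sketch runs in memory on invented sample articles; breaking ties by tag name is an assumption added here to make the order deterministic.

```python
from collections import Counter


def tag_counts(articles):
    """Emulate $unwind on tags, $group with $sum : 1, and $sort descending."""
    # $unwind: one entry per element of each tags array
    unwound = [tag for article in articles for tag in article.get("tags", [])]
    # $group: count occurrences of each tag
    counts = Counter(unwound)
    # $sort: highest count first, tag name as tie-breaker
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"_id": tag, "number": number} for tag, number in ordered]


# Invented sample data for illustration only
sample_articles = [
    {"tags": ["mongodb", "nosql"]},
    {"tags": ["mongodb"]},
    {"tags": ["mongodb", "database", "nosql"]},
]
popular = tag_counts(sample_articles)
for row in popular:
    print(row)
```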
16
In our mycms application..
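# Our new Python example
# Assumes the Flask `app` and the PyMongo `db` handle defined elsewhere in mycms.
import json
from bson import json_util
from flask import abort, jsonify

@app.route('/cms/api/v1.0/tag_counts', methods=['GET'])
def tag_counts():
    pipeline = [
        {"$unwind": "$tags"},
        {"$group": {"_id": "$tags", "number": {"$sum": 1}}},
        {"$sort": {"number": -1}},
    ]
    # PyMongo 2.x needs cursor={} to get a cursor back;
    # from PyMongo 3.0 aggregate() always returns a cursor.
    cur = db['articles'].aggregate(pipeline, cursor={})
    # Check everything ok
    if not cur:
        abort(400)
    # Iterate the cursor and collect the docs in a list
    tags = [tag for tag in cur]
    return jsonify({'tags': json.dumps(tags, default=json_util.default)})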
17
• Pipeline and Expression operators
Aggregation operators
Pipeline: $match, $sort, $limit, $skip, $project, $unwind, $group, $geoNear
Query operators used inside $match: $text with $search
Expression: $addToSet, $first, $last, $max, $min, $avg, $push, $sum
Arithmetic: $add, $divide, $mod, $multiply, $subtract
Conditional: $cond, $ifNull
Variables: $let, $map
Tip: Other operators for date, time, boolean and string manipulation
18
• What reports and analytics do we need in our application?
– Popular Tags
– Popular Articles
– Popular Locations – integration with Geo Spatial
– Average views per hour or day
Application Reports
19
• Unwind each 'tags' array
• Group and count each one, then sort
• Output to a new collection
– Query the new collection so we don't need to compute for every request.
Popular Tags
db.articles.aggregate([
{ $unwind : "$tags" },
{ $group : { _id : "$tags" , number : { $sum : 1 } } },
{ $sort : { number : -1 } },
{ $out : "tags"}
])
20
• Top 5 articles by average daily views
– Use the $avg operator
– Use $match to constrain the date range
• Utilise with $gt and $lt operators
Popular Articles
db.interactions.aggregate([
  { $match : { date : { $gt : ISODate("2014-02-20T00:00:00.000Z") } } },
  { $group : { _id : "$article_id", a : { $avg : "$daily.view" } } },
  { $sort : { a : -1 } },
  { $limit : 5 }
]);
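The slide mentions both $gt and $lt, while the shell example only sets a lower bound. A minimal sketch of building the same pipeline with a bounded date range from Python; the function name, the end date and the default limit are illustrative assumptions.

```python
from datetime import datetime, timezone


def top_articles_pipeline(start, end, limit=5):
    """Build the 'popular articles' pipeline for interactions between start and end."""
    return [
        # Constrain the date range with both $gt and $lt
        {"$match": {"date": {"$gt": start, "$lt": end}}},
        # Average the daily view counter per article
        {"$group": {"_id": "$article_id", "a": {"$avg": "$daily.view"}}},
        {"$sort": {"a": -1}},
        {"$limit": limit},
    ]


pipeline = top_articles_pipeline(
    datetime(2014, 2, 20, tzinfo=timezone.utc),
    datetime(2014, 3, 20, tzinfo=timezone.utc),
)
for stage in pipeline:
    print(stage)
# With PyMongo this could be run as: list(db.interactions.aggregate(pipeline))
```

Putting $match first lets the server use an index on `date` before any grouping happens.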
21
• Use the explain plan to ensure efficient use of the index when querying.
Aggregation Framework Explain
db.interactions.aggregate([
{$group : {_id: "$article_id", a : { $avg : "$daily.view"}}},
{$sort : { a : -1}},
{$limit : 5}
],
{explain : true}
);
22
Explain output…
{
  "stages" : [
    {
      "$cursor" : {
        "query" : … ,
        "fields" : { … },
        "plan" : {
          "cursor" : "BasicCursor",
          "isMultiKey" : false,
          "scanAndOrder" : false,
          "allPlans" : [
            {
              "cursor" : "BasicCursor",
              "isMultiKey" : false,
              "scanAndOrder" : false
            }
          ]
        }
      }
    },
    …
  ],
  "ok" : 1
}
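In this explain format, a "BasicCursor" plan means the stage read the collection without an index. A small sketch of checking for that automatically, assuming the "stages" / "$cursor" / "plan" layout shown above (newer server versions report plans in a different structure); the helper name and sample documents are hypothetical.

```python
def uses_collection_scan(explain_doc):
    """Return True if any $cursor stage reports a BasicCursor plan (no index used)."""
    for stage in explain_doc.get("stages", []):
        plan = stage.get("$cursor", {}).get("plan", {})
        if plan.get("cursor") == "BasicCursor":
            return True
    return False


# Hypothetical explain excerpts for illustration only
no_index = {"stages": [{"$cursor": {"plan": {"cursor": "BasicCursor"}}}], "ok": 1}
with_index = {"stages": [{"$cursor": {"plan": {"cursor": "BtreeCursor date_1"}}}], "ok": 1}

print(uses_collection_scan(no_index))
print(uses_collection_scan(with_index))
```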
Geo Spatial & Text Search
Aggregation
24
• $text operator with aggregation framework
– All articles with MongoDB
– Group by author, sort by comments count
Text Search
db.articles.aggregate([
  { $match : { $text : { $search : "mongodb" } } },
  { $group : { _id : "$author", comments : { $sum : "$comment_count" } } },
  { $sort : { comments : -1 } }
])
25
• $geoNear operator with aggregation framework
– $geoNear is a pipeline stage of its own and must come first; it cannot be nested inside $match.
– Group by author, and article count.
Utilise with Geo spatial
db.articles.aggregate([
  { $geoNear : {
      near : { type : "Point", coordinates : [-0.128, 51.507] },
      distanceField : "distance",  // required; the field name here is illustrative
      maxDistance : 5000,          // metres for a GeoJSON point
      spherical : true } },
  { $group : { _id : "$author", articleCount : { $sum : 1 } } }
])
Summary
27
• Aggregating Data…
– Map Reduce
– Hadoop
– Pre-Aggregated Reports
– Aggregation Framework
• Tune with Explain plan
• Compute on the fly or Compute and store
• Geospatial
• Text Search
Summary
28
– Operations for your application
– Scale out
– Availability
– How do we prepare for production
– Sizing
Next Session – 3rd April
1403   app dev series - session 5 - analytics

Weitere ähnliche Inhalte

Was ist angesagt?

Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1Anuj Jain
 
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDBMongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDBMongoDB
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation FrameworkTyler Brock
 
Aggregation Framework
Aggregation FrameworkAggregation Framework
Aggregation FrameworkMongoDB
 
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2MongoDB
 
2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamskyData Con LA
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...MongoDB
 
NoSQL meets Microservices - Michael Hackstein
NoSQL meets Microservices - Michael HacksteinNoSQL meets Microservices - Michael Hackstein
NoSQL meets Microservices - Michael Hacksteindistributed matters
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation FrameworkMongoDB
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkMongoDB
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015NoSQLmatters
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2MongoDB
 
Mongodb Aggregation Pipeline
Mongodb Aggregation PipelineMongodb Aggregation Pipeline
Mongodb Aggregation Pipelinezahid-mian
 
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB
 
MongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced AggregationMongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced AggregationJoe Drumgoole
 
Reducing Development Time with MongoDB vs. SQL
Reducing Development Time with MongoDB vs. SQLReducing Development Time with MongoDB vs. SQL
Reducing Development Time with MongoDB vs. SQLMongoDB
 
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)Stefan Urbanek
 
CouchDB on Android
CouchDB on AndroidCouchDB on Android
CouchDB on AndroidSven Haiges
 

Was ist angesagt? (20)

Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1
 
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDBMongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
 
Aggregation Framework
Aggregation FrameworkAggregation Framework
Aggregation Framework
 
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
 
2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
 
NoSQL meets Microservices - Michael Hackstein
NoSQL meets Microservices - Michael HacksteinNoSQL meets Microservices - Michael Hackstein
NoSQL meets Microservices - Michael Hackstein
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation Framework
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2
 
Mongodb Aggregation Pipeline
Mongodb Aggregation PipelineMongodb Aggregation Pipeline
Mongodb Aggregation Pipeline
 
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
 
MongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced AggregationMongoDB World 2016 : Advanced Aggregation
MongoDB World 2016 : Advanced Aggregation
 
MongoDB With Style
MongoDB With StyleMongoDB With Style
MongoDB With Style
 
Reducing Development Time with MongoDB vs. SQL
Reducing Development Time with MongoDB vs. SQLReducing Development Time with MongoDB vs. SQL
Reducing Development Time with MongoDB vs. SQL
 
Talk MongoDB - Amil
Talk MongoDB - AmilTalk MongoDB - Amil
Talk MongoDB - Amil
 
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
Cubes - Lightweight Python OLAP (EuroPython 2012 talk)
 
CouchDB on Android
CouchDB on AndroidCouchDB on Android
CouchDB on Android
 

Ähnlich wie 1403 app dev series - session 5 - analytics

Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesMongoDB
 
Social Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDBSocial Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDBTakahiro Inoue
 
Joins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation EnhancementsJoins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation EnhancementsAndrew Morgan
 
Schema Design by Chad Tindel, Solution Architect, 10gen
Schema Design  by Chad Tindel, Solution Architect, 10genSchema Design  by Chad Tindel, Solution Architect, 10gen
Schema Design by Chad Tindel, Solution Architect, 10genMongoDB
 
MongoDB World 2018: Keynote
MongoDB World 2018: KeynoteMongoDB World 2018: Keynote
MongoDB World 2018: KeynoteMongoDB
 
Elasticsearch first-steps
Elasticsearch first-stepsElasticsearch first-steps
Elasticsearch first-stepsMatteo Moci
 
CouchDB on Rails - RailsWayCon 2010
CouchDB on Rails - RailsWayCon 2010CouchDB on Rails - RailsWayCon 2010
CouchDB on Rails - RailsWayCon 2010Jonathan Weiss
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data ModelingDATAVERSITY
 
CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010Jonathan Weiss
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases labFabio Fumarola
 
2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_new2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_newMongoDB
 
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19Henrik Ingo
 
2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongoMichael Bright
 
Using MongoDB and Python
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and PythonMike Bright
 
How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6Maxime Beugnet
 
Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)MongoDB
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...MongoDB
 
Indexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleIndexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleMongoDB
 

Ähnlich wie 1403 app dev series - session 5 - analytics (20)

MongoDB Meetup
MongoDB MeetupMongoDB Meetup
MongoDB Meetup
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
Social Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDBSocial Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDB
 
Joins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation EnhancementsJoins and Other MongoDB 3.2 Aggregation Enhancements
Joins and Other MongoDB 3.2 Aggregation Enhancements
 
Schema Design by Chad Tindel, Solution Architect, 10gen
Schema Design  by Chad Tindel, Solution Architect, 10genSchema Design  by Chad Tindel, Solution Architect, 10gen
Schema Design by Chad Tindel, Solution Architect, 10gen
 
MongoDB World 2018: Keynote
MongoDB World 2018: KeynoteMongoDB World 2018: Keynote
MongoDB World 2018: Keynote
 
Elasticsearch first-steps
Elasticsearch first-stepsElasticsearch first-steps
Elasticsearch first-steps
 
CouchDB on Rails
CouchDB on RailsCouchDB on Rails
CouchDB on Rails
 
CouchDB on Rails - RailsWayCon 2010
CouchDB on Rails - RailsWayCon 2010CouchDB on Rails - RailsWayCon 2010
CouchDB on Rails - RailsWayCon 2010
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
 
CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010CouchDB on Rails - FrozenRails 2010
CouchDB on Rails - FrozenRails 2010
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab
 
2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_new2012 mongo db_bangalore_roadmap_new
2012 mongo db_bangalore_roadmap_new
 
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19
Whats new in mongoDB 2.4 at Copenhagen user group 2013-06-19
 
2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo
 
Using MongoDB and Python
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and Python
 
How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6
 
Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
 
Indexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleIndexing Strategies to Help You Scale
Indexing Strategies to Help You Scale
 

Mehr von MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 

1403 app dev series - session 5 - analytics

  • 1. Application Development Series: Back to Basics, Reporting & Analytics. Daniel Roberts, @dmroberts, #MongoDBBasics
  • 2. Agenda • Recap from last session • Reporting / Analytics options • Map Reduce • Aggregation Framework introduction – Aggregation explain • mycms application reports • Geospatial with Aggregation Framework • Text Search with Aggregation Framework
  • 3. Q & A • Virtual Genius Bar – Use the chat to post questions – EMEA Solution Architecture / Support team are on hand – Make use of them during the sessions!
  • 4. Recap from last time…
  • 5. Indexing • Indexes • Multikey, compound, 'dot.notation' • Covered, sorting • Text, GeoSpatial • Btrees > db.articles.ensureIndex( { author : 1, tags : 1 } ) > db.user.find( { user : "danr" }, { _id : 0, password : 1 } ) > db.articles.ensureIndex( { location : "2dsphere" } ) > db.articles.ensureIndex( { "$**" : "text" }, { name : "TextIndex" } ) options db.col.ensureIndex({ key : type })
  • 6. Index performance / efficiency • Examine index plans • Identify slow queries • n / nscanned ratio • Which index used. operators .explain() , db profiler > db.articles.find( { author : 'Dan Roberts' } ).sort( { date : -1 } ).explain() > db.setProfilingLevel(1, 100) { "was" : 0, "slowms" : 100, "ok" : 1 } > db.system.profile.find().pretty()
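The n / nscanned ratio mentioned on this slide can be computed directly from an explain document. The following is a minimal Python sketch, assuming the legacy explain format that reports `n` (documents returned) and `nscanned` (index entries or documents examined); the helper name `index_efficiency` and the sample values are hypothetical.

```python
def index_efficiency(explain_doc):
    """Return n / nscanned for a legacy-format explain document.

    A ratio close to 1.0 means almost everything scanned was returned;
    a ratio close to 0 suggests a missing or poorly matched index.
    """
    n = explain_doc.get("n", 0)
    nscanned = explain_doc.get("nscanned", 0)
    if nscanned == 0:
        # Nothing was scanned, so there is no meaningful ratio to report.
        return None
    return n / nscanned


# Hypothetical explain output, for illustration only.
sample = {"cursor": "BtreeCursor author_1", "n": 25, "nscanned": 100}
print(index_efficiency(sample))
```

Newer server versions report these counters under different names, so the field lookups would need adjusting there.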
  • 8. Access data for reporting, options • Query Language – Leverage pre-aggregated documents • Aggregation Framework – Calculate new values from the data that we have – For instance: average views, comments count • MapReduce – Internal JavaScript-based implementation – External Hadoop, using the MongoDB connector • A combination of the above
  • 9. Pre Aggregated Reports • Immediate results – Simple from a query perspective. – Interactions collection { '_id' : ObjectId(..), 'article_id' : ObjectId(..), 'section' : 'schema', 'date' : ISODate(..), 'daily' : { 'view' : 45, 'comments' : 150 }, 'hourly' : { 0 : { 'view' : 10 }, 1 : { 'view' : 2 }, … 23 : { 'view' : 14, 'comments' : 10 } } } > db.interactions.find( { "article_id" : ObjectId("…..") }, { _id : 0, hourly : 1 } )
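Pre-aggregated documents like this are typically maintained with an incrementing upsert on every page view. The slide does not show the write side, so the Python sketch below is an assumption: the helper `build_view_update` and the field paths `daily.view` and `hourly.<hour>.view` are illustrative, chosen to match the `hourly` projection used in the query on this slide.

```python
from datetime import datetime, timezone


def build_view_update(article_id, when):
    """Build the (filter, update) pair that records one article view.

    One document is kept per article per day; the daily counter and the
    counter for the current hour are both incremented by one.
    """
    day = datetime(when.year, when.month, when.day, tzinfo=when.tzinfo)
    query = {"article_id": article_id, "date": day}
    update = {
        "$inc": {
            "daily.view": 1,
            "hourly.%d.view" % when.hour: 1,
        }
    }
    return query, update


# Hypothetical article id and timestamp, for illustration only.
query, update = build_view_update(
    "article-1", datetime(2014, 3, 20, 14, 5, tzinfo=timezone.utc)
)
print(query)
print(update)
```

With PyMongo, the pair could be passed to `db.interactions.update_one(query, update, upsert=True)` so that the first view of the day creates the document.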
  • 10. Pre Aggregated Reports • Use query result to display directly in application – Create new REST API – D3.js library or similar in UI { "hourly" : { "0" : { "view" : 1 }, "1" : { "view" : 1 }, … "22" : { "view" : 5 }, "23" : { "view" : 3 } } }
  • 12. Map Reduce • Map Reduce – MongoDB – JavaScript • Incremental Map Reduce //Map Reduce Example > db.articles.mapReduce( function() { emit(this.author, this.comment_count); }, function(key, values) { return Array.sum(values); }, { query : {}, out : { merge : "comment_count" } } ) Output { "_id" : "Dan Roberts", "value" : 6 } { "_id" : "Jim Duffy", "value" : 1 } { "_id" : "Kunal Taneja", "value" : 2 } { "_id" : "Paul Done", "value" : 2 }
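To make the map, group-by-key and reduce flow concrete, here is a small in-memory Python imitation of the shell example. It is only a sketch: `map_reduce`, `map_article` and `reduce_counts` are hypothetical names, and the sample articles are invented rather than the data behind the slide's output.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Tiny in-memory imitation of the map, group-by-key, reduce flow."""
    emitted = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            emitted[key].append(value)
    return {key: reduce_fn(key, values) for key, values in emitted.items()}


def map_article(doc):
    # Mirrors emit(this.author, this.comment_count) in the shell example.
    yield doc["author"], doc["comment_count"]


def reduce_counts(key, values):
    # Mirrors Array.sum(values).
    return sum(values)


# Hypothetical sample articles, for illustration only.
articles = [
    {"author": "Dan Roberts", "comment_count": 4},
    {"author": "Dan Roberts", "comment_count": 2},
    {"author": "Jim Duffy", "comment_count": 1},
]
result = map_reduce(articles, map_article, reduce_counts)
print(result)
```

The server may call the reduce function more than once for the same key, feeding earlier results back in, which is why a reduce such as a plain sum (associative, and returning the same type it receives) is a safe choice.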
  • 13. Hadoop Integration • MongoDB – Hadoop Connector [Diagram: four Application instances, three MongoS routers, four replica sets (Primary, Secondary, Secondary), four MapReduce and four HDFS nodes, Dash Boards / Reporting] 1) Data Flow, Input / Output via Application Tier
  • 15. Aggregation Framework • Multi-stage pipeline – Like a unix pipe – "ps -ef | grep mongod" – Aggregate data, Transform documents – Implemented in the core server //Find out which are the most popular tags… db.articles.aggregate([ { $unwind : "$tags" }, { $group : { _id : "$tags" , number : { $sum : 1 } } }, { $sort : { number : -1 } } ]) Output { "_id" : "mongodb", "number" : 6 } { "_id" : "nosql", "number" : 3 } { "_id" : "database", "number" : 1 } { "_id" : "aggregation", "number" : 1 } { "_id" : "node", "number" : 1 }
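What the three stages do can be shown with an in-memory Python equivalent. This is a sketch for intuition only, not how the server executes the pipeline; the function name `tag_counts` and the sample documents are hypothetical.

```python
from collections import Counter


def tag_counts(articles):
    """In-memory equivalent of $unwind -> $group ($sum: 1) -> $sort."""
    counts = Counter()
    for article in articles:
        # $unwind: one output record per element of the tags array.
        for tag in article.get("tags", []):
            # $group with $sum: 1 counts the records per tag.
            counts[tag] += 1
    # $sort on number descending (ties broken by tag name for stable output).
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"_id": tag, "number": number} for tag, number in ordered]


# Hypothetical sample documents, for illustration only.
articles = [
    {"title": "a", "tags": ["mongodb", "nosql"]},
    {"title": "b", "tags": ["mongodb"]},
    {"title": "c", "tags": []},
]
print(tag_counts(articles))
```

Each stage consumes the output of the previous one, which is the sense in which the pipeline resembles a unix pipe.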
  • 16. In our mycms application.. //Our new python example @app.route('/cms/api/v1.0/tag_counts', methods=['GET']) def tag_counts(): pipeline = [ { "$unwind" : "$tags" }, { "$group" : { "_id" : "$tags" , "number" : { "$sum" : 1 } } }, { "$sort" : { "number" : -1 } }] cur = db['articles'].aggregate(pipeline, cursor={}) # Check everything ok if not cur: abort(400) # iterate the cursor and add docs to a dict tags = [tag for tag in cur] return jsonify({'tags' : json.dumps(tags, default=json_util.default)})
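The route on this slide combines the pipeline definition, the database call and the JSON response in one function. One way to make the query part testable without a server, sketched here under assumptions, is to move it into a helper that accepts any object with an `aggregate(pipeline)` method. The names `TAG_COUNT_PIPELINE`, `fetch_tag_counts` and `FakeArticles` are hypothetical and not part of the mycms code.

```python
TAG_COUNT_PIPELINE = [
    {"$unwind": "$tags"},
    {"$group": {"_id": "$tags", "number": {"$sum": 1}}},
    {"$sort": {"number": -1}},
]


def fetch_tag_counts(collection):
    """Run the tag-count pipeline and return the results as a list.

    `collection` only needs an aggregate(pipeline) method that returns an
    iterable of documents, which is what a PyMongo collection provides.
    """
    return list(collection.aggregate(TAG_COUNT_PIPELINE))


class FakeArticles:
    """Stand-in for db['articles'] so the sketch runs without a server."""

    def aggregate(self, pipeline):
        # Return canned documents shaped like the pipeline's real output.
        return iter([
            {"_id": "mongodb", "number": 6},
            {"_id": "nosql", "number": 3},
        ])


print(fetch_tag_counts(FakeArticles()))
```

In PyMongo 3 and later, `aggregate` returns a cursor by default, so the `cursor={}` argument shown on the slide is only needed with the older 2.x driver.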
  • 17. Aggregation operators • Pipeline and Expression operators • Pipeline: $match, $sort, $limit, $skip, $project, $unwind, $group, $geoNear, $text, $search • Expression: $addToSet, $first, $last, $max, $min, $avg, $push, $sum • Arithmetic: $add, $divide, $mod, $multiply, $subtract • Conditional: $cond, $ifNull • Variables: $let, $map • Tip: Other operators for date, time, boolean and string manipulation
  • 18. Application Reports • What reports and analytics do we need in our application? – Popular Tags – Popular Articles – Popular Locations – integration with Geo Spatial – Average views per hour or day
  • 19. Popular Tags • Unwind each 'tags' array • Group and count each one, then Sort • Output to new collection – Query from new collection so don't need to compute for every request. db.articles.aggregate([ { $unwind : "$tags" }, { $group : { _id : "$tags" , number : { $sum : 1 } } }, { $sort : { number : -1 } }, { $out : "tags" } ])
  • 20. Popular Articles • Top 5 articles by average daily views – Use the $avg operator – Use $match to constrain data range • Utilise with $gt and $lt operators db.interactions.aggregate([ { $match : { date : { $gt : ISODate("2014-02-20T00:00:00.000Z") } } }, { $group : { _id : "$article_id", a : { $avg : "$daily.view" } } }, { $sort : { a : -1 } }, { $limit : 5 } ]);
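The slide mentions both $gt and $lt, while its example only sets a lower bound. A bounded date window could be built as in the Python sketch below. This is an assumption-laden illustration: the function name `popular_articles_pipeline` is hypothetical, and it uses `$gte` rather than `$gt` so that the window is half-open (start included, end excluded).

```python
from datetime import datetime


def popular_articles_pipeline(start, end, limit=5):
    """Build a pipeline for the top articles by average daily views.

    The date window is half-open: start <= date < end.
    """
    return [
        {"$match": {"date": {"$gte": start, "$lt": end}}},
        {"$group": {"_id": "$article_id", "a": {"$avg": "$daily.view"}}},
        {"$sort": {"a": -1}},
        {"$limit": limit},
    ]


# Hypothetical one-month window, for illustration only.
pipeline = popular_articles_pipeline(datetime(2014, 2, 20), datetime(2014, 3, 20))
for stage in pipeline:
    print(stage)
```

Putting `$match` first lets the server narrow the input before grouping, and it is the stage that can make use of an index on `date`.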
  • 21. Aggregation Framework Explain • Use Explain plan to ensure the efficient use of the index when querying. db.interactions.aggregate([ { $group : { _id : "$article_id", a : { $avg : "$daily.view" } } }, { $sort : { a : -1 } }, { $limit : 5 } ], { explain : true } );
  • 22. Explain output… { "stages" : [ { "$cursor" : { "query" : { … }, "fields" : { … }, "plan" : { "cursor" : "BasicCursor", "isMultiKey" : false, "scanAndOrder" : false, "allPlans" : [ { "cursor" : "BasicCursor", "isMultiKey" : false, "scanAndOrder" : false } ] } } }, … ], "ok" : 1 }
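In this legacy explain format, a `BasicCursor` plan means the stage read the whole collection without using an index. A check for that can be sketched in Python as follows; the helper name `uses_collection_scan` is hypothetical, and the sample document only imitates the shape shown on the slide.

```python
def uses_collection_scan(explain_doc):
    """Return True if any $cursor stage reports a BasicCursor plan."""
    for stage in explain_doc.get("stages", []):
        cursor_stage = stage.get("$cursor")
        if cursor_stage is None:
            # Not a data-fetching stage (for example $group or $sort).
            continue
        plan = cursor_stage.get("plan", {})
        if plan.get("cursor") == "BasicCursor":
            return True
    return False


# Shape modelled on the explain output shown on the slide.
sample = {
    "stages": [
        {"$cursor": {"query": {}, "fields": {}, "plan": {"cursor": "BasicCursor"}}},
        {"$group": {}},
    ],
    "ok": 1,
}
print(uses_collection_scan(sample))
```

Later server versions changed the explain layout, so this lookup applies only to output shaped like the slide's example.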
  • 23. Geo Spatial & Text Search Aggregation
  • 24. Text Search • $text operator with aggregation framework – All articles with MongoDB – Group by author, sort by comments count db.articles.aggregate([ { $match : { $text : { $search : "mongodb" } } }, { $group : { _id : "$author", comments : { $sum : "$comment_count" } } }, { $sort : { comments : -1 } } ])
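A `$match` stage that uses `$text` has to be the first stage of the pipeline, and the collection needs a text index. When pipelines are assembled in application code, a small guard can catch ordering mistakes early. The Python sketch below uses a hypothetical helper name, `text_match_is_first`, and only inspects top-level `$text` keys, not ones nested inside `$and` or `$or`.

```python
def text_match_is_first(pipeline):
    """Check that any $match stage using $text is the first pipeline stage."""
    for position, stage in enumerate(pipeline):
        match = stage.get("$match")
        if match is not None and "$text" in match and position != 0:
            return False
    return True


# The pipeline from the slide, written as Python dictionaries.
pipeline = [
    {"$match": {"$text": {"$search": "mongodb"}}},
    {"$group": {"_id": "$author", "comments": {"$sum": "$comment_count"}}},
    {"$sort": {"comments": -1}},
]
print(text_match_is_first(pipeline))
```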
  • 25. Utilise with Geo spatial • $geoNear operator with aggregation framework – $geoNear is a pipeline stage and must come first; it cannot be used as an operator inside $match – Group by author, and article count. db.articles.aggregate([ { $geoNear : { near : { type : "Point", coordinates : [-0.128, 51.507] }, distanceField : "distance", maxDistance : 5000, spherical : true } }, { $group : { _id : "$author", articleCount : { $sum : 1 } } } ])
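The geospatial step keeps only articles whose location lies within 5000 metres of the given point. The server answers this with a 2dsphere index; the pure-Python sketch below only illustrates the distance check itself. The Earth radius constant, the helper names `haversine_m` and `within`, and the sample articles are all assumptions of this sketch.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; an assumption of this sketch


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within(center, articles, max_distance_m):
    """Keep articles whose GeoJSON point lies within max_distance_m of center."""
    lon, lat = center
    return [
        article for article in articles
        if haversine_m(lon, lat, *article["location"]["coordinates"]) <= max_distance_m
    ]


# Hypothetical sample articles; GeoJSON coordinates are [longitude, latitude].
articles = [
    {"author": "A", "location": {"type": "Point", "coordinates": [-0.1276, 51.5033]}},
    {"author": "B", "location": {"type": "Point", "coordinates": [2.3522, 48.8566]}},
]
print([a["author"] for a in within((-0.128, 51.507), articles, 5000)])
```

Note the coordinate order: GeoJSON points list longitude first, then latitude, as in the slide's `[-0.128, 51.507]`.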
  • 27. Summary • Aggregating Data… – Map Reduce – Hadoop – Pre-Aggregated Reports – Aggregation Framework • Tune with Explain plan • Compute on the fly or Compute and store • Geospatial • Text Search
  • 28. Next Session – 3rd April – Operations for your application – Scale out – Availability – How do we prepare for production – Sizing

Editor's notes

  1. db.interactions.find({"article_id" : ObjectId("532198379fb5ba99a6bd4063")}); db.interactions.find({"article_id" : ObjectId("532198379fb5ba99a6bd4063")},{_id:0,hourly:1})