Aggregation
Indexing
Profiling
Rohit Kumar
rohit.kumar@tothenew.com
Manish Kapoor
manish.kapoor@tothenew.com
Agenda
● Recap of insertion and finding documents
● Aggregation Framework
o The Aggregation Pipeline
o Aggregation Operators
o Mapping between SQL and Aggregation
o Limitations of the Aggregation Framework
● Map-reduce Queries
● Performance
o Indexes
o Logging slow queries
o Profiling
RECAP
First things first
Aggregation Pipeline
The MongoDB aggregation pipeline consists of stages. Documents pass
through each stage in order, and the output of each stage is the input of the next.
Some stages generate new documents; others filter documents out.
A stage can appear multiple times in the same pipeline.
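As a rough illustration of how documents flow from one stage to the next, the sketch below chains plain JavaScript functions over an in-memory array. The helpers (match, sortBy, limit, runPipeline) and the sample documents are hypothetical stand-ins invented for this sketch, not MongoDB APIs. Against a real server the equivalent would be an aggregate() call with $match, $sort and $limit stages.

```javascript
// Hypothetical in-memory stand-ins for the $match, $sort and $limit stages.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));
const limit = (n) => (docs) => docs.slice(0, n);

// Each stage receives the output of the previous stage as its input.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Sample data invented for this sketch.
const students = [
  { name: "Asha", age: 19, city: "Delhi" },
  { name: "Ravi", age: 24, city: "Chennai" },
  { name: "Meena", age: 22, city: "Chennai" },
  { name: "Karan", age: 27, city: "Delhi" },
];

const oldestTwoOver20 = runPipeline(students, [
  match((doc) => doc.age > 20), // like {$match: {age: {$gt: 20}}}
  sortBy("age", -1),            // like {$sort: {age: -1}}
  limit(2),                     // like {$limit: 2}
]);

console.log(oldestTwoOver20.map((doc) => doc.name));
```

Reordering the stages changes the result, which is why the order of a pipeline matters: applying limit(2) before match would filter only the first two documents.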
What is the aggregation pipeline?
Aggregation Pipeline
Aggregation Pipeline Stages
Pipeline Stages $group
{$group: {_id: '$city', names: {$push: '$name'}}}
Group
Pipeline Stages $group
{$group: {_id: '$city', count: {'$sum':1}}}
Group
Pipeline Stages $group
{$group: {_id: '$city', maxAge: {$max: '$age'}}}
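To make concrete what the $push, $sum and $max accumulators collect per group, here is a small in-memory sketch in plain JavaScript. The groupByCity helper and the sample documents are hypothetical. On a server, the three accumulators could be combined in one stage, for example {$group: {_id: '$city', names: {$push: '$name'}, count: {$sum: 1}, maxAge: {$max: '$age'}}}.

```javascript
// Sample documents invented for this sketch.
const people = [
  { name: "Asha", age: 19, city: "Delhi" },
  { name: "Ravi", age: 24, city: "Chennai" },
  { name: "Meena", age: 22, city: "Chennai" },
];

// Hypothetical helper: groups by `city` and applies the three accumulators
// shown on the slides ($push on name, $sum: 1, $max on age).
function groupByCity(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.city)) {
      groups.set(doc.city, { _id: doc.city, names: [], count: 0, maxAge: -Infinity });
    }
    const group = groups.get(doc.city);
    group.names.push(doc.name);                     // $push: '$name'
    group.count += 1;                               // $sum: 1
    group.maxAge = Math.max(group.maxAge, doc.age); // $max: '$age'
  }
  return [...groups.values()];
}

const grouped = groupByCity(people);
console.log(grouped);
```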
Some more accumulator operators
Pipeline Stages $match
{$match: {age:{$gt:20}}}
Pipeline Stages $limit
{$limit: 10}
Pipeline Stages $sort
{$sort: {age:-1}}
Pipeline Stages $unwind
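The $unwind stage outputs one document per element of an array field, for example {$unwind: '$hobbies'}. The sketch below imitates that behaviour in plain JavaScript. The unwind helper and the hobbies field are hypothetical, and the handling of missing or empty arrays mirrors the stage's default behaviour as an assumption of this sketch.

```javascript
// Hypothetical stand-in for {$unwind: '$<field>'}: one output document per array element.
// Documents whose field is missing, null or an empty array produce no output,
// which is the default when preserveNullAndEmptyArrays is not set.
function unwind(docs, field) {
  return docs.flatMap((doc) => {
    const value = doc[field];
    if (value === undefined || value === null) return [];
    const elements = Array.isArray(value) ? value : [value];
    return elements.map((element) => ({ ...doc, [field]: element }));
  });
}

// Sample data invented for this sketch.
const input = [
  { name: "Asha", hobbies: ["chess", "cricket"] },
  { name: "Ravi", hobbies: [] },
];

const unwound = unwind(input, "hobbies");
console.log(unwound);
```

A common follow-up is a $group on the unwound field, for example to count how many students share each hobby.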
Limitations of aggregation framework
● Result Size Restrictions
● Memory Restrictions
● Limited use of index
● Sharded collections
Exercises
1. Count the number of students, grouped by age.
2. Find the names of students, grouped by age.
3. Find the average age of students, grouped by city.
4. Count the number of students from Chennai, grouped by age.
Map Reduce
What does the documentation say?
Map-reduce is a data processing paradigm for
condensing large volumes of data into useful
aggregated results. For map-reduce operations,
MongoDB provides the mapReduce database
command.
Learn Map-Reduce by playing cards
How does it work in MongoDB?
Map-Reduce Command
map function
The map function is responsible for transforming each
input document into zero or more documents.
map function
• In the map function, reference the current document as this.
• The map function should not access the database for any reason.
• The map function should be pure, i.e. have no side effects outside the function.
• The map function may call emit(key, value) any number of times to create
output documents associating key with value.
The following example will call emit either 0 or 1 time.
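A minimal sketch of such a map function, runnable under Node. The emit stub, the status field and the sample documents are assumptions made for illustration. On the server, MongoDB supplies emit() itself and binds this to the current document.

```javascript
// Stub for illustration only: on the server, MongoDB provides emit().
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// Map function: `this` is the current document.
// Calls emit once for an active student and not at all otherwise.
function mapActiveByCity() {
  if (this.status === "active") {
    emit(this.city, 1);
  }
}

// Sample data invented for this sketch.
const docs = [
  { name: "Asha", city: "Delhi", status: "active" },
  { name: "Ravi", city: "Chennai", status: "inactive" },
];

// Imitate the server by calling the map function once per document.
docs.forEach((doc) => mapActiveByCity.call(doc));
console.log(emitted);
```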
reduce function
The reduce function has the following prototype:
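A minimal sketch of that prototype, following the shape described in the MongoDB manual: the function receives one key and the array of values emitted for it, and returns a single reduced value. The name reduceSum and the summing body are illustrative choices only.

```javascript
// Shape of a reduce function: called with one key and the array of values emitted for it.
function reduceSum(key, values) {
  // Returns a single value of the same type as the emitted values (numbers here).
  return values.reduce((total, value) => total + value, 0);
}

console.log(reduceSum("Chennai", [1, 1, 1]));
```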
reduce function
● The reduce function should not access the database, even to perform read
operations.
● The reduce function should not affect the outside system.
● MongoDB will not call the reduce function for a key that has only a single value. The
values argument is an array whose elements are the value objects that are “mapped”
to the key.
● MongoDB can invoke the reduce function more than once for the same key. In this
case, the previous output from the reduce function for that key will become one of
the input values to the next reduce function invocation for that key.
● The reduce function must return an object whose type must be identical to the type
of the value emitted by the map function.
● The reduce function must be idempotent.
Idempotent function
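Idempotent here means that reducing an already reduced value changes nothing: reduce(key, [reduce(key, values)]) must equal reduce(key, values). Because MongoDB may feed a partial result back into reduce, the same must hold when a partial result is mixed with new values. The sketch below compares a reduce function that satisfies this with one that does not. Both functions and the { count, total } value shape are hypothetical examples.

```javascript
// Values have the shape { count, total }, the same shape the map function would emit.
function reduceStats(key, values) {
  return values.reduce(
    (acc, value) => ({ count: acc.count + value.count, total: acc.total + value.total }),
    { count: 0, total: 0 }
  );
}

// Broken variant: counts array entries, so a partial result is counted as one value.
function reduceStatsBroken(key, values) {
  return {
    count: values.length,
    total: values.reduce((sum, value) => sum + value.total, 0),
  };
}

const a = { count: 1, total: 20 };
const b = { count: 1, total: 22 };
const c = { count: 1, total: 27 };

const sameResult = (x, y) => JSON.stringify(x) === JSON.stringify(y);

// Re-reducing a partial result must match reducing everything at once.
const direct = reduceStats("Chennai", [a, b, c]);
const rereduced = reduceStats("Chennai", [c, reduceStats("Chennai", [a, b])]);
console.log(sameResult(direct, rereduced));

const brokenDirect = reduceStatsBroken("Chennai", [a, b, c]);
const brokenRereduced = reduceStatsBroken("Chennai", [c, reduceStatsBroken("Chennai", [a, b])]);
console.log(sameResult(brokenDirect, brokenRereduced));
```

The broken variant also returns a count derived from the array length rather than from the values themselves, which is the usual way this requirement gets violated.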
finalize function
The finalize function has the following prototype
It receives as its arguments a key value and the
reducedValue from the reduce function.
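A minimal sketch of the finalize prototype, following the shape described in the MongoDB manual: it takes the key and the reduced value and returns the final value for that key. The { count, total } shape and the derived average are assumptions for illustration.

```javascript
// finalize runs once per key, after all reduce calls for that key are done.
function finalizeAverage(key, reducedValue) {
  // Guard against division by zero for an empty group.
  const average =
    reducedValue.count > 0 ? reducedValue.total / reducedValue.count : null;
  return { count: reducedValue.count, total: reducedValue.total, average: average };
}

console.log(finalizeAverage("Chennai", { count: 3, total: 69 }));
```

Computing the average in finalize rather than in reduce keeps the reduce function idempotent, since an average of averages is generally not the overall average.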
Exercises
Index
Query Performance
db.collection.find().explain(verbosity)
Verbosity: "queryPlanner", "executionStats", and "allPlansExecution".
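One common use of explain() is to check whether a query used an index (an IXSCAN stage) or scanned the whole collection (COLLSCAN). The helper below is a hedged sketch that walks a winning plan looking for an index scan. The function name and the abbreviated plan objects are hand-written for illustration, and the exact explain output varies by server version.

```javascript
// Walks an explain() winning plan and reports whether any stage is an index scan.
function usesIndexScan(plan) {
  if (!plan) return false;
  if (plan.stage === "IXSCAN") return true;
  const children = [plan.inputStage, ...(plan.inputStages || [])];
  return children.some((child) => usesIndexScan(child));
}

// Hand-written, abbreviated examples of a winning plan (not captured from a server).
const withIndex = { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "name_1" } };
const withoutIndex = { stage: "COLLSCAN" };

console.log(usesIndexScan(withIndex), usesIndexScan(withoutIndex));
```

In the shell, the plan would typically come from something like db.friends.find({name: 'Alice'}).explain('queryPlanner').queryPlanner.winningPlan, assuming the classic output layout.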
Creating Index:
db.collectionName.createIndex({name: 1})
db.collection.createIndex( <key and index type specification>, <options> )
Index
Single field Index:
db.collectionName.createIndex({name: 1})
db.collection.createIndex( <key and index type specification>, <options> )
Given the following document in the friends collection:
{
"_id" : ObjectId(...),
"name" : "Alice",
"age" : 27
}
The following command creates an index on the name field:
db.friends.createIndex( { "name" : 1 })
Index
Compound Index:
A compound index is a single index structure that holds references to multiple fields within a collection's documents.
Consider the following document in the people collection:
{ "_id" : ObjectId(...),
"name" : "John Doe",
"dateOfBirth": Date(..),
"address": {
"street": "Main",
"zipcode": "53511",
"state": "WI"
}
}
db.people.createIndex({'name':1, 'dateOfBirth':1})
MongoDB limits a compound index to 31 fields (32 in more recent server versions).
Index
Compound Index (contd.):
Prefixes in Compound indexes.
Index prefixes are the beginning subsets of indexed fields. For example, consider the following compound index:
{ "item": 1, "location": 1, "stock":1 }
The index has the following index prefixes:
{ item: 1 }
{ item: 1, location: 1 }
What about find({item: 'Pen', stock: {$gt: 20}}), a query on the item and stock fields?
MongoDB can also use the index to support a query on the item and stock fields, since the item field corresponds to a prefix.
However, the index would not be as efficient in supporting this query as an index on only item and stock would be.
MongoDB cannot use the index to support queries on the following fields, because without the item field none of them corresponds to an index prefix:
the location field,
the stock field,
the location and stock fields.
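The prefix rule above can be modelled with a few lines of JavaScript. This is a simplified sketch of the rule only, not of MongoDB's query planner, and usablePrefix is a hypothetical helper: it returns the longest leading run of index fields that the query also uses, and an empty array when the index cannot support the query through a prefix.

```javascript
// Returns the longest leading run of index fields that also appear in the query.
// An empty result means the query does not match any prefix of the index.
function usablePrefix(indexFields, queryFields) {
  const prefix = [];
  for (const field of indexFields) {
    if (!queryFields.includes(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const indexFields = ["item", "location", "stock"];

console.log(usablePrefix(indexFields, ["item", "location"])); // full two-field prefix
console.log(usablePrefix(indexFields, ["item", "stock"]));    // only the item prefix applies
console.log(usablePrefix(indexFields, ["location", "stock"])); // no prefix at all
```

The second case shows why the item and stock query is supported but less efficient: only the item part lines up with a prefix.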
Index
Index on Embedded field:
Consider the following document in the people collection:
{
"_id" : ObjectId(...),
"name" : "John Doe",
"address": {
"street": "Main",
"zipcode": "53511",
"state": "WI"
}
}
db.people.createIndex({'address.zipcode': 1})
Index
Index on Embedded document:
Consider the following documents in the people collection:
{
"_id" : ObjectId("5587afde264fb5b4e25b1556"),
"name" : "John Doe",
"address" : {
"city" : "New York",
"state" : "NY"
}
}
{
"_id" : ObjectId("5587affd264fb5b4e25b1557"),
"name" : "John",
"address" : {
"city" : "New Delhi",
"state" : "ND"
}
}
db.people.createIndex({address:1})
Index
Index on Embedded document (contd.):
Important: When performing equality matches on embedded documents, field order matters.
For example,
db.people.find({address: {city: 'New York', state: 'NY'}}) returns
{ "_id" : ObjectId("5587afde264fb5b4e25b1556"), "name" : "John Doe", "address" : { "city" : "New York", "state" : "NY" } }
but
db.people.find({address: {state: 'NY', city: 'New York'}}) returns no documents.
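The difference can be modelled in plain JavaScript. Both helpers below are simplified, hypothetical models of the matching rules rather than MongoDB code: exactEmbeddedMatch requires the same fields in the same order, while dotNotationMatch checks each field on its own, as a query such as db.people.find({'address.state': 'NY', 'address.city': 'New York'}) would.

```javascript
// Simplified model of an exact embedded-document match:
// same fields, same values, same field order.
function exactEmbeddedMatch(stored, queried) {
  const storedKeys = Object.keys(stored);
  const queriedKeys = Object.keys(queried);
  return (
    storedKeys.length === queriedKeys.length &&
    storedKeys.every((key, i) => key === queriedKeys[i] && stored[key] === queried[key])
  );
}

// Simplified model of dot-notation matching: each queried field is checked
// independently, so field order does not matter.
function dotNotationMatch(stored, queried) {
  return Object.keys(queried).every((key) => stored[key] === queried[key]);
}

const address = { city: "New York", state: "NY" };

console.log(exactEmbeddedMatch(address, { city: "New York", state: "NY" }));
console.log(exactEmbeddedMatch(address, { state: "NY", city: "New York" }));
console.log(dotNotationMatch(address, { state: "NY", city: "New York" }));
```

When field order cannot be guaranteed, dot notation is usually the safer way to query. Note that such a query is served by indexes on the individual embedded fields rather than by the index on the whole address document.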
Logging and Killing Slow Queries
● db.currentOp()
● db.killOp(opId)
Profiling
Enabling Profiling:
db.setProfilingLevel(level)
db.setProfilingLevel(level, slowOpThresholdInMillis)
Profiling Levels:
The following profiling levels are available:
0 - The profiler is off and does not collect any data. mongod still writes operations longer than the slowOpThresholdMs threshold to its log. This is the default profiler level.
1 - Collects profiling data for slow operations only. By default, slow operations are those slower than 100 milliseconds. You can modify the threshold for "slow" operations with the slowOpThresholdMs runtime option or the setParameter command. See the "Specify the Threshold for Slow Operations" section of the MongoDB manual for more information.
2 - Collects profiling data for all database operations.
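As a small sketch of how the levels are used, the helper below switches on level 1 with a custom threshold and reads the status back. The function name and the fakeDb stand-in are hypothetical. fakeDb exists only so the sketch runs under Node, and in the shell the real db handle would be passed instead.

```javascript
// Turns on level-1 profiling with a custom threshold and returns the resulting status.
// `db` is the shell's database handle, passed in so a stand-in can be used outside the shell.
function enableSlowOpProfiling(db, slowMs) {
  db.setProfilingLevel(1, slowMs);
  return db.getProfilingStatus();
}

// Minimal stand-in for `db`, for running this sketch under Node only.
const fakeDb = {
  status: { was: 0, slowms: 100 },
  setProfilingLevel(level, slowms) {
    this.status = { was: level, slowms: slowms };
  },
  getProfilingStatus() {
    return this.status;
  },
};

console.log(enableSlowOpProfiling(fakeDb, 50));
```

Level 2 records every operation and can add noticeable overhead, so level 1 with a threshold is usually the safer starting point.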
Profiling
Viewing Profiling Data:
db.system.profile.find()
ns(namespace)
ts(timestamp)
millis
user
To return operations on the 'test' collection of the 'mydb' database:
db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run a query similar to the following:
db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
Profiling
Viewing Profiling Data (contd.):
To show the five slowest operations run after a certain timestamp:
db.system.profile.find({"ts": {"$gt": ISODate("2016-04-29T02:48:42.019Z")}}, {ts: 1, millis: 1, command: 1, query: 1, ns: 1}).sort({millis: -1}).limit(5).pretty()
Questions
Thanks
References
• http://blog.nahurst.com/visual-guide-to-nosql-systems
• https://docs.mongodb.org/manual/reference/database-profiler/
• http://www.slideshare.net/sebprunier/mongodb-aggregation-framework-in-action
• http://image.slidesharecdn.com/nantesmug-mongodbaggregationframework-150121023633-conversion-gate02/95/mongodb-aggregation-framework-in-action-9-638.jpg?cb=1421829478
• https://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
• DB script: https://gist.github.com/rohitbishnoi/6e6d9556ba0569c18f805a585029f5f8

Weitere ähnliche Inhalte

Was ist angesagt?

Django를 Django답게, Django로 뉴스 사이트 만들기
Django를 Django답게, Django로 뉴스 사이트 만들기Django를 Django답게, Django로 뉴스 사이트 만들기
Django를 Django답게, Django로 뉴스 사이트 만들기Kyoung Up Jung
 
Spring Framework - AOP
Spring Framework - AOPSpring Framework - AOP
Spring Framework - AOPDzmitry Naskou
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRavi Teja
 
Schema Design
Schema DesignSchema Design
Schema DesignMongoDB
 
Json in Postgres - the Roadmap
 Json in Postgres - the Roadmap Json in Postgres - the Roadmap
Json in Postgres - the RoadmapEDB
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in actionCodemotion
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance TuningPuneet Behl
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Martin Odersky
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDBCésar Trigo
 
AngularJS $http Interceptors (Explanation and Examples)
AngularJS $http Interceptors (Explanation and Examples)AngularJS $http Interceptors (Explanation and Examples)
AngularJS $http Interceptors (Explanation and Examples)Brian Swartzfager
 
Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략Jin wook
 
Aplicaciones geográficas con Django - No solo de Javascript viven los mapas
Aplicaciones geográficas con Django - No solo de Javascript viven los mapasAplicaciones geográficas con Django - No solo de Javascript viven los mapas
Aplicaciones geográficas con Django - No solo de Javascript viven los mapasAlicia Pérez
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architectureBishal Khanal
 
MongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing
MongoDB .local Toronto 2019: Tips and Tricks for Effective IndexingMongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing
MongoDB .local Toronto 2019: Tips and Tricks for Effective IndexingMongoDB
 
Introduction to Django
Introduction to DjangoIntroduction to Django
Introduction to DjangoKnoldus Inc.
 

Was ist angesagt? (20)

Django를 Django답게, Django로 뉴스 사이트 만들기
Django를 Django답게, Django로 뉴스 사이트 만들기Django를 Django답게, Django로 뉴스 사이트 만들기
Django를 Django답게, Django로 뉴스 사이트 만들기
 
MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
 
Spring Framework - AOP
Spring Framework - AOPSpring Framework - AOP
Spring Framework - AOP
 
Introduction to mongodb
Introduction to mongodbIntroduction to mongodb
Introduction to mongodb
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Schema Design
Schema DesignSchema Design
Schema Design
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
 
Json in Postgres - the Roadmap
 Json in Postgres - the Roadmap Json in Postgres - the Roadmap
Json in Postgres - the Roadmap
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in action
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
 
Javascript Exercises
Javascript ExercisesJavascript Exercises
Javascript Exercises
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
AngularJS $http Interceptors (Explanation and Examples)
AngularJS $http Interceptors (Explanation and Examples)AngularJS $http Interceptors (Explanation and Examples)
AngularJS $http Interceptors (Explanation and Examples)
 
Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략
 
Aplicaciones geográficas con Django - No solo de Javascript viven los mapas
Aplicaciones geográficas con Django - No solo de Javascript viven los mapasAplicaciones geográficas con Django - No solo de Javascript viven los mapas
Aplicaciones geográficas con Django - No solo de Javascript viven los mapas
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
MongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing
MongoDB .local Toronto 2019: Tips and Tricks for Effective IndexingMongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing
MongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing
 
Introduction to Django
Introduction to DjangoIntroduction to Django
Introduction to Django
 

Andere mochten auch

Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)MongoDB
 
Webinar: The Visual Query Profiler and MongoDB Compass
Webinar: The Visual Query Profiler and MongoDB CompassWebinar: The Visual Query Profiler and MongoDB Compass
Webinar: The Visual Query Profiler and MongoDB CompassMongoDB
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDBNick Court
 
Big Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerBig Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerMark Kromer
 
Strengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDBStrengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDBlehresman
 
OSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB TutorialOSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB TutorialSteven Francia
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)MongoDB
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMike Dirolf
 

Andere mochten auch (11)

Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)Indexing and Query Optimizer (Richard Kreuter)
Indexing and Query Optimizer (Richard Kreuter)
 
Webinar: The Visual Query Profiler and MongoDB Compass
Webinar: The Visual Query Profiler and MongoDB CompassWebinar: The Visual Query Profiler and MongoDB Compass
Webinar: The Visual Query Profiler and MongoDB Compass
 
Moving from SQL Server to MongoDB
Moving from SQL Server to MongoDBMoving from SQL Server to MongoDB
Moving from SQL Server to MongoDB
 
Indexing
IndexingIndexing
Indexing
 
MongoDB Workshop
MongoDB WorkshopMongoDB Workshop
MongoDB Workshop
 
Big Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerBig Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL Server
 
Strengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDBStrengths and Weaknesses of MongoDB
Strengths and Weaknesses of MongoDB
 
OSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB TutorialOSCON 2012 MongoDB Tutorial
OSCON 2012 MongoDB Tutorial
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 

Ähnlich wie MongoDB Aggregations Indexing and Profiling

MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDBMongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDBMongoDB
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases labFabio Fumarola
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & AggregationMongoDB
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analyticsMongoDB
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesMongoDB
 
Aggregation in MongoDB
Aggregation in MongoDBAggregation in MongoDB
Aggregation in MongoDBKishor Parkhe
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation Amit Ghosh
 
Data Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineData Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineMongoDB
 
Query for json databases
Query for json databasesQuery for json databases
Query for json databasesBinh Le
 
Mongo DB schema design patterns
Mongo DB schema design patternsMongo DB schema design patterns
Mongo DB schema design patternsjoergreichert
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorHenrik Ingo
 
MongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...MongoDB
 
Aggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichAggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichNorberto Leite
 
Building Apps with MongoDB
Building Apps with MongoDBBuilding Apps with MongoDB
Building Apps with MongoDBNate Abele
 

Ähnlich wie MongoDB Aggregations Indexing and Profiling (20)

Querying mongo db
Querying mongo dbQuerying mongo db
Querying mongo db
 
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDBMongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
MongoDB.local DC 2018: Tutorial - Data Analytics with MongoDB
 
9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab9b. Document-Oriented Databases lab
9b. Document-Oriented Databases lab
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analytics
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
MongoDB Meetup
MongoDB MeetupMongoDB Meetup
MongoDB Meetup
 
Aggregation in MongoDB
Aggregation in MongoDBAggregation in MongoDB
Aggregation in MongoDB
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation
 
Data Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineData Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane Fine
 
Latinoware
LatinowareLatinoware
Latinoware
 
Query for json databases
Query for json databasesQuery for json databases
Query for json databases
 
Mongo DB schema design patterns
Mongo DB schema design patternsMongo DB schema design patterns
Mongo DB schema design patterns
 
Nosql part3
Nosql part3Nosql part3
Nosql part3
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop Connector
 
MongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDBMongoDB Europe 2016 - Graph Operations with MongoDB
MongoDB Europe 2016 - Graph Operations with MongoDB
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
 
Aggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichAggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days Munich
 
Building Apps with MongoDB
Building Apps with MongoDBBuilding Apps with MongoDB
Building Apps with MongoDB
 

Kürzlich hochgeladen

Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 

Kürzlich hochgeladen (20)

Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 

MongoDB Aggregations Indexing and Profiling

  • 2. Agenda
    ● Recap of insertion and finding documents
    ● Aggregation Framework
      o The Aggregation Pipeline
      o Aggregation Operators
      o Mapping between SQL and Aggregation
      o Limitations of the Aggregation Framework
    ● Map-reduce Queries
    ● Performance
      o Indexes
      o Logging slow queries
      o Profiling
  • 4. Aggregation Pipeline: What is the aggregation pipeline? The MongoDB aggregation pipeline consists of stages. Documents pass through the stages in order, and the output of each stage is the input of the next. Some stages generate new documents or filter documents out. The same stage can appear more than once in a pipeline.
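As a rough illustration of how documents flow from stage to stage, the following plain JavaScript sketch models each stage as a function that takes an array of documents and returns a new array. It does not talk to MongoDB; `runPipeline`, the stage helpers, and the sample students are hypothetical names invented for this example.

```javascript
// Each "stage" is a function: array of documents in, array of documents out.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));
const limit = (n) => (docs) => docs.slice(0, n);

// Feed the output of each stage into the next one, in order.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Hypothetical sample data.
const students = [
  { name: "Asha", age: 19, city: "Chennai" },
  { name: "Ravi", age: 23, city: "Delhi" },
  { name: "Meena", age: 21, city: "Chennai" },
  { name: "Karan", age: 25, city: "Mumbai" },
];

// Roughly: [{$match: {age: {$gt: 20}}}, {$sort: {age: -1}}, {$limit: 2}]
const oldestTwoOver20 = runPipeline(students, [
  match((doc) => doc.age > 20),
  sortBy("age", -1),
  limit(2),
]);

console.log(oldestTwoOver20.map((doc) => doc.name)); // [ 'Karan', 'Ravi' ]
```

Swapping the order of the stages changes the result, which is why the position of a stage such as $match or $limit in a real pipeline matters.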
  • 7. Pipeline Stages: $group
    {$group: {_id: '$city', names: {$push: '$name'}}}
  • 8. Pipeline Stages: $group
    {$group: {_id: '$city', count: {$sum: 1}}}
  • 9. Pipeline Stages: $group
    {$group: {_id: '$city', maxAge: {$max: '$age'}}}
    Some more accumulator operators
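To make the three $group examples concrete, here is a plain JavaScript sketch of what they compute: documents are bucketed by city, then an accumulator runs over each bucket. This is only an approximation for illustration; the `group` helper and the sample data are hypothetical, and a real $group stage does not guarantee the order of its output documents.

```javascript
// Group documents by a key field and apply accumulator functions per group.
function group(docs, keyField, accumulators) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  return [...groups].map(([key, members]) => {
    const result = { _id: key };
    for (const [name, accumulate] of Object.entries(accumulators)) {
      result[name] = accumulate(members);
    }
    return result;
  });
}

// Hypothetical sample data.
const people = [
  { name: "Asha", age: 19, city: "Chennai" },
  { name: "Ravi", age: 23, city: "Delhi" },
  { name: "Meena", age: 21, city: "Chennai" },
];

const byCity = group(people, "city", {
  // like {$push: '$name'}
  names: (members) => members.map((m) => m.name),
  // like {$sum: 1}
  count: (members) => members.length,
  // like {$max: '$age'}
  maxAge: (members) => Math.max(...members.map((m) => m.age)),
});

console.log(byCity);
```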
  • 14. Limitations of the aggregation framework
    ● Result size restrictions
    ● Memory restrictions
    ● Limited use of indexes
    ● Sharded collections
  • 15. Exercises
    1. Count the number of students in each age group.
    2. Find the names of students, grouped by age.
    3. Find the average age of students, grouped by city.
    4. Count the number of students from Chennai, grouped by age.
  • 17. What does the documentation say? Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For map-reduce operations, MongoDB provides the mapReduce database command.
  • 18. Learn Map-Reduce by playing cards
  • 19. How does it work in MongoDB?
  • 21. map function The map function is responsible for transforming each input document into zero or more documents.
  • 22. map function
    • In the map function, reference the current document as this.
    • The map function should not access the database for any reason.
    • The map function should be pure, that is, it should have no side effects outside the function.
    • The map function may call emit(key, value) any number of times to create an output document associating key with value. The following example will call emit either 0 or 1 times.
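The slide's own example is not part of this transcript, so the sketch below is a stand-in rather than the original: a map function that emits once for students from Chennai and not at all for anyone else. The `emit` defined here is a minimal local substitute for the one MongoDB supplies, and the field names and sample documents are assumptions made for illustration.

```javascript
// Minimal stand-in for the emit() that MongoDB provides to map functions.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// Map function: the current document is available as `this`.
// Emits once for students from Chennai and zero times otherwise.
function mapStudent() {
  if (this.city === "Chennai") {
    emit(this.age, 1);
  }
}

// Hypothetical sample data.
const docs = [
  { name: "Asha", age: 19, city: "Chennai" },
  { name: "Ravi", age: 23, city: "Delhi" },
  { name: "Meena", age: 19, city: "Chennai" },
];

// Call the map function once per document, binding `this` to the document.
for (const doc of docs) {
  mapStudent.call(doc);
}

console.log(emitted); // [ { key: 19, value: 1 }, { key: 19, value: 1 } ]
```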
  • 23. reduce function: The reduce function has the following prototype: function(key, values) { ... return result; }
  • 24. reduce function
    ● The reduce function should not access the database, even to perform read operations.
    ● The reduce function should not affect the outside system.
    ● MongoDB will not call the reduce function for a key that has only a single value. The values argument is an array whose elements are the value objects that are "mapped" to the key.
    ● MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.
    ● The reduce function must return an object whose type must be identical to the type of the value emitted by the map function.
    ● The reduce function must be idempotent.
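The rules about repeated invocation and idempotence are easier to see with a small check. The sketch below is plain JavaScript, not a call to MongoDB; `reduceCount` and the emitted values are hypothetical. It reduces the same values in one pass, in two passes, and once more on an already reduced value, and all three should agree.

```javascript
// Reduce function: sums the values emitted for one key.
// It returns the same type (a number) that the map step emits.
function reduceCount(key, values) {
  return values.reduce((total, value) => total + value, 0);
}

// Hypothetical values emitted for the key 19.
const key = 19;
const values = [1, 1, 1, 1];

// Reducing everything at once.
const allAtOnce = reduceCount(key, values);

// Reducing in two passes: the first result becomes an input of the second call.
const partial = reduceCount(key, values.slice(0, 2));
const twoPasses = reduceCount(key, [partial, ...values.slice(2)]);

// Re-reducing an already reduced value must not change it.
const reducedAgain = reduceCount(key, [allAtOnce]);

console.log(allAtOnce, twoPasses, reducedAgain); // 4 4 4
```

A reduce function that returned values.length instead of the sum would pass the first call but fail the two-pass check, because a partial result would be counted as a single value.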
  • 26. finalize function: The finalize function has the following prototype: function(key, reducedValue) { ... return modifiedObject; } It receives as its arguments a key value and the reducedValue from the reduce function.
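One common use of a finalize step is to derive a value that cannot be computed safely inside reduce, such as an average. The following plain JavaScript sketch assumes the map step emits { sum, count } objects; the function names and sample values are hypothetical and nothing here calls MongoDB.

```javascript
// Reduce keeps a running { sum, count } so that it can be re-reduced safely.
function reduceAges(key, values) {
  return values.reduce(
    (acc, v) => ({ sum: acc.sum + v.sum, count: acc.count + v.count }),
    { sum: 0, count: 0 }
  );
}

// Finalize receives the key and the reduced value and returns the final output.
function finalizeAverage(key, reducedValue) {
  return { ...reducedValue, average: reducedValue.sum / reducedValue.count };
}

// Hypothetical values emitted for the key "Chennai": one per student.
const emittedForChennai = [
  { sum: 19, count: 1 },
  { sum: 21, count: 1 },
  { sum: 23, count: 1 },
];

const reduced = reduceAges("Chennai", emittedForChennai);
const finalized = finalizeAverage("Chennai", reduced);

console.log(finalized); // { sum: 63, count: 3, average: 21 }
```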
  • 28. Index: Query Performance
    db.collection.find().explain(verbosity)
    Verbosity: "queryPlanner", "executionStats", or "allPlansExecution".
    Creating an index:
    db.collectionName.createIndex({name: 1})
    db.collection.createIndex( <key and index type specification>, <options> )
  • 29. Index: Single Field Index
    db.collectionName.createIndex({name: 1})
    db.collection.createIndex( <key and index type specification>, <options> )
    Given the following document in the friends collection:
    { "_id" : ObjectId(...), "name" : "Alice", "age" : 27 }
    The following command creates an index on the name field:
    db.friends.createIndex( { "name" : 1 } )
  • 30. Index: Compound Index
    The index structure holds references to multiple fields within a collection's documents. Consider the following document in the people collection:
    { "_id" : ObjectId(...), "name" : "John Doe", "dateOfBirth": Date(..), "address": { "street": "Main", "zipcode": "53511", "state": "WI" } }
    db.people.createIndex({'name': 1, 'dateOfBirth': 1})
    MongoDB imposes a limit of 31 fields for any compound index.
  • 31. Index: Compound Index (contd.)
    Prefixes in compound indexes: index prefixes are the beginning subsets of the indexed fields. For example, consider the following compound index:
    { "item": 1, "location": 1, "stock": 1 }
    The index has the following index prefixes:
    { item: 1 }
    { item: 1, location: 1 }
    What about a query on the item and stock fields, such as find({item: 'Pen', stock: {$gt: 20}})? MongoDB can use the index to support this query, since the item field corresponds to a prefix. However, the index would not be as efficient in supporting the query as an index on only item and stock.
    MongoDB cannot use the index to support queries on the following fields, since without the item field none of them corresponds to an index prefix: the location field, the stock field, or the location and stock fields together.
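The prefix rule can be expressed as a small check: walk the index fields in order and stop at the first one the query does not mention. The sketch below is a deliberate simplification in plain JavaScript, and `matchedPrefixLength` is a hypothetical helper; the real query planner weighs more factors than this.

```javascript
// Returns how many leading index fields the query's fields cover.
// 0 means the query does not start at an index prefix.
function matchedPrefixLength(indexFields, queryFields) {
  let length = 0;
  for (const field of indexFields) {
    if (!queryFields.includes(field)) break;
    length += 1;
  }
  return length;
}

// Field order of the compound index { item: 1, location: 1, stock: 1 }.
const indexFields = ["item", "location", "stock"];

console.log(matchedPrefixLength(indexFields, ["item"])); // 1
console.log(matchedPrefixLength(indexFields, ["item", "location"])); // 2
console.log(matchedPrefixLength(indexFields, ["item", "stock"])); // 1
console.log(matchedPrefixLength(indexFields, ["location", "stock"])); // 0
```

The query on item and stock covers only the item prefix, which matches the slide's point that the index helps but less than an index on exactly item and stock would.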
  • 32. Index: Index on Embedded Field
    Consider the following document in the people collection:
    { "_id" : ObjectId(...), "name" : "John Doe", "address": { "street": "Main", "zipcode": "53511", "state": "WI" } }
    db.people.createIndex({'address.zipcode': 1})
  • 33. Index: Index on Embedded Document
    Consider the following documents in the people collection:
    { "_id" : ObjectId("5587afde264fb5b4e25b1556"), "name" : "John Doe", "address" : { "city" : "New York", "state" : "NY" } }
    { "_id" : ObjectId("5587affd264fb5b4e25b1557"), "name" : "John", "address" : { "city" : "New Delhi", "state" : "ND" } }
    db.people.createIndex({address: 1})
  • 34. Index: Index on Embedded Document (contd.)
    Important: when performing equality matches on embedded documents, field order matters. For example,
    db.people.find({address: {'city': 'New York', 'state': 'NY'}})
    finds the document
    { "_id" : ObjectId("5587afde264fb5b4e25b1556"), "name" : "John Doe", "address" : { "city" : "New York", "state" : "NY" } }
    but
    db.people.find({address: {'state': 'NY', 'city': 'New York'}})
    does not find any result.
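The order sensitivity can be imitated in plain JavaScript with a comparison that checks field names position by position. This is only an analogy for how an exact match on an embedded document behaves; `sameEmbeddedDocument` is a hypothetical helper that handles flat documents with primitive values only.

```javascript
// Order-sensitive comparison: same fields, same values, same order.
function sameEmbeddedDocument(a, b) {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  return (
    keysA.length === keysB.length &&
    keysA.every((key, i) => key === keysB[i] && a[key] === b[key])
  );
}

// The address as stored in the first document on the slide.
const stored = { city: "New York", state: "NY" };

const sameOrder = sameEmbeddedDocument(stored, { city: "New York", state: "NY" });
const swappedOrder = sameEmbeddedDocument(stored, { state: "NY", city: "New York" });

console.log(sameOrder); // true
console.log(swappedOrder); // false
```

To match regardless of field order, query the individual fields with dot notation instead, for example db.people.find({'address.city': 'New York', 'address.state': 'NY'}).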
  • 35. Logging and Killing Slow Queries
    ● db.currentOp()
    ● db.killOp(opId)
  • 36. Profiling: Enabling Profiling
    db.setProfilingLevel(level)
    db.setProfilingLevel(level, slowOpThresholdInMillis)
    Profiling levels:
    0 - The profiler is off and does not collect any data. mongod always writes operations longer than the slowOpThresholdMs threshold to its log. This is the default profiler level.
    1 - Collects profiling data for slow operations only. By default, slow operations are those slower than 100 milliseconds. You can modify the threshold for "slow" operations with the slowOpThresholdMs runtime option or the setParameter command. See the Specify the Threshold for Slow Operations section for more information.
    2 - Collects profiling data for all database operations.
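The three levels boil down to a simple decision about whether an operation ends up in the profiler's collection. The sketch below restates that decision in plain JavaScript; `isProfiled` is a hypothetical helper written for this summary, not a MongoDB function, and it assumes "slow" means strictly longer than the threshold.

```javascript
// Decide whether an operation would be recorded by the profiler.
// level: 0, 1, or 2; millis: how long the operation took;
// slowMs: the slow-operation threshold (100 ms by default, per the slide).
function isProfiled(level, millis, slowMs = 100) {
  if (level === 2) return true; // everything is recorded
  if (level === 1) return millis > slowMs; // only slow operations
  return false; // level 0: profiler is off
}

console.log(isProfiled(0, 500)); // false
console.log(isProfiled(1, 50)); // false
console.log(isProfiled(1, 150)); // true
console.log(isProfiled(1, 50, 20)); // true
console.log(isProfiled(2, 1)); // true
```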
  • 37. Profiling: Viewing Profiling Data
    db.system.profile.find()
    Fields include: ns (namespace), ts (timestamp), millis, user
    To return operations on the 'test' collection of the 'mydb' database:
    db.system.profile.find( { ns : 'mydb.test' } ).pretty()
    To return operations slower than 5 milliseconds, run a query similar to the following:
    db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
  • 38. Profiling: Viewing Profiling Data (contd.)
    To show the five slowest queries run after a certain timestamp:
    db.system.profile.find({"ts": {"$gt": ISODate("2016-04-29T02:48:42.019Z")}}, {ts: 1, millis: 1, command: 1, query: 1, ns: 1}).sort({millis: -1}).limit(5).pretty()
  • 41. References
    • http://blog.nahurst.com/visual-guide-to-nosql-systems
    • https://docs.mongodb.org/manual/reference/database-profiler/
    • http://www.slideshare.net/sebprunier/mongodb-aggregation-framework-in-action
    • http://image.slidesharecdn.com/nantesmug-mongodbaggregationframework-150121023633-conversion-gate02/95/mongodb-aggregation-framework-in-action-9-638.jpg?cb=1421829478
    • https://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
    • DB script: https://gist.github.com/rohitbishnoi/6e6d9556ba0569c18f805a585029f5f8