Back to Basics. Webinar 5: Introduction to the Aggregation Framework

MongoDB Europe 2016
Old Billingsgate, London
15th November
Use my code rubenterceno20 for 20% off tickets
mongodb.com/europe

Back to Basics 2016
Introduction to the Aggregation Framework
Rubén Terceño
Senior Solutions Architect, EMEA
ruben@mongodb.com
@rubenTerceno

Course Agenda
Date           Time         Webinar
25 May 2016    16:00 CEST   Introduction to NoSQL
7 June 2016    16:00 CEST   Your First MongoDB Application
21 June 2016   16:00 CEST   Document-Oriented Schema Design
7 July 2016    16:00 CEST   Advanced Indexing, Text and Geospatial Indexes
19 July 2016   16:00 CEST   Introduction to the Aggregation Framework
28 July 2016   16:00 CEST   Production Deployment

Summary of what we have covered so far
• Why does NoSQL exist?
• Types of NoSQL databases
• Key features of MongoDB
• Installation and creation of databases and collections
• CRUD operations, indexes and explain()
• Dynamic schema design
• Hierarchy and embedded documents
• Free-text and geospatial searches

Aggregation Framework
• A native analytics engine for MongoDB
• But… what does "analytics" mean?
• Classic databases come in two types, OLTP and OLAP
• OLTP: Online Transaction Processing
  • Airline reservations
  • ATM operations
  • Customer management (CRM)
• OLAP: Online Analytical Processing
  • Profitability and overbooking calculations
  • Demand forecasting and optimization of ATM cash replenishment
  • Customer segmentation

OLAP – Territory of Giants
• OLAP queries usually require full scans of the data
• Results are stored for comparative and future analysis
• Spark and Hadoop are the dominant technologies in this area, but:
  • Their complexity is high
  • They are oriented towards algorithmic data analysis (you have to write code)
  • They require knowledge of parallel processing and parallel algorithms
• The Aggregation Framework offers a friendlier approach
  • You can do the same with less effort
  • Well suited to real-time analytics and data discovery

Aggregation Framework – A Processing Pipeline
[Diagram: pipeline stages Match, Project, Lookup, Group, Sort]
• Think of a Unix pipeline
• The output of one stage is passed to the input of the next stage
• Each stage performs one job
• Stages can be repeated
• The input is a single collection
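The pipeline idea above can be modelled in a few lines of plain JavaScript. This is only a sketch of the concept, not how the server implements it: the helper names (match, sort, limit, runPipeline) and the three-document sample array are made up for illustration, with values borrowed from the ship documents in this deck.

```javascript
// Each stage is a function that takes an array of documents and returns a new array.
const match = (predicate) => (docs) => docs.filter(predicate);
const sort = (compare) => (docs) => [...docs].sort(compare);
const limit = (n) => (docs) => docs.slice(0, n);

// Run the stages in order, feeding each stage's output into the next one.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Hypothetical miniature of the ships collection.
const ships = [
  { Name: "MSC Zoe", Country: "Switzerland", "Maximum TEU": 19224 },
  { Name: "CSCL Pacific Ocean", Country: "China", "Maximum TEU": 19100 },
  { Name: "CMA CGM Bougainv", Country: "France", "Maximum TEU": 17722 },
];

const result = runPipeline(ships, [
  match((s) => s["Maximum TEU"] > 18000),               // keep the largest ships
  sort((a, b) => b["Maximum TEU"] - a["Maximum TEU"]),  // biggest first
  limit(1),                                             // keep only the top one
]);

console.log(result.map((s) => s.Name)); // only "MSC Zoe" remains
```

Each helper does one job, and the same helper can appear more than once in the stage list, which mirrors the bullets above.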

Pipeline Operators
• $match
Filter documents
• $project/$redact
Reshape documents
• $group
Summarize documents
• $out
Create new collections
• $sample
Return random samples
• $sort
Order documents
• $limit/$skip
Paginate documents
• $lookup
Join two collections together
• $unwind
Expand an array
• $geoNear
Return documents by distance
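As a rough illustration of how several of these operators combine, the sketch below builds a pipeline as a plain array of stage documents. The field names come from the example ship document in this deck; the choice of stages and the limit of 5 are arbitrary assumptions. The final call only runs inside a mongo shell session where `db` is defined.

```javascript
// Ships whose Country field is "China", largest capacity first, trimmed to a few fields.
const pipeline = [
  { $match: { Country: "China" } },        // filter documents
  { $sort: { "Maximum TEU": -1 } },        // order documents, descending
  { $limit: 5 },                           // keep the first five
  { $project: { _id: 0, Name: 1, Owner: 1, "Maximum TEU": 1 } }, // reshape documents
];

// In the mongo shell, `db` is the current database; elsewhere this block is skipped.
if (typeof db !== "undefined") {
  db.ships.aggregate(pipeline).forEach(printjson);
}
```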

Model of the Aggregation Framework

The Containers dataset
• https://github.com/terce13/geoData/blob/master/Containers.zip
• Unzip it
• mongorestore ./dump

Example ship document
{ "_id" : ObjectId("56fda36a0a162d0f051f2c6d"),
  "Built" : 2015,
  "Name" : "MSC Zoe",
  "Length overall (m)" : 395.4,
  "Beam (m)" : 59,
  "Maximum TEU" : 19224,
  "GT" : 193000,
  "Owner" : "MSC",
  "Country" : "Switzerland",
  "route" : {
    "origin" : {
      "Name" : "Tianjin",
      "Country" : "China"},
    "destination" : {
      "Name" : "Shanghai",
      "Country" : "China"}},
  "location" : {
    "type" : "Point",
    "coordinates" : [
      129.15693498213182, 18.108558232731916]},
  "EAT" : ISODate("2016-05-16T10:00:00Z")}

Example container document
{
"_id" : ObjectId("5719290546728347c6fbdc4c"),
"container_id" : "00000001",
"type" : "40",
"cargo" : "Whales",
"Tons" : 38,
"location" : {
"type" : "Point",
"coordinates" : [
129.15297142372992,
18.108451503053704
]
},
"shipName" : "MSC Zoe"
}

Using the shell
MongoDB Enterprise > db.ships.aggregate()
{ "_id" : ObjectId("56fda36a0a162d0f051f2c6d"), "Built" : 2015, "Name" : "MSC Zoe",
"Length overall (m)" : 395.4, "Beam (m)" : 59, "Maximum TEU" : 19224, "GT" : 193000,
"Owner" : "MSC", "Country" : "Switzerland", "route" : { "origin" : { "Name" :
"Tianjin", "Country" : "China" }, "destination" : { "Name" : "Shanghai", "Country" :
"China" } }, "location" : { "type" : "Point", "coordinates" : [ 129.15693498213182,
18.108558232731916 ] }, "EAT" : ISODate("2016-05-16T10:00:00Z") }
[...]
{ "_id" : ObjectId("56fda36a0a162d0f051f2c7f"), "Built" : 2015, "Name" : "CMA CGM
Bougainv", "Length overall (m)" : 398, "Beam (m)" : 54, "Maximum TEU" : 17722, "GT"
: "", "Owner" : "CMA CGM", "Country" : "France", "route" : { "origin" : { "Name" :
"Cartagena", "Country" : "Colombia" }, "destination" : { "Name" : "Antwerp",
"Country" : "Belgium" } }, "location" : { "type" : "Point", "coordinates" : [
-80.76828572004653, 0.8313138025242637 ] }, "EAT" : ISODate("2016-05-17T11:00:00Z") }
Type "it" for more
MongoDB Enterprise >

$limit
MongoDB Enterprise > db.ships.aggregate([{$limit : 2}])
{ "_id" : ObjectId("56fda36a0a162d0f051f2c6d"), "Built" : 2015, "Name" : "MSC Zoe",
"Length overall (m)" : 395.4, "Beam (m)" : 59, "Maximum TEU" : 19224, "GT" : 193000,
"Owner" : "MSC", "Country" : "Switzerland", "route" : { "origin" : { "Name" :
"Tianjin", "Country" : "China" }, "destination" : { "Name" : "Shanghai", "Country" :
"China" } }, "location" : { "type" : "Point", "coordinates" : [ 129.15693498213182,
18.108558232731916 ] }, "EAT" : ISODate("2016-05-16T10:00:00Z") }
{ "_id" : ObjectId("56fda36a0a162d0f051f2c70"), "Built" : 2015, "Name" : "MSC
Oscar", "Length overall (m)" : 395.4, "Beam (m)" : 59, "Maximum TEU" : 19224, "GT" :
192237, "Owner" : "MSC", "Country" : "Switzerland", "route" : { "origin" : { "Name"
: "Kaohsiung", "Country" : "Taiwan" }, "destination" : { "Name" : "Shanghai",
"Country" : "China" } }, "location" : { "type" : "Point", "coordinates" : [
153.87348512279215, 44.683039336234614 ] }, "EAT" : ISODate("2016-05-25T09:00:00Z")
}

$skip
MongoDB Enterprise > db.ships.aggregate([{$limit : 7}, {$skip: 5}])
{ "_id" : ObjectId("56fda36a0a162d0f051f2c72"), "Built" : 2014, "Name" : "CSCL
Pacific Ocean", "Length overall (m)" : 399.67, "Beam (m)" : 58.6, "Maximum TEU" :
19100, "GT" : 187541, "Owner" : "CSCL", "Country" : "China", "route" : { "origin" :
{ "Name" : "Ningbo-Zhoushan", "Country" : "China" }, "destination" : { "Name" :
"Kaohsiung", "Country" : "Taiwan" } }, "location" : { "type" : "Point",
"coordinates" : [ -168.88031791679694, 41.72272856411992 ] },
"EAT" : ISODate("2016-05-24T04:00:00Z") }
{ "_id" : ObjectId("56fda36a0a162d0f051f2c73"), "Built" : 2015, "Name" : "CSCL
Indian Ocean", "Length overall (m)" : 399.67, "Beam (m)" : 58.6, "Maximum TEU" :
19100, "GT" : 187541, "Owner" : "CSCL", "Country" : "China", "route" : { "origin" :
{ "Name" : "Jebel Ali (Dubai)", "Country" : "United Arab Emirates" }, "destination"
: { "Name" : "Ho Chi Minh City (Saigon)", "Country" : "Vietnam" } }, "location" : {
"type" : "Point", "coordinates" : [ -136.48534585524587, 27.322294568378965 ] },
"EAT" : ISODate("2016-05-27T12:00:00Z") }

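The order of $limit and $skip matters: limiting to 7 and then skipping 5 leaves two documents, which is why the $skip example returns only two ships. A plain JavaScript sketch, with array slicing standing in for the two stages and a made-up array of 20 numbered documents:

```javascript
// Twenty stand-in documents, numbered 1..20.
const docs = Array.from({ length: 20 }, (_, i) => i + 1);

// Array-based stand-ins for the two stages.
const limit = (n) => (xs) => xs.slice(0, n);
const skip = (n) => (xs) => xs.slice(n);

// [{$limit: 7}, {$skip: 5}]: take the first 7, then drop 5 of them.
const limitThenSkip = skip(5)(limit(7)(docs));   // [6, 7]

// [{$skip: 5}, {$limit: 7}]: drop the first 5, then take the next 7.
const skipThenLimit = limit(7)(skip(5)(docs));   // [6, 7, 8, 9, 10, 11, 12]

console.log(limitThenSkip, skipThenLimit);
```

For page-style access ("page 2 of 7 per page"), the second ordering is usually the one you want.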
$sample
{$sample: {size : 10}}

$match
MongoDB Enterprise > db.ships.aggregate([{$match :{"Country": "China"}}])
MongoDB Enterprise > db.ships.aggregate([{$match :{"route.origin.Country": "China"}}])
MongoDB Enterprise > db.ships.aggregate([{$match : {location: {$geoWithin: {$geometry
: caribe.geometry}}}}])
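The last $match relies on a shell variable named caribe that the slides do not define. A minimal sketch of what it might look like follows, assuming a GeoJSON Feature whose geometry is a Polygon. The coordinates are a rough, hypothetical bounding box around the Caribbean, not the author's actual shape.

```javascript
// Hypothetical definition of `caribe`: a GeoJSON Feature with a Polygon geometry.
// Coordinates are [longitude, latitude]; the ring must end where it starts.
const caribe = {
  type: "Feature",
  properties: { name: "Caribbean (rough bounding box)" },
  geometry: {
    type: "Polygon",
    coordinates: [[[-90, 8], [-60, 8], [-60, 25], [-90, 25], [-90, 8]]],
  },
};

const geoPipeline = [
  { $match: { location: { $geoWithin: { $geometry: caribe.geometry } } } },
];

// Only runs inside a mongo shell session, where `db` is defined.
if (typeof db !== "undefined") {
  db.ships.aggregate(geoPipeline).forEach(printjson);
}
```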

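$geoNear
// $geoNear must be the first stage and needs a geospatial index on the collection.
db.ships.aggregate([
  {
    $geoNear: {
      near: { type: "Point", coordinates: [ -122.4252, 37.8283 ] },
      distanceField: "dist.calculated",  // where the computed distance is stored
      maxDistance: 3000000,              // metres, because `near` is a GeoJSON point
      distanceMultiplier: 0.001,         // report distances in kilometres
      query: { cargo: "Iron" },
      limit : 1000000,                   // option removed in MongoDB 4.2
      includeLocs: "dist.location",
      spherical: true
    }
  }
])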
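To see how maxDistance and distanceMultiplier relate in the $geoNear options above, here is a plain JavaScript sketch using the haversine formula. It only approximates the idea. The Earth radius constant and the candidate position are assumptions, and the server's own calculation may differ in detail.

```javascript
// Assumed Earth radius in metres; the server's constant may differ slightly.
const EARTH_RADIUS_M = 6378137;

// Great-circle distance in metres between two [longitude, latitude] points.
function haversineMetres([lng1, lat1], [lng2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

const near = [-122.4252, 37.8283];       // the `near` point from the slide
const candidate = [-118.2437, 34.0522];  // hypothetical document location
const maxDistance = 3000000;             // metres
const distanceMultiplier = 0.001;        // metres to kilometres

const metres = haversineMetres(near, candidate);
const withinRange = metres <= maxDistance;       // would this document be returned?
const reported = metres * distanceMultiplier;    // value stored in distanceField

console.log(withinRange, reported.toFixed(1) + " km");
```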

$lookup
{$lookup : {from: "containers", as: "cargo",
localField: "Name", foreignField: "shipName"}}

$group
{$group : {
  _id: {ship: "$Name", cargo: "$cargo.cargo", route: "$route", location: "$location"},
  sum: {$sum: "$cargo.Tons"}}}

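What a $group with a $sum accumulator computes can be sketched in plain JavaScript. The first container below comes from the example container document; the other three are invented for illustration, and the grouping key is reduced to ship and cargo to keep the sketch short.

```javascript
const containers = [
  { shipName: "MSC Zoe", cargo: "Whales", Tons: 38 },  // from the example document
  { shipName: "MSC Zoe", cargo: "Whales", Tons: 12 },  // hypothetical
  { shipName: "MSC Zoe", cargo: "Iron", Tons: 20 },    // hypothetical
  { shipName: "MSC Oscar", cargo: "Iron", Tons: 25 },  // hypothetical
];

// Accumulate tons per (ship, cargo) pair, like _id: {ship, cargo} with sum: {$sum: "$Tons"}.
const totals = new Map();
for (const c of containers) {
  const key = JSON.stringify({ ship: c.shipName, cargo: c.cargo });
  totals.set(key, (totals.get(key) || 0) + c.Tons);
}

// One output document per distinct key, shaped like $group output.
const grouped = [...totals].map(([key, sum]) => ({ _id: JSON.parse(key), sum }));

console.log(grouped);
```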
$project
var project = {$project: {
  _id : {ship: "$_id.ship", route: "$_id.route", location: "$_id.location"},
  cargo : {type : "$_id.cargo", Tons: "$sum"}}}
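The $lookup, $group and $project fragments above are shown one slide at a time. One way they might be chained is sketched below. The $unwind stage is an assumption that does not appear on the slides: $lookup writes an array into cargo, and unwinding it lets "$cargo.cargo" and "$cargo.Tons" refer to one container at a time. The variable names are also illustrative.

```javascript
const lookupStage = {
  $lookup: { from: "containers", as: "cargo", localField: "Name", foreignField: "shipName" },
};

// Assumption (not on the slides): one output document per joined container.
const unwindStage = { $unwind: "$cargo" };

const groupStage = {
  $group: {
    _id: { ship: "$Name", cargo: "$cargo.cargo", route: "$route", location: "$location" },
    sum: { $sum: "$cargo.Tons" },
  },
};

const projectStage = {
  $project: {
    _id: { ship: "$_id.ship", route: "$_id.route", location: "$_id.location" },
    cargo: { type: "$_id.cargo", Tons: "$sum" },
  },
};

const fullPipeline = [lookupStage, unwindStage, groupStage, projectStage];

// Only runs inside a mongo shell session, where `db` is defined.
if (typeof db !== "undefined") {
  db.ships.aggregate(fullPipeline).forEach(printjson);
}
```

The result would be one document per ship and cargo type, carrying the total tons of that cargo on board.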

Summary
• A pipeline of operations
• Select, project, group, sort, lookup
• $out must appear last in an aggregation pipeline
• A range of accumulators is available (see the $group
documentation)
• Very powerful way to reshape and analyze data
• Shard aware to gain maximum performance for large clusters
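As a closing sketch of two of the points above, the pipeline below uses two $sum accumulators and ends with $out. The output collection name is hypothetical, and outIsLast is a small illustrative helper, not part of any MongoDB API.

```javascript
const summaryPipeline = [
  { $match: { Country: "China" } },
  // Two accumulators: a count of ships and a total capacity per owner.
  { $group: { _id: "$Owner", ships: { $sum: 1 }, totalTEU: { $sum: "$Maximum TEU" } } },
  { $sort: { totalTEU: -1 } },
  { $out: "fleetByOwner" }, // hypothetical collection name; $out must be the last stage
];

// Illustrative check: true when $out, if present, appears only as the final stage.
const outIsLast = (stages) =>
  stages.every((stage, i) => !("$out" in stage) || i === stages.length - 1);

console.log(outIsLast(summaryPipeline));

// Only runs inside a mongo shell session, where `db` is defined.
if (typeof db !== "undefined") {
  db.ships.aggregate(summaryPipeline);
}
```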

Next Webinar
Production Deployment
• 28 July 2016 – 16:00 CEST, 11:00 ART, 9:00
• Register if you have not done so already!
• What do you need to know to make sure your MongoDB system
runs and scales in a production environment?
• In this talk we will walk through our ten-point checklist for
production deployment and look at the basics of some of the
automated tools MongoDB offers for managing systems in
production.
• Register at: https://www.mongodb.com/webinars
• Please give us your feedback: back-to-basics@mongodb.com
