SlideShare ist ein Scribd-Unternehmen logo
1 von 56
Getting Started with
Geospatial Data in MongoDB
Buzz Moschetti
Enterprise Architect
buzz.moschetti@mongodb.com
@buzzmoschetti
2
Agenda
• What is MongoDB?
• What does “geospatial capabilities” mean?
• GeoJSON
• Combining GeoJSON with non-geo data
• APIs and Use Cases
• Comparison to OGC (Open Geospatial Consortium)
• Indexing
• Using Geo Capabilities for non-Geo Things
• Esri and shapefiles
3
MongoDB:
The Post-Relational General Purpose Database
Document
Data Model
Open-
Source
Fully Featured
High Performance
Scalable
{
name: “John Smith”,
pfxs: [“Dr.”,”Mr.”],
address: “10 3rd St.”,
phone: {
home: 1234567890,
mobile: 1234568138 }
}
Nexus Architecture
Scalability
& Performance
Always On,
Global Deployments
Flexibility
Expressive Query Language
& Secondary Indexes
Strong Consistency
Enterprise Management
& Integrations
5
MongoDB Company Overview
~800 employees 2500+ customers
Over $311 million in funding
Offices in NY & Palo Alto and
across EMEA, and APAC
6
What is “Geo”?
At least 4 levels of capability
7
The Geo Stack
Efficiently store, query, and index arbitrary points, lines
and polygons in the DB
8
The Geo Stack
Platform for data analysis of peer data (trades/house
value/population/sales/widgets) grouped by geo data
Efficiently store, query, and index arbitrary points, lines
and polygons in the DB
9
The Geo Stack
Graphical rendering of geo shapes on a map
Platform for data analysis of peer data (trades/house
value/population/sales/widgets) grouped by geo data
Efficiently store, query, and index arbitrary points, lines
and polygons in the DB
10
The Geo Stack
Application(s) to browse and manipulate all the data
Graphical rendering of geo shapes on a map
Platform for data analysis of peer data (trades/house
value/population/sales/widgets) grouped by geo data
Efficiently store, query, and index arbitrary points, lines
and polygons in the DB
11
Important: Sometimes there is NO Map
• Geo stack must support geo functions WITHOUT a Map
• Offline reporting
• “Nightly fleet management report”
• “Distributor loss by assigned area”
• Compute/analytical processing
• Dynamic polygon generation
• Weather catastrophe simulation
• Other geo-filtering as input to analytics
12
MongoDB: The Data “Base”
Application(s) to browse and visualize
Graphical rendering of geo shapes on a map
Platform for data analysis of peer data
(trades/value/population/sales/widgets) grouped by
geo data
Efficiently store, query, and index arbitrary points,
lines, and polygons in the DB
Peer Data
Geo Data
13
One Persistor for All Applications & Use Cases
Google Map APIs
Browser / Mobile
Other Javascript
Peer Data
Geo Data
MongoDB node.js Driver
MongoDB python
Driver
Quant / Analytics
with pandas
Web Service Code
MongoDB Java
Driver
Nightly Reporting
Enterprise
Integration
MongoDB BI
Connector
Tableau
ClickView
PowerBI
Talend
Mule
Informatica
14
Balanced Reporting
• Most other NoSQL DBs do not have this capability
• Oracle, Postgres, MySQL, SQLServer do offer it and
subscribe to Open GeoS Consortium (OGC) standards
15
MongoDB data model is the major difference
MongoDB: Simple, parse-free, type-correct APIs and data to
manipulate and interrogate geo shapes
a.k.a. arrays (of arrays (of arrays))
OpenGIS: Piles of “ST_” functions:
http://postgis.net/docs/reference.html#Geometry_Accessors
SELECT ST_MakePolygon(
ST_GeomFromText(
'LINESTRING(75.15 29.53,77 29,77.6 29.5, 75.15 29.53)'));
16
Data & APIs
17
Legacy: 2D points
{
name: {f: “Buzz”, l: “Moschetti”},
favoritePets: [ “dog”, “cat” ],
house: [ -95.12345, 43.23423 ]
}
18
Better: GeoJSON
{
name: {f: “Buzz”, l: “Moschetti”},
favoritePets: [ “dog”, “cat” ],
house: {
type: “Point”,
coordinates: [ -95.12345, 43.23423 ]
}
}
19
Better: GeoJSON
{
name: “Superfund132”,
location: {
type: “Polygon”,
coordinates: [
[ [-95.12345, 43.2342],[-95.12456,43.2351],…]
[ [-92.8381, 43.75], … ] // “hole”
]
}
}
20
The GeoJSON Family
{
type: “Point”, “MultiPoint”, “LineString”,”MultiLineString”, “Polygon”,
“MultiPolygon”
coordinates: [ specific to type ]
}
{
type: “GeometryCollection”
geometries: [
{ type: (one of above),
coordinates: [ . . . ]
}
]
NO COMPUTED SHAPES
(Circle, Arc, Box, etc.)
We use the WGS84 standard:
http://spatialreference.org/ref/epsg/4326/
21
MongoDB Data Types are Geo-friendly
var poly = [
[ [-95.12345,43.2342],[-95.12345,43.2351],
[-95.12211,43.2351],[-95.12211,43.2342],
[-95.12345,43.2342] // close the loop!
]
];
db.myCollection.insert(
{name: {f: "Buzz", l: "Moschetti"},
favoritePets: ["dog", "cat"],
geo: { type: "Polygon“, coordinates: poly }
}));
22
… even with Java
Document doc = new Document();
doc.put("name", ”Superfund132");
List ring = new ArrayList();
addPoint(ring, -95.12345, 43.2342);
addPoint(ring, -95.12345, 43.2351);
addPoint(ring, -95.12211, 43.2351);
addPoint(ring, -95.12211, 43.2342);
addPoint(ring, -95.12345, 43.2342);
List poly = new ArrayList();
poly.add(ring);
Map mm = new HashMap();
mm.put("type", "Polygon");
mm.put("coordinates", poly);
doc.put("geo", mm);
coll.insertOne(doc);
static void addPoint(List ll,
double lng,
double lat) {
List pt = new ArrayList();
pt.add(lng);
pt.add(lat);
ll.add(pt);
}
23
All Types Are Preserved Correctly
Document doc = coll.find().first();
recursiveWalk(doc);
name: java.lang.String: Superfund132
geo: com.mongodb.BasicDBObject
type: java.lang.String: Polygon
coordinates: com.mongodb.BasicDBList
0: com.mongodb.BasicDBList
0: com.mongodb.BasicDBList
0: java.lang.Double: -95.12345
1: java.lang.Double: 43.2342
1: com.mongodb.BasicDBList
0: java.lang.Double: -95.12345
1: java.lang.Double: 43.2351
2: com.mongodb.BasicDBList
0: java.lang.Double: -95.12211
1: java.lang.Double: 43.2351
24
Comparison to “Good” PostGIS
import org.postgis.PGgeometry; // extended from org.postgresql.util.PGobject
((org.postgresql.Connection)conn).addDataType("geometry","org.postgis.PGgeometry"
)
String sql = "select geom from someTable”;
ResultSet r = stmt.executeQuery(sql);
while( r.next() ) {
PGgeometry geom = (PGgeometry)r.getObject(1);
if( geom.getType() = Geometry.POLYGON ) {
Polygon pl = (Polygon)geom.getGeometry();
for( int r = 0; r < pl.numRings(); r++) {
LinearRing rng = pl.getRing(r);
. . .
}
}
25
Comparison to most OpenGIS
String sql = "select ST_AsText(geom) from someTable”;
ResultSet r = stmt.executeQuery(sql);
while( r.next() ) {
String wkt = r.getString(1);
// wkt is ”POLYGON((0 0,0 1,1 1,1 0,0 0))”
// http://en.wikipedia.org/wiki/Well-known_text
// Now we have to parse the string into
// an array of array of doubles.
// Don’t want to introduce a 3rd party dependency…
// So . . . We write our own parser.
}
26
Checkpoint
We have data in and out of the DB using basic operations
(insert and find)
Now we need to make it performant!
27
Indexing
collection.createIndex({loc:”2d”})
When to use:
• Your database has legacy location data from MongoDB 2.2 or earlier
• You do not intend to store any location data as GeoJSON objects
• “Special Use Cases” e.g. arbitrary two numeric dimension indexing
collection.createIndex({loc:”2dsphere”})
When to use:
• Supports all GeoJSON objects and legacy [x,y] pairs
collection.createIndex({loc:”geoHaystack”})
When to use:
• Special small area flat (planar) lookup optimization
28
Indexing
collection.createIndex({loc:”2d”})
When to use:
• Your database has legacy location data from MongoDB 2.2 or earlier
• You do not intend to store any location data as GeoJSON objects
• “Special Use Cases” e.g. arbitrary two numeric dimension indexing
collection.createIndex({loc:”2dsphere”})
When to use:
• Supports all GeoJSON objects and legacy [x,y] pairs
collection.createIndex({loc:”geoHaystack”})
When to use:
• Special small area flat (planar) lookup optimization
29
find()/$match and Indexing
Operator Geometry Arg Type 2d 2dsphere
$geoWithin $box,$center,$polygon Y N
$geometry: { type, coordinates } N Y
$centerSphere: [ [x,y], radians ] Y Y
$geoIntersects $geometry only N Y
$near,$nearSphere [x,y] R -
(output sorted by distance) $geometry: {type, coordinates} - R
+ $minDistance N Y
+ $maxDistance Y Y
Y = will assist
N = will not assist
R = REQUIRED
Syntax helper:
find(“loc”:{$geoWithin: {$box: [ [x0,y0], [x1,y2] }});
find(“loc”:{$geoWithin: {$geometry: { type: “Polygon”, coordinates: [ …. ] }}} );
30
Aggregation Framework: $geoNear
Option 2D 2dsphere
$geoNear
(output sorted by distance)
near: { type: “Point”, coordinates } - R - and
spherical:true
near: [ x, y ] R (or) R
query: { expression INCL geo find()
on previous page EXCEPT $near}
N N
Y = will assist
N = will not assist
R = REQUIRED
Important Considerations:
1. You can only use $geoNear as the first stage of a pipeline.
2. You must include the distanceField option.
3. The collection must have only one geospatial index: one 2d index or one 2dsphere index.
4. You do not need to specify which field in the documents hold the coordinate pair or point.
Because $geoNear requires that the collection have a single geospatial index, $geoNear
implicitly uses the indexed field.
31
Use Cases
How do we bring data representation, fast lookup, and . . .
32
Case #1: Find Things in a Given Area + More
• Docs contain Points (or possibly “small” polygons)
• $geoWithin
db.site.aggregate([
{$match: { "loc": { $geoWithin: { $geometry:
{ type: "Polygon", coordinates: [ coords ] }}}
,"portfolio_id": portid
,“insuredValue“: {$gt: 1000000}
,“insuredDate“: {$gt: new ISODate(„2016-01-01“) }}
,{$bucket: {groupBy: „$insuredValue“,
boundaries: [ 1000000, 2000000, 5000000, 10000000,
20000000, Infinity] }}
. . .
33
Case #2: Find Things in an Area Stored in DB
• Get the shape from the “shapes” collection via query:
db.shapes.findOne({predicate},{theShape:1});
• Turn around and query the target collection, e.g.
buildingSites with shape:
db.buildingSites.find({loc:{$geoWithin: theShape}})
34
Case #3: Find Things Closest to where I am
db.buildingSites.aggregate([{$geoNear: { point … }});
• Results returned already in sorted order by closeness
35
Case #3.5: Find Things Closest to where I am but
within some bounds
• db.buildingSites.aggregate([
{$geoNear: {
query: { “loc”: {$geoWithin:
{$centerSphere: … }} }
(or)
query: { “loc”: {$geoWithin: {$geometry:
GeoJSON }} }
}} ])
36
When the Database isn’t enough
37
When the Database isn’t enough
• VERY fast intersection/within for many objects given
probes at high velocity (10000s/sec).
• Geo manipulation: unions, deltas, layering
• Dynamic/programmatic geo construction
• Advanced features: smoothing, simplifiers, centroids,
…
38
You Need Three Things
• Basic geo objects
• Geo operators like intersects, within, etc.
• Algos and smoothers, etc.
39
com.vividsolutions.jts
Map m = (Map) dbo.get("loc"); // get a GeoJSON object from MongoDB
List coords = (List) m.get("coordinates");
List outerRing = (List) coords.get(0); // polygon is array of array of pts
CoordinateSequence pseq = new CoordinateGeoJSONSequence(outerRing, true);
LinearRing outerlr = new LinearRing(pseq, gf);
int numHoles = coords.size() - 1; // -1 to skip outer ring;
LinearRing[] holes = null;
if(numHoles > 0) {
holes = new LinearRing[numHoles];
for(int k = 0; k < numHoles; k++) {
List innerRing = (List) coords.get(k+1); // +1 adj for outer ring
CoordinateSequence psx = new CoordinateGeoJSONSequence(innerRing, true);
holes[k] = new LinearRing(psx, gf);
}
}
Polygon poly1 = new Polygon(outerlr, holes, gf); // ok if holes was null
Point pt1 = new Point(pseq2, gf);
boolean a = pt1.intersects(poly1);
Geometry simplified = TopologyPreservingSimplifier.simplify(poly1, tolerance);
40
The Ecosystem
• OpenGeo runs over MongoDB!
http://suite.opengeo.org/docs/latest/dataadmin/mongodb/index.html
• BoundlessGeo: Commerical support for OpenGeo over MongoDB
* Provides top 2 tiers (viz, analysis)
* https://boundlessgeo.com
41
Geo Capabilities beyond “Simple Geo”
42
Geo as Date Range
{
who: ‘john’
where: ‘mongodb’
what: ‘lightning talk’
start: ISODate(“2016-06-30T15:00:00”)
end: ISODate(“2016-06-30T15:05:00”)
}
What events were happening at 15:03?
collection.find({
start : { $lte:ISODate(“2016-06-30T15:05:03”)},
end: { $gte:ISODate(“2016-06-30T15:05:03”)}
})
43
Geo as Date Range
• Ranges on 2 attributes – Two BTree walks (intersection)
• Assuming time can be anywhere in range of records, index walk is average 50% of index
• Test: Macbook Pro, i5, 16GB RAM, data fits in WT Cache easily. Warmed up. Average of
100 runs
694ms /query using index
487ms /query – COLLSCAN!
(StartDate,EndDate)  Range Type using Geo
LineString “Query”
longitude = t
Lat=“quantized”variable
0
1
2
{
who: ‘john’
where: ‘mongodb’
what: ‘lightning talk’
start: ISODate(“2016-06-30T15:00:00”)
end: ISODate(“2016-06-30T15:05:00”)
se_equiv: { type: “LineString”,
coordinates: [[ -123.1232, 1], [-121.6253, 1]]
}
45
Over 10X performance improvement!
start2 = (((start / yearsecs) - 45) *90) – 90
end2 = (((end / yearsecs) - 45) *90) – 90
event = { type: "LineString", coordinates: [ [ start2, 1 ], [end2, 1 ] ] }
// dx = is the LineString “query”
query = {
g: {
$geoIntersects: {
$geometry: { type: "LineString",
coordinates: [ [ dx, 0 ], [dx, 2 ] ] }
}
}
} 45ms!

46
Mr. Smiley
db.participatingVendors.aggregate([
// Stage 1: The Bounding Box
{$match: { "loc": { $geoWithin: { $geometry:
{ type: "Polygon", coordinates: mapBoundingBox}}}
}}
// Stage 2: Compute distance from Mr. Smiley to the points: Pythagorean theorem:
,{$addFields: {dist: {$sqrt: {$add: [
{$pow:[ {$subtract: [ {$arrayElemAt:[ "$loc.coordinates",0]}, mslng]} , 2]}
,{$pow:[ {$subtract:[ {$arrayElemAt:[ "$loc.coordinates", 1]}, mslat]} , 2]}
]}
}}}
// Stage 3: If the distance between points is LTE the sum of the radii, then
// Mr. Smiley's circle intersects that of the participant:
// Project 0 (no touch) or 1 (touch)
,{$addFields: {"touch": {$cond: [
{$lte: [ {$add: [ "$promoRadius", msr ]}, "$dist" ]}, 0, 1
]}}}
,{$match: {"touch": 1}}
]);
// Assume Mr. Smiley has these params:
var mslng = -90.00;
var mslat = 42.00;
var msr = 0.005; // ~1500 ft radius around him
47
The Pusher
var pts = [ [-74.01,40.70], [-73.99,40.71], . . .];
db.foo.insert({_id:1,
loc: { type:"LineString", coordinates: [ pts[0], pts[1] ]}});
// Push points onto LineString to "extend it” in an index optimized way!
for(i = 2; i < pts.length; i++) {
db.foo.update({_id:1},{$push: {"loc.coordinates”: pts[i]}});
// Perform other functions, e.g.
c=db.foo.find({loc: {$geoIntersects:
{$geometry: { type: ”Polygon", coordinates: … } } });
}
48
Perimeter of Simple Polygon
> db.foo.insert({_id:1, "poly": [ [0,0], [2,12], [4,0], [2,5], [0,0] ] });
> db.foo.insert({_id:2, "poly": [ [2,2], [5,8], [6,0], [3,1], [2,2] ] });
> db.foo.aggregate([
{$project: {"conv": {$map: { input: "$poly", as: "z", in: {
x: {$arrayElemAt: ["$$z”,0]}, y: {$arrayElemAt: ["$$z”,1]}
,len: {$literal: 0} }}}}}
,{$addFields: {first: {$arrayElemAt: [ "$conv", 0 ]} }}
,{$project: {"qqq":
{$reduce: { input: "$conv", initialValue: "$first", in: {
x: "$$this.x”, y: "$$this.y"
,len: {$add: ["$$value.len", // len = oldlen + newLen
{$sqrt: {$add: [
{$pow:[ {$subtract:["$$value.x","$$this.x"]}, 2]}
,{$pow:[ {$subtract:["$$value.y","$$this.y"]}, 2]}
] }} ] } }}
,{$project: {"len": "$qqq.len"}}
{ "_id" : 1, “len" : 35.10137973546188 }
{ "_id" : 2, "len" : 19.346952903339393 }
49
Geospatial = 2D Numeric Indexable Space
Find all branches close to my location:
target = [ someLatitude, someLongitude ];
radians = 10 / 3963.2; // 10 miles
db.coll.find({"location": { $geoWithin :
{ $center : [ target, radians ] }}});
Find nearest investments on efficient frontier:
target = [ risk, reward ];
closeness = someFunction(risk, reward);
db.coll.find({”investmentValue": { $geoWithin :
{ $center : [ target, closeness]}}});
50
Basic Tips & Tricks
• We use [long,lat], not [lat,long] like Google Maps
• Use 2dsphere for geo; avoid legacy $box/$circle/$polygon
• Use 2d for true 2d numeric hacks
• 5 digits beyond decimal is accurate to 1.1m:
• var loc = [ -92.12345, 42.56789] // FINE
• var loc = [ -92.123459261145, 42.567891378902] // ABSURD
• $geoWithin and $geoIntersects do not REQUIRE index
• Remember to close loops (it’s GeoJSON!)
51
esri-related Tips & Tricks
• Shapefiles are everywhere; google shapefile <whatever>
• Crack shapefiles to GeoJSON with python pyshp module
import shapefile
import sys
from json import dumps
reader = shapefile.Reader(sys.argv[1])
field_names = [field[0] for field in reader.fields[1:] ]
buffer = []
for sr in reader.shapeRecords():
buffer.append(dict(geometry=sr.shape.__geo_interface__,
properties=dict(zip(field_names, sr.record))))
sys.stdout.write(dumps({"type": "FeatureCollection”, "features": buffer},
indent=2) + "n”)
Q & A
Thank You!
Buzz Moschetti
Enterprise Architect
buzz.moschetti@mongodb.com
@buzzmoschetti
54
Agenda
• What does geospatial capabilities mean?
• The "levels": DB with geo types, rendering, analytics
• MongoDB brings together geo AND non-geo data
• Geo Data model
• GeoJSON
• Combining GeoJSON with non-geo data
• APIs and Use Cases
• Looking up things contained in a polygon
• Finding things near a point
• Summary of geo ops e.g. $center
• $geoNear and the agg framework
• The power of the document model and MongoDB APIs
• Arrays and rich shapes as first class types
• Comparison to Postgres (PostGIS)

Indexing
• Geospatial queries do not necessarily require an index
• 2d vs. 2dsphere
• Geo stacks and the Ecosystem
• MongoDB and OpenGIS and OpenGEO
• Google Maps
• MEAN
• A Sampling of Geo Solutions
• Mr. Smiley, etc.
• Integration with esri and shapefiles
• esri shapefile cracking
55
Clever Hacks
• John Page date thing
• Mr. Smiley
• Wildfire
• Push pts on a LineString and check for intersects
• Perimeter & Area of simple polygon
• makeNgon
56
MongoDB handles your data + geo
Google handles the maps
Our Drivers
Google Map APIs
Chrome
Other Javascript
Your Data
GeoJSON
• Organization unit is document, then
collection
• Geo data can contain arbitrary peer
data or higher scope within doc
• Proper handling of int, long, double,
and decimal128
• Dates are REAL datetimes
• Uniform indexability and querability
across geo and “regular” data
Your Server-side
code

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction aux bases de données
Introduction aux bases de donnéesIntroduction aux bases de données
Introduction aux bases de données
miniloka25
 

Was ist angesagt? (20)

Conférence - Loi 18-07 du 10 Juin 2018 : la protection des données à caractèr...
Conférence - Loi 18-07 du 10 Juin 2018 : la protection des données à caractèr...Conférence - Loi 18-07 du 10 Juin 2018 : la protection des données à caractèr...
Conférence - Loi 18-07 du 10 Juin 2018 : la protection des données à caractèr...
 
Introduction au langage SQL
Introduction au langage SQLIntroduction au langage SQL
Introduction au langage SQL
 
PostGIS 시작하기
PostGIS 시작하기PostGIS 시작하기
PostGIS 시작하기
 
BigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-ReduceBigData_Chp2: Hadoop & Map-Reduce
BigData_Chp2: Hadoop & Map-Reduce
 
OTB: logiciel libre de traitement d'images satellites
OTB: logiciel libre de traitement d'images satellitesOTB: logiciel libre de traitement d'images satellites
OTB: logiciel libre de traitement d'images satellites
 
Sql
SqlSql
Sql
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research Objects
 
[FOSS4G Korea 2021]Workshop-QGIS-TIPS-20211028
[FOSS4G Korea 2021]Workshop-QGIS-TIPS-20211028[FOSS4G Korea 2021]Workshop-QGIS-TIPS-20211028
[FOSS4G Korea 2021]Workshop-QGIS-TIPS-20211028
 
Big data
Big dataBig data
Big data
 
TP1 Big Data - MapReduce
TP1 Big Data - MapReduceTP1 Big Data - MapReduce
TP1 Big Data - MapReduce
 
Exposé 1
Exposé   1Exposé   1
Exposé 1
 
Aligner vos données avec Wikidata grâce à l'outil Open Refine
Aligner vos données avec Wikidata grâce à l'outil Open RefineAligner vos données avec Wikidata grâce à l'outil Open Refine
Aligner vos données avec Wikidata grâce à l'outil Open Refine
 
BigData_TP3 : Spark
BigData_TP3 : SparkBigData_TP3 : Spark
BigData_TP3 : Spark
 
Une introduction à Hive
Une introduction à HiveUne introduction à Hive
Une introduction à Hive
 
Formation PHP
Formation PHPFormation PHP
Formation PHP
 
Large Scale Geospatial Indexing and Analysis on Apache Spark
Large Scale Geospatial Indexing and Analysis on Apache SparkLarge Scale Geospatial Indexing and Analysis on Apache Spark
Large Scale Geospatial Indexing and Analysis on Apache Spark
 
Geoprocessing with Neo4j-Spatial and OSM
Geoprocessing with Neo4j-Spatial and OSMGeoprocessing with Neo4j-Spatial and OSM
Geoprocessing with Neo4j-Spatial and OSM
 
Introduction to Machine learning
Introduction to Machine learningIntroduction to Machine learning
Introduction to Machine learning
 
Le "Lac de données" de l'Ina, un projet pour placer la donnée au cœur de l'or...
Le "Lac de données" de l'Ina, un projet pour placer la donnée au cœur de l'or...Le "Lac de données" de l'Ina, un projet pour placer la donnée au cœur de l'or...
Le "Lac de données" de l'Ina, un projet pour placer la donnée au cœur de l'or...
 
Introduction aux bases de données
Introduction aux bases de donnéesIntroduction aux bases de données
Introduction aux bases de données
 

Ähnlich wie Getting Started with Geospatial Data in MongoDB

Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentation
Murat Çakal
 
Geospatial Enhancements in MongoDB 2.4
Geospatial Enhancements in MongoDB 2.4Geospatial Enhancements in MongoDB 2.4
Geospatial Enhancements in MongoDB 2.4
MongoDB
 
Social Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDBSocial Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDB
Takahiro Inoue
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analytics
MongoDB
 
MongoDb and NoSQL
MongoDb and NoSQLMongoDb and NoSQL
MongoDb and NoSQL
TO THE NEW | Technology
 

Ähnlich wie Getting Started with Geospatial Data in MongoDB (20)

Mongodb intro
Mongodb introMongodb intro
Mongodb intro
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
 
Geoindexing with MongoDB
Geoindexing with MongoDBGeoindexing with MongoDB
Geoindexing with MongoDB
 
Webinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsWebinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev Teams
 
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyOSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
 
Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentation
 
2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo
 
Using MongoDB and Python
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and Python
 
TDC2016SP - Trilha NoSQL
TDC2016SP - Trilha NoSQLTDC2016SP - Trilha NoSQL
TDC2016SP - Trilha NoSQL
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
Geospatial Enhancements in MongoDB 2.4
Geospatial Enhancements in MongoDB 2.4Geospatial Enhancements in MongoDB 2.4
Geospatial Enhancements in MongoDB 2.4
 
Social Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDBSocial Data and Log Analysis Using MongoDB
Social Data and Log Analysis Using MongoDB
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analytics
 
MongoDb and NoSQL
MongoDb and NoSQLMongoDb and NoSQL
MongoDb and NoSQL
 
Mongodb ExpressJS HandlebarsJS NodeJS FullStack
Mongodb ExpressJS HandlebarsJS NodeJS FullStackMongodb ExpressJS HandlebarsJS NodeJS FullStack
Mongodb ExpressJS HandlebarsJS NodeJS FullStack
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)
 
LocationTech Projects
LocationTech ProjectsLocationTech Projects
LocationTech Projects
 
Indexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleIndexing Strategies to Help You Scale
Indexing Strategies to Help You Scale
 

Mehr von MongoDB

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Kürzlich hochgeladen

%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 

Kürzlich hochgeladen (20)

OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 

Getting Started with Geospatial Data in MongoDB

  • 1. Getting Started with Geospatial Data in MongoDB Buzz Moschetti Enterprise Architect buzz.moschetti@mongodb.com @buzzmoschetti
  • 2. 2 Agenda • What is MongoDB? • What does “geospatial capabilities” mean? • GeoJSON • Combining GeoJSON with non-geo data • APIs and Use Cases • Comparison to OGC (Open Geospatial Consortium) • Indexing • Using Geo Capabilities for non-Geo Things • Esri and shapefiles
  • 3. 3 MongoDB: The Post-Relational General Purpose Database Document Data Model Open- Source Fully Featured High Performance Scalable { name: “John Smith”, pfxs: [“Dr.”,”Mr.”], address: “10 3rd St.”, phone: { home: 1234567890, mobile: 1234568138 } }
  • 4. Nexus Architecture Scalability & Performance Always On, Global Deployments Flexibility Expressive Query Language & Secondary Indexes Strong Consistency Enterprise Management & Integrations
  • 5. 5 MongoDB Company Overview ~800 employees 2500+ customers Over $311 million in funding Offices in NY & Palo Alto and across EMEA, and APAC
  • 6. 6 What is “Geo”? At least 4 levels of capability
  • 7. 7 The Geo Stack Efficiently store, query, and index arbitrary points, lines and polygons in the DB
  • 8. 8 The Geo Stack Platform for data analysis of peer data (trades/house value/population/sales/widgets) grouped by geo data Efficiently store, query, and index arbitrary points, lines and polygons in the DB
  • 9. 9 The Geo Stack Graphical rendering of geo shapes on a map Platform for data analysis of peer data (trades/house value/population/sales/widgets) grouped by geo data Efficiently store, query, and index arbitrary points, lines and polygons in the DB
  • 10. 10 The Geo Stack Application(s) to browse and manipulate all the data Graphical rendering of geo shapes on a map Platform for data analysis of peer data (trades/house value/population/sales/widgets) grouped by geo data Efficiently store, query, and index arbitrary points, lines and polygons in the DB
  • 11. 11 Important: Sometimes there is NO Map • Geo stack must support geo functions WITHOUT a Map • Offline reporting • “Nightly fleet management report” • “Distributor loss by assigned area” • Compute/analytical processing • Dynamic polygon generation • Weather catastrophe simulation • Other geo-filtering as input to analytics
  • 12. 12 MongoDB: The Data “Base” Application(s) to browse and visualize Graphical rendering of geo shapes on a map Platform for data analysis of peer data (trades/value/population/sales/widgets) grouped by geo data Efficiently store, query, and index arbitrary points, lines, and polygons in the DB Peer Data Geo Data
  • 13. 13 One Persistor for All Applications & Use Cases Google Map APIs Browser / Mobile Other Javascript Peer Data Geo Data MongoDB node.js Driver MongoDB python Driver Quant / Analytics with pandas Web Service Code MongoDB Java Driver Nightly Reporting Enterprise Integration MongoDB BI Connector Tableau ClickView PowerBI Talend Mule Informatica
  • 14. 14 Balanced Reporting • Most other NoSQL DBs do not have this capability • Oracle, Postgres, MySQL, SQLServer do offer it and subscribe to Open GeoS Consortium (OGC) standards
  • 15. 15 MongoDB data model is the major difference MongoDB: Simple, parse-free, type-correct APIs and data to manipulate and interrogate geo shapes a.k.a. arrays (of arrays (of arrays)) OpenGIS: Piles of “ST_” functions: http://postgis.net/docs/reference.html#Geometry_Accessors SELECT ST_MakePolygon( ST_GeomFromText( 'LINESTRING(75.15 29.53,77 29,77.6 29.5, 75.15 29.53)'));
  • 17. 17 Legacy: 2D points { name: {f: “Buzz”, l: “Moschetti”}, favoritePets: [ “dog”, “cat” ], house: [ -95.12345, 43.23423 ] }
  • 18. 18 Better: GeoJSON { name: {f: “Buzz”, l: “Moschetti”}, favoritePets: [ “dog”, “cat” ], house: { type: “Point”, coordinates: [ -95.12345, 43.23423 ] } }
  • 19. 19 Better: GeoJSON { name: “Superfund132”, location: { type: “Polygon”, coordinates: [ [ [-95.12345, 43.2342],[-95.12456,43.2351],…] [ [-92.8381, 43.75], … ] // “hole” ] } }
  • 20. 20 The GeoJSON Family { type: “Point”, “MultiPoint”, “LineString”,”MultiLineString”, “Polygon”, “MultiPolygon” coordinates: [ specific to type ] } { type: “GeometryCollection” geometries: [ { type: (one of above), coordinates: [ . . . ] } ] NO COMPUTED SHAPES (Circle, Arc, Box, etc.) We use the WGS84 standard: http://spatialreference.org/ref/epsg/4326/
  • 21. 21 MongoDB Data Types are Geo-friendly var poly = [ [ [-95.12345,43.2342],[-95.12345,43.2351], [-95.12211,43.2351],[-95.12211,43.2342], [-95.12345,43.2342] // close the loop! ] ]; db.myCollection.insert( {name: {f: "Buzz", l: "Moschetti"}, favoritePets: ["dog", "cat"], geo: { type: "Polygon“, coordinates: poly } }));
  • 22. 22 … even with Java Document doc = new Document(); doc.put("name", ”Superfund132"); List ring = new ArrayList(); addPoint(ring, -95.12345, 43.2342); addPoint(ring, -95.12345, 43.2351); addPoint(ring, -95.12211, 43.2351); addPoint(ring, -95.12211, 43.2342); addPoint(ring, -95.12345, 43.2342); List poly = new ArrayList(); poly.add(ring); Map mm = new HashMap(); mm.put("type", "Polygon"); mm.put("coordinates", poly); doc.put("geo", mm); coll.insertOne(doc); static void addPoint(List ll, double lng, double lat) { List pt = new ArrayList(); pt.add(lng); pt.add(lat); ll.add(pt); }
  • 23. 23 All Types Are Preserved Correctly Document doc = coll.find().first(); recursiveWalk(doc); name: java.lang.String: Superfund132 geo: com.mongodb.BasicDBObject type: java.lang.String: Polygon coordinates: com.mongodb.BasicDBList 0: com.mongodb.BasicDBList 0: com.mongodb.BasicDBList 0: java.lang.Double: -95.12345 1: java.lang.Double: 43.2342 1: com.mongodb.BasicDBList 0: java.lang.Double: -95.12345 1: java.lang.Double: 43.2351 2: com.mongodb.BasicDBList 0: java.lang.Double: -95.12211 1: java.lang.Double: 43.2351
  • 24. 24 Comparison to “Good” PostGIS import org.postgis.PGgeometry; // extended from org.postgresql.util.PGobject ((org.postgresql.Connection)conn).addDataType("geometry","org.postgis.PGgeometry" ) String sql = "select geom from someTable”; ResultSet r = stmt.executeQuery(sql); while( r.next() ) { PGgeometry geom = (PGgeometry)r.getObject(1); if( geom.getType() = Geometry.POLYGON ) { Polygon pl = (Polygon)geom.getGeometry(); for( int r = 0; r < pl.numRings(); r++) { LinearRing rng = pl.getRing(r); . . . } }
  • 25. 25 Comparison to most OpenGIS String sql = "select ST_AsText(geom) from someTable”; ResultSet r = stmt.executeQuery(sql); while( r.next() ) { String wkt = r.getString(1); // wkt is ”POLYGON((0 0,0 1,1 1,1 0,0 0))” // http://en.wikipedia.org/wiki/Well-known_text // Now we have to parse the string into // an array of array of doubles. // Don’t want to introduce a 3rd party dependency… // So . . . We write our own parser. }
  • 26. 26 Checkpoint We have data in and out of the DB using basic operations (insert and find) Now we need to make it performant!
  • 27. 27 Indexing collection.createIndex({loc:”2d”}) When to use: • Your database has legacy location data from MongoDB 2.2 or earlier • You do not intend to store any location data as GeoJSON objects • “Special Use Cases” e.g. arbitrary two numeric dimension indexing collection.createIndex({loc:”2dsphere”}) When to use: • Supports all GeoJSON objects and legacy [x,y] pairs collection.createIndex({loc:”geoHaystack”}) When to use: • Special small area flat (planar) lookup optimization
  • 28. 28 Indexing collection.createIndex({loc:”2d”}) When to use: • Your database has legacy location data from MongoDB 2.2 or earlier • You do not intend to store any location data as GeoJSON objects • “Special Use Cases” e.g. arbitrary two numeric dimension indexing collection.createIndex({loc:”2dsphere”}) When to use: • Supports all GeoJSON objects and legacy [x,y] pairs collection.createIndex({loc:”geoHaystack”}) When to use: • Special small area flat (planar) lookup optimization
  • 29. 29 find()/$match and Indexing Operator Geometry Arg Type 2d 2dsphere $geoWithin $box,$center,$polygon Y N $geometry: { type, coordinates } N Y $centerSphere: [ [x,y], radians ] Y Y $geoIntersects $geometry only N Y $near,$nearSphere [x,y] R - (output sorted by distance) $geometry: {type, coordinates} - R + $minDistance N Y + $maxDistance Y Y Y = will assist N = will not assist R = REQUIRED Syntax helper: find(“loc”:{$geoWithin: {$box: [ [x0,y0], [x1,y2] }}); find(“loc”:{$geoWithin: {$geometry: { type: “Polygon”, coordinates: [ …. ] }}} );
  • 30. 30 Aggregation Framework: $geoNear Option 2D 2dsphere $geoNear (output sorted by distance) near: { type: “Point”, coordinates } - R - and spherical:true near: [ x, y ] R (or) R query: { expression INCL geo find() on previous page EXCEPT $near} N N Y = will assist N = will not assist R = REQUIRED Important Considerations: 1. You can only use $geoNear as the first stage of a pipeline. 2. You must include the distanceField option. 3. The collection must have only one geospatial index: one 2d index or one 2dsphere index. 4. You do not need to specify which field in the documents hold the coordinate pair or point. Because $geoNear requires that the collection have a single geospatial index, $geoNear implicitly uses the indexed field.
  • 31. 31 Use Cases How do we bring data representation, fast lookup, and . . .
  • 32. 32 Case #1: Find Things in a Given Area + More • Docs contain Points (or possibly “small” polygons) • $geoWithin db.site.aggregate([ {$match: { "loc": { $geoWithin: { $geometry: { type: "Polygon", coordinates: [ coords ] }}} ,"portfolio_id": portid ,“insuredValue“: {$gt: 1000000} ,“insuredDate“: {$gt: new ISODate(„2016-01-01“) }} ,{$bucket: {groupBy: „$insuredValue“, boundaries: [ 1000000, 2000000, 5000000, 10000000, 20000000, Infinity] }} . . .
  • 33. 33 Case #2: Find Things in an Area Stored in DB • Get the shape from the “shapes” collection via query: db.shapes.findOne({predicate},{theShape:1}); • Turn around and query the target collection, e.g. buildingSites with shape: db.buildingSites.find({loc:{$geoWithin: theShape}})
  • 34. 34 Case #3: Find Things Closest to where I am db.buildingSites.aggregate([{$geoNear: { point … }}); • Results returned already in sorted order by closeness
  • 35. 35 Case #3.5: Find Things Closest to where I am but within some bounds • db.buildingSites.aggregate([ {$geoNear: { query: { “loc”: {$geoWithin: {$centerSphere: … }} } (or) query: { “loc”: {$geoWithin: {$geometry: GeoJSON }} } }} ])
  • 36. 36 When the Database isn’t enough
  • 37. 37 When the Database isn’t enough • VERY fast intersection/within for many objects given probes at high velocity (10000s/sec). • Geo manipulation: unions, deltas, layering • Dynamic/programmatic geo construction • Advanced features: smoothing, simplifiers, centroids, …
  • 38. 38 You Need Three Things • Basic geo objects • Geo operators like intersects, within, etc. • Algos and smoothers, etc.
  • 39. 39 com.vividsolutions.jts Map m = (Map) dbo.get("loc"); // get a GeoJSON object from MongoDB List coords = (List) m.get("coordinates"); List outerRing = (List) coords.get(0); // polygon is array of array of pts CoordinateSequence pseq = new CoordinateGeoJSONSequence(outerRing, true); LinearRing outerlr = new LinearRing(pseq, gf); int numHoles = coords.size() - 1; // -1 to skip outer ring; LinearRing[] holes = null; if(numHoles > 0) { holes = new LinearRing[numHoles]; for(int k = 0; k < numHoles; k++) { List innerRing = (List) coords.get(k+1); // +1 adj for outer ring CoordinateSequence psx = new CoordinateGeoJSONSequence(innerRing, true); holes[k] = new LinearRing(psx, gf); } } Polygon poly1 = new Polygon(outerlr, holes, gf); // ok if holes was null Point pt1 = new Point(pseq2, gf); boolean a = pt1.intersects(poly1); Geometry simplified = TopologyPreservingSimplifier.simplify(poly1, tolerance);
  • 40. 40 The Ecosystem • OpenGeo runs over MongoDB! http://suite.opengeo.org/docs/latest/dataadmin/mongodb/index.html • BoundlessGeo: Commerical support for OpenGeo over MongoDB * Provides top 2 tiers (viz, analysis) * https://boundlessgeo.com
  • 41. 41 Geo Capabilities beyond “Simple Geo”
  • 42. 42 Geo as Date Range { who: ‘john’ where: ‘mongodb’ what: ‘lightning talk’ start: ISODate(“2016-06-30T15:00:00”) end: ISODate(“2016-06-30T15:05:00”) } What events were happening at 15:03? collection.find({ start : { $lte:ISODate(“2016-06-30T15:05:03”)}, end: { $gte:ISODate(“2016-06-30T15:05:03”)} })
  • 43. 43 Geo as Date Range • Ranges on 2 attributes – Two BTree walks (intersection) • Assuming time can be anywhere in range of records, index walk is average 50% of index • Test: Macbook Pro, i5, 16GB RAM, data fits in WT Cache easily. Warmed up. Average of 100 runs 694ms /query using index 487ms /query – COLLSCAN!
  • 44. (StartDate,EndDate)  Range Type using Geo LineString “Query” longitude = t Lat=“quantized”variable 0 1 2 { who: ‘john’ where: ‘mongodb’ what: ‘lightning talk’ start: ISODate(“2016-06-30T15:00:00”) end: ISODate(“2016-06-30T15:05:00”) se_equiv: { type: “LineString”, coordinates: [[ -123.1232, 1], [-121.6253, 1]] }
  • 45. 45 Over 10X performance improvement! start2 = (((start / yearsecs) - 45) *90) – 90 end2 = (((end / yearsecs) - 45) *90) – 90 event = { type: "LineString", coordinates: [ [ start2, 1 ], [end2, 1 ] ] } // dx = is the LineString “query” query = { g: { $geoIntersects: { $geometry: { type: "LineString", coordinates: [ [ dx, 0 ], [dx, 2 ] ] } } } } 45ms! 
  • 46. 46 Mr. Smiley db.participatingVendors.aggregate([ // Stage 1: The Bounding Box {$match: { "loc": { $geoWithin: { $geometry: { type: "Polygon", coordinates: mapBoundingBox}}} }} // Stage 2: Compute distance from Mr. Smiley to the points: Pythagorean theorem: ,{$addFields: {dist: {$sqrt: {$add: [ {$pow:[ {$subtract: [ {$arrayElemAt:[ "$loc.coordinates",0]}, mslng]} , 2]} ,{$pow:[ {$subtract:[ {$arrayElemAt:[ "$loc.coordinates", 1]}, mslat]} , 2]} ]} }}} // Stage 3: If the distance between points is LTE the sum of the radii, then // Mr. Smiley's circle intersects that of the participant: // Project 0 (no touch) or 1 (touch) ,{$addFields: {"touch": {$cond: [ {$lte: [ {$add: [ "$promoRadius", msr ]}, "$dist" ]}, 0, 1 ]}}} ,{$match: {"touch": 1}} ]); // Assume Mr. Smiley has these params: var mslng = -90.00; var mslat = 42.00; var msr = 0.005; // ~1500 ft radius around him
  • 47. 47 The Pusher var pts = [ [-74.01,40.70], [-73.99,40.71], . . .]; db.foo.insert({_id:1, loc: { type:"LineString", coordinates: [ pts[0], pts[1] ]}}); // Push points onto LineString to "extend it” in an index optimized way! for(i = 2; i < pts.length; i++) { db.foo.update({_id:1},{$push: {"loc.coordinates”: pts[i]}}); // Perform other functions, e.g. c=db.foo.find({loc: {$geoIntersects: {$geometry: { type: ”Polygon", coordinates: … } } }); }
  • 48. 48 Perimeter of Simple Polygon > db.foo.insert({_id:1, "poly": [ [0,0], [2,12], [4,0], [2,5], [0,0] ] }); > db.foo.insert({_id:2, "poly": [ [2,2], [5,8], [6,0], [3,1], [2,2] ] }); > db.foo.aggregate([ {$project: {"conv": {$map: { input: "$poly", as: "z", in: { x: {$arrayElemAt: ["$$z”,0]}, y: {$arrayElemAt: ["$$z”,1]} ,len: {$literal: 0} }}}}} ,{$addFields: {first: {$arrayElemAt: [ "$conv", 0 ]} }} ,{$project: {"qqq": {$reduce: { input: "$conv", initialValue: "$first", in: { x: "$$this.x”, y: "$$this.y" ,len: {$add: ["$$value.len", // len = oldlen + newLen {$sqrt: {$add: [ {$pow:[ {$subtract:["$$value.x","$$this.x"]}, 2]} ,{$pow:[ {$subtract:["$$value.y","$$this.y"]}, 2]} ] }} ] } }} ,{$project: {"len": "$qqq.len"}} { "_id" : 1, “len" : 35.10137973546188 } { "_id" : 2, "len" : 19.346952903339393 }
  • 49. 49 Geospatial = 2D Numeric Indexable Space Find all branches close to my location: target = [ someLatitude, someLongitude ]; radians = 10 / 3963.2; // 10 miles db.coll.find({"location": { $geoWithin : { $center : [ target, radians ] }}}); Find nearest investments on efficient frontier: target = [ risk, reward ]; closeness = someFunction(risk, reward); db.coll.find({”investmentValue": { $geoWithin : { $center : [ target, closeness]}}});
  • 50. 50 Basic Tips & Tricks • We use [long,lat], not [lat,long] like Google Maps • Use 2dsphere for geo; avoid legacy $box/$circle/$polygon • Use 2d for true 2d numeric hacks • 5 digits beyond decimal is accurate to 1.1m: • var loc = [ -92.12345, 42.56789] // FINE • var loc = [ -92.123459261145, 42.567891378902] // ABSURD • $geoWithin and $geoIntersects do not REQUIRE index • Remember to close loops (it’s GeoJSON!)
  • 51. 51 esri-related Tips & Tricks • Shapefiles are everywhere; google shapefile <whatever> • Crack shapefiles to GeoJSON with python pyshp module import shapefile import sys from json import dumps reader = shapefile.Reader(sys.argv[1]) field_names = [field[0] for field in reader.fields[1:] ] buffer = [] for sr in reader.shapeRecords(): buffer.append(dict(geometry=sr.shape.__geo_interface__, properties=dict(zip(field_names, sr.record)))) sys.stdout.write(dumps({"type": "FeatureCollection”, "features": buffer}, indent=2) + "n”)
  • 52. Q & A
  • 53. Thank You! Buzz Moschetti Enterprise Architect buzz.moschetti@mongodb.com @buzzmoschetti
  • 54. 54 Agenda • What does geospatial capabilities mean? • The "levels": DB with geo types, rendering, analytics • MongoDB brings together geo AND non-geo data • Geo Data model • GeoJSON • Combining GeoJSON with non-geo data • APIs and Use Cases • Looking up things contained in a polygon • Finding things near a point • Summary of geo ops e.g. $center • $geoNear and the agg framework • The power of the document model and MongoDB APIs • Arrays and rich shapes as first class types • Comparison to Postgres (PostGIS) Indexing • Geospatial queries do not necessarily require an index • 2d vs. 2dsphere • Geo stacks and the Ecosystem • MongoDB and OpenGIS and OpenGEO • Google Maps • MEAN • A Sampling of Geo Solutions • Mr. Smiley, etc. • Integration with esri and shapefiles • esri shapefile cracking
  • 55. 55 Clever Hacks • John Page date thing • Mr. Smiley • Wildfire • Push pts on a LineString and check for intersects • Perimeter & Area of simple polygon • makeNgon
  • 56. 56 MongoDB handles your data + geo Google handles the maps Our Drivers Google Map APIs Chrome Other Javascript Your Data GeoJSON • Organization unit is document, then collection • Geo data can contain arbitrary peer data or higher scope within doc • Proper handling of int, long, double, and decimal128 • Dates are REAL datetimes • Uniform indexability and querability across geo and “regular” data Your Server-side code