Scalable Geospatial Queries
with Presto
Maria Basmanova
July 2018
Geospatial Data
• Values of type Geometry
• Points – location information (latitude and longitude)
• Lines – roads, cables
• Polygons – countries, regions, provinces, cities, cell tower coverage areas
• Stored as strings in Well-Known-Text (WKT) format
CC BY-SA 3.0 https://en.wikipedia.org/wiki/Well-known_text
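To make the WKT format concrete, here is a minimal Python sketch that formats the three basic geometry types as WKT strings. The helper names are hypothetical and exist only for this illustration; they are not part of Presto.

```python
def point_wkt(lng, lat):
    """Format a point as WKT; WKT lists x (longitude) before y (latitude)."""
    return f"POINT ({lng} {lat})"


def linestring_wkt(coords):
    """Format a sequence of (x, y) pairs as a WKT line string."""
    body = ", ".join(f"{x} {y}" for x, y in coords)
    return f"LINESTRING ({body})"


def polygon_wkt(ring):
    """Format a single-ring polygon; WKT rings repeat the first vertex at the end."""
    closed = list(ring)
    if closed[0] != closed[-1]:
        closed.append(closed[0])
    body = ", ".join(f"{x} {y}" for x, y in closed)
    return f"POLYGON (({body}))"


print(point_wkt(30, 10))
print(linestring_wkt([(30, 10), (10, 30), (40, 40)]))
print(polygon_wkt([(0, 0), (4, 0), (4, 4), (0, 4)]))
```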
Multi-Geometry Types
• Multi-* – a collection of geometries of the same type
CC BY-SA 3.0 https://en.wikipedia.org/wiki/Well-known_text
GeometryCollection
• A collection of geometries of different types
• Used to capture the result of an operation, e.g. intersection, difference, etc.
• Example: the intersection of LINESTRING (…) and POLYGON (…) can be GEOMETRYCOLLECTION (LINESTRING (…), POINT (…))
Geospatial Functions
• ISO Standard - SQL/MM Part 3
• MM – multimedia
• Part 3 Spatial
• ST_ prefix (S – spatial, T – temporal)
• https://prestodb.io/docs/current/functions/geospatial.html
WKT-to-Geometry
• To Geometry
• ST_GeometryFromText(wkt)
• ST_Point(x, y)
• ST_Point(longitude, latitude)
• To WKT
• ST_AsText
Operations
• Inputs (and outputs) are geometry objects, not WKT strings
• ST_Contains(g1, g2)
• ST_Intersects(g1, g2)
• ST_Distance(g1, g2) (*)
• ST_Area(g) (*)
• ST_Length(g) (*)
• ST_Intersection(g1, g2)
• ST_ConvexHull(g)
• ST_Union(g1, g2)
• ST_Centroid(g)
• ST_Envelope(g)
(*) Computation is done on the Euclidean plane, in the units of the input geometries
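As an illustration of what "computed on the Euclidean plane in the units of the input" means, the following Python sketch computes planar distance, length and area directly from coordinates. It is not Presto's implementation; the function names are made up for this example. If the inputs are longitude/latitude pairs, the results come out in degrees (or square degrees), not kilometers.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two (x, y) points, in input units."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def path_length(coords):
    """Sum of segment lengths along a line string, in input units."""
    return sum(euclidean_distance(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Shoelace formula for a simple polygon ring, in squared input units."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(euclidean_distance((0, 0), (3, 4)))
print(path_length([(0, 0), (3, 4), (3, 10)]))
print(polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)]))
```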
Spatial Join
• ST_Contains, ST_Intersects and ST_Distance
• R-Tree index for the build side
CC BY-SA 3.0 https://en.wikipedia.org/wiki/R-tree
SELECT *
FROM points, polygons
WHERE ST_Contains(ST_GeometryFromText(wkt), ST_Point(lng, lat))
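The query above can be pictured with a small Python sketch of a point-in-polygon join. This is a simplification under stated assumptions: polygons are single rings without holes, and a linear scan over precomputed bounding boxes stands in for the R-Tree lookup that the slide describes for the build side. All names are hypothetical.

```python
def bounding_box(ring):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a ring."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def contains(ring, point):
    """Ray casting: toggle on each edge crossed by a ray going right from point."""
    px, py = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def spatial_join(points, polygons):
    """Yield (point, polygon_name) pairs where the polygon contains the point."""
    # Build side: polygons with their bounding boxes (an R-Tree would index these).
    build = [(name, ring, bounding_box(ring)) for name, ring in polygons]
    # Probe side: cheap bounding-box check first, exact test only on candidates.
    for p in points:
        for name, ring, (xmin, ymin, xmax, ymax) in build:
            if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax and contains(ring, p):
                yield p, name


polygons = [("a", [(0, 0), (2, 0), (2, 2), (0, 2)]),
            ("b", [(5, 5), (7, 5), (7, 7), (5, 7)])]
points = [(1, 1), (6, 6), (3, 3)]
print(list(spatial_join(points, polygons)))
```

The bounding-box check is what an index accelerates: it discards most polygons without running the exact containment test.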
Spatial Join Types
• Inner join
• Left join enables scalar correlated subqueries
SELECT (
    SELECT arbitrary(name)
    FROM polygons
    WHERE ST_Contains(polygon, ST_Point(lng, lat))
)
FROM points
Distance Query
• Logically equivalent to ST_Contains(circle(b.point, radius), a.point)
• Radius can be a constant value or an expression using symbols from b
• A lot more efficient than ST_Contains(ST_Buffer(b.point, radius), a.point)
• What about the units?
SELECT * FROM a, b
WHERE ST_Distance(a.point, b.point) <= radius
Angular units
• 1 degree of latitude =~ 111.321 km and stays constant
• 1 degree of longitude =~ 111.321 km * cos(latitude)
• ST_Distance, ST_Area, ST_Length return results in angular units
• Within small areas, multiply by 111.321 km * cos(radians(ST_Y(ST_Centroid(ST_Envelope(g1)))))
• ST_Y(ST_Centroid(ST_Envelope(g1))) is the latitude at the center of the bounding box of g1
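The conversion factor can be checked numerically. The sketch below uses the slide's figure of 111.321 km per degree; the function name is hypothetical.

```python
import math

KM_PER_DEGREE = 111.321  # figure quoted on the slide


def km_per_degree_longitude(latitude_deg):
    """Approximate length of one degree of longitude at the given latitude."""
    return KM_PER_DEGREE * math.cos(math.radians(latitude_deg))


for lat in (0, 45, 60, 70):
    print(lat, round(km_per_degree_longitude(lat), 3))
```

A degree of longitude shrinks to about half its equatorial length at 60° latitude, which is why a single degrees-to-kilometers constant only works within a small area.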
Distance Query in km: Step 1
• ST_Distance(center, p) <= r / 111.321
• For r = 1
• Circle of 1 km near equator
• Ellipse with minor axis along the longitude and a smaller diameter of 0.34 km at 70° latitude
Distance Query in km: Step 2
• ST_Distance(center, p) <= r / (111.321 * cos(radians(center.latitude)))
• Ellipse with minor axis fixed at r km
• Major axis starting at r km near the equator and growing to 3r at 70° latitude
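The figures quoted in Steps 1 and 2 follow from the same cosine factor. The Python sketch below converts each degree threshold back to kilometers along the north-south and east-west directions, using the slide's approximation of 111.321 km per degree for both. Function names are hypothetical.

```python
import math

KM_PER_DEGREE = 111.321  # figure quoted on the slides


def step1_extent_km(r_km, latitude_deg):
    """Step 1 threshold r / 111.321 degrees, expressed in km (north-south, east-west)."""
    threshold_deg = r_km / KM_PER_DEGREE
    cos_lat = math.cos(math.radians(latitude_deg))
    return threshold_deg * KM_PER_DEGREE, threshold_deg * KM_PER_DEGREE * cos_lat


def step2_extent_km(r_km, latitude_deg):
    """Step 2 threshold r / (111.321 * cos(lat)) degrees, expressed in km."""
    cos_lat = math.cos(math.radians(latitude_deg))
    threshold_deg = r_km / (KM_PER_DEGREE * cos_lat)
    return threshold_deg * KM_PER_DEGREE, threshold_deg * KM_PER_DEGREE * cos_lat


print(step1_extent_km(1.0, 70.0))  # east-west extent shrinks with cos(latitude)
print(step2_extent_km(1.0, 70.0))  # east-west extent is r; north-south grows by 1 / cos(latitude)
```

Step 1 under-covers in the east-west direction at high latitudes, while Step 2 covers the full radius east-west at the cost of over-covering north-south, so a later refinement step is needed.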
Distance Query in km: Step 3
SELECT *
FROM a, b
WHERE ST_Distance(ST_Point(a.lng, a.lat), ST_Point(b.lng, b.lat)) <=
radius_km / (111.321 * cos(radians(b.lat)))
AND great_circle_distance(a.lat, a.lng, b.lat, b.lng) <= radius_km
• Divide the radius by 111.321 * cos(latitude)
• Refine spatial join results using great_circle_distance
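The two-stage filter in the query can be sketched in Python as a coarse planar check followed by an exact spherical check. The haversine formula is used here as one common way to compute a great-circle distance; the Earth radius value is an assumption of this sketch, and the function names are hypothetical rather than Presto's implementation.

```python
import math

EARTH_RADIUS_KM = 6371.01  # assumed mean Earth radius for this sketch
KM_PER_DEGREE = 111.321    # figure quoted on the slides


def great_circle_distance_km(lat1, lng1, lat2, lng2):
    """Haversine distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(a, b, radius_km):
    """a and b are (lat, lng); coarse planar filter in degrees, then exact refine."""
    cos_lat = math.cos(math.radians(b[0]))
    planar_deg = math.hypot(a[1] - b[1], a[0] - b[0])
    if planar_deg > radius_km / (KM_PER_DEGREE * cos_lat):
        return False  # rejected by the cheap, index-friendly predicate
    return great_circle_distance_km(a[0], a[1], b[0], b[1]) <= radius_km


print(within_radius((70.0, 0.02), (70.0, 0.0), 1.0))  # east-west neighbor
print(within_radius((70.02, 0.0), (70.0, 0.0), 1.0))  # passes coarse filter, fails refine
```

The second call shows why the refinement matters: at 70° latitude the coarse predicate admits points that are more than 2 km away to the north.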
Bing Tiles
© 2018 Microsoft https://msdn.microsoft.com/en-us/library/bb259689.aspx.
Bing Tile Functions
• bing_tile_at(latitude, longitude, zoom_level)
• bing_tiles_around(latitude, longitude, zoom_level)
• geometry_to_bing_tiles(geometry, zoom_level)
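To show what a tile lookup involves, here is a Python sketch of the latitude/longitude to tile mapping as described in Microsoft's published Bing Maps Tile System article. It is an illustration only: Presto's bing_tile_at may differ in details such as rounding, and the function names below are hypothetical.

```python
import math

MIN_LAT, MAX_LAT = -85.05112878, 85.05112878  # latitude limits of the Mercator map
TILE_PIXELS = 256


def clip(value, low, high):
    return min(max(value, low), high)


def bing_tile_xy(latitude, longitude, zoom_level):
    """Return the (tile_x, tile_y) of the tile covering the point at zoom_level."""
    latitude = clip(latitude, MIN_LAT, MAX_LAT)
    longitude = clip(longitude, -180.0, 180.0)
    map_size = TILE_PIXELS << zoom_level  # map width and height in pixels
    x = (longitude + 180.0) / 360.0
    sin_lat = math.sin(math.radians(latitude))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    pixel_x = int(clip(x * map_size + 0.5, 0, map_size - 1))
    pixel_y = int(clip(y * map_size + 0.5, 0, map_size - 1))
    return pixel_x // TILE_PIXELS, pixel_y // TILE_PIXELS


def quadkey(tile_x, tile_y, zoom_level):
    """Interleave the bits of tile_x and tile_y into a base-4 string."""
    digits = []
    for i in range(zoom_level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


tx, ty = bing_tile_xy(47.6, -122.3, 14)
print(tx, ty, quadkey(tx, ty, 14))
```

Because a tile is identified by a pair of integers (or one quadkey string), tile membership can be compared with plain equality, which is what makes it usable as an ordinary join key.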
Distance Query using Bing Tiles
• Choose zoom level based on radius
• tile width >= radius
• Refine join results using great_circle_distance
SELECT *
FROM a, (
SELECT * FROM b
CROSS JOIN UNNEST (bing_tiles_around(lat, lng, 14)) as t(tile)
) b
WHERE bing_tile_at(a.lat, a.lng, 14) = b.tile
AND great_circle_distance(a.lat, a.lng, b.lat, b.lng) <= radius_km
How Large are Bing Tiles?
• Tile size depends on zoom level and latitude
• Smaller tiles at larger zoom levels and near the poles
Chart: tile width in kilometers
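The dependence of tile width on zoom level and latitude can be estimated from the ground-resolution formula in Microsoft's Bing Maps Tile System article. The sketch below assumes the WGS84 equatorial radius used there; the helper that picks a zoom level for a given radius is a hypothetical illustration of the "tile width >= radius" rule.

```python
import math

EARTH_RADIUS_KM = 6378.137  # WGS84 equatorial radius, as in the Bing Maps tile system


def tile_width_km(latitude_deg, zoom_level):
    """Approximate east-west ground width of one tile at the given latitude."""
    circumference_km = 2 * math.pi * EARTH_RADIUS_KM
    return math.cos(math.radians(latitude_deg)) * circumference_km / (2 ** zoom_level)


def max_zoom_for_radius(latitude_deg, radius_km, max_zoom=23):
    """Largest zoom level whose tile width is still at least radius_km."""
    for zoom in range(max_zoom, 0, -1):
        if tile_width_km(latitude_deg, zoom) >= radius_km:
            return zoom
    return 1  # fall back to the coarsest level for very large radii


for lat in (0, 45, 70):
    print(lat, round(tile_width_km(lat, 14), 3))
print(max_zoom_for_radius(0, 1.0), max_zoom_for_radius(60, 1.0))
```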
Questions?
Spatial Join
• Spatial joins are similar to Hash joins
• Hash-based partitioning -> Spatial partitioning
• Hash table -> Spatial Index (R-Tree)
• Broadcast spatial join requires only spatial index
SELECT *
FROM polygons, points
WHERE ST_Contains(ST_GeometryFromText(wkt), ST_Point(lng, lat))
CC BY-SA 3.0 https://en.wikipedia.org/wiki/R-tree
Spatial Partitioning
• Overall extent is split into non-overlapping rectangles
• KDB-Tree (K = 2)
• Total number of records, overall extent and a sample of the data are needed to compute the partitioning scheme
• Some records may go into multiple partitions
• Polygons may intersect multiple rectangles
• Efficient inline de-dup technique is necessary
• Reference point of the intersection of bounding boxes
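The partitioning idea can be sketched in Python with a simplified k-d style median split over a data sample. This is an assumption-laden stand-in for a real KDB-Tree, meant only to show how non-overlapping rectangles arise from a sample and how one bounding box can land in several partitions. All names are hypothetical.

```python
def build_partitions(extent, sample, max_per_leaf, depth=0):
    """Split extent (xmin, ymin, xmax, ymax) at sample medians, alternating axes."""
    if len(sample) <= max_per_leaf:
        return [extent]
    axis = depth % 2  # 0 splits on x, 1 splits on y
    ordered = sorted(sample, key=lambda p: p[axis])
    split = ordered[len(ordered) // 2][axis]
    left = [p for p in ordered if p[axis] < split]
    right = [p for p in ordered if p[axis] >= split]
    if not left or not right:
        return [extent]  # sample cannot be separated on this axis; stop here
    xmin, ymin, xmax, ymax = extent
    if axis == 0:
        low, high = (xmin, ymin, split, ymax), (split, ymin, xmax, ymax)
    else:
        low, high = (xmin, ymin, xmax, split), (xmin, split, xmax, ymax)
    return (build_partitions(low, left, max_per_leaf, depth + 1)
            + build_partitions(high, right, max_per_leaf, depth + 1))


def partitions_for(bbox, partitions):
    """Indexes of all partitions (treated as half-open) that a bounding box touches."""
    bxmin, bymin, bxmax, bymax = bbox
    return [i for i, (xmin, ymin, xmax, ymax) in enumerate(partitions)
            if bxmin < xmax and bxmax >= xmin and bymin < ymax and bymax >= ymin]


parts = build_partitions((0, 0, 8, 8), [(1, 1), (2, 5), (6, 2), (7, 7)], max_per_leaf=1)
print(parts)
print(partitions_for((5, 4, 7, 6), parts))  # one polygon's box spans several partitions
```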
Inline Deduplication
• Some shapes intersect multiple partitions
• Only one partition contains a reference point
• Lower left corner of the intersection of bounding boxes
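A minimal Python sketch of the reference-point rule follows. It assumes partitions are treated as half-open rectangles so that any point inside the overall extent belongs to exactly one of them; the function names are hypothetical.

```python
def reference_point(box_a, box_b):
    """Lower-left corner of the intersection of two boxes, or None if disjoint."""
    xmin = max(box_a[0], box_b[0])
    ymin = max(box_a[1], box_b[1])
    xmax = min(box_a[2], box_b[2])
    ymax = min(box_a[3], box_b[3])
    if xmin > xmax or ymin > ymax:
        return None
    return xmin, ymin


def partition_owns(partition, point):
    """Half-open containment: each point falls into exactly one partition."""
    xmin, ymin, xmax, ymax = partition
    x, y = point
    return xmin <= x < xmax and ymin <= y < ymax


def should_emit(partition, box_a, box_b):
    """Emit a matching pair only from the partition that owns its reference point."""
    ref = reference_point(box_a, box_b)
    return ref is not None and partition_owns(partition, ref)


partitions = [(0, 0, 5, 10), (5, 0, 10, 10)]
box_a, box_b = (3, 3, 7, 7), (4, 4, 8, 8)  # both boxes span both partitions
print([should_emit(p, box_a, box_b) for p in partitions])
```

Both partitions see the pair, but only one reports it, so no separate de-duplication pass over the join output is required.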

Presto Summit 2018 - 06 - Facebook Geospatial
