Implementing an Open Source
Spatiotemporal Search Platform for
Spatial Data Infrastructures
OGRS 2016, Perugia, Italy - 10/13/2016
Paolo Corti [1], Benjamin Lewis [1], Athanasios Tom Kralidis [2],
Ntabathia Jude Mwenda [1]
[1] Harvard Center for Geographic Analysis
[2] Open Source Geospatial Foundation
HHypermap (Harvard Hypermap)
● With funding from the National Endowment for the
Humanities, the Harvard Center for Geographic
Analysis (CGA) developed HHypermap, a map
services registry and search platform
● HHypermap was developed in the process of
re-engineering the search component of CGA’s public
domain SDI (WorldMap http://worldmap.harvard.edu),
based on GeoNode
● It is built on an open source software stack
A brief history
WorldMap was developed on GeoNode 1.2 and released in 2012 as a public space for
scholars and the public to upload and share spatial data.
Within a year WorldMap had 12,000 datasets and 8,000 users.
Finding data became difficult and demand grew for being able to bring in and save
map service layers from other servers.
The CGA proposed to the NEH to build a system for creating and maintaining a
comprehensive registry of WMS and Esri REST map services that would plug into
WorldMap.
Being able to search for data by time and space as well as by keyword was a priority
given the user base.
Note on uptake
Since the release of HHypermap on GitHub in April, a U.S. federal agency has adopted
it and Boundless is using it within its flagship platform Boundless Exchange.
The geo-visualization capabilities we developed (2D faceting in Lucene) are highly
scalable and being used to build a “Billion Object Platform” to enable interactive
exploration of a billion spatio-temporal objects.
Glossary
● Spatial Data Infrastructure (SDI)
● Catalogue Service for the Web (CSW)
● Search Engine
Spatial Data Infrastructure (SDI)
A Spatial Data Infrastructure
(SDI) is a framework of
geospatial data, metadata, users
and tools intended to provide
an efficient and flexible way to
use spatial information
Catalogue Service for the Web (CSW)
● One of the key software components of an SDI is the catalogue service
which is needed to discover, query, and manage the metadata
● Catalogue services in an SDI are typically based on the Open Geospatial
Consortium (OGC) Catalogue Service for the Web (CSW) standard
which defines common interfaces for accessing the metadata information
● Notable implementations: pycsw, GeoNetwork
Search Engine
● A search engine is a software system capable of supporting fast and
reliable search
● It provides features such as full text search, natural language processing,
weighted results, fuzzy tolerance results, faceting, hit highlighting
● Highly scalable and replicable architecture
● Notable implementations: Solr and Elasticsearch, both based on Apache
Lucene
Goals of HHypermap
● Provide a framework for building and maintaining a comprehensive registry of
web map services
● Support modern search capabilities such as spatial and temporal faceting and
instant previews via an open API
● Behind the scenes HHypermap scalably harvests OGC and Esri service metadata
from distributed servers, organizes that information, and pushes it to a search
engine
● Monitor services and layers for reliability and use to improve results ranking
● End users can search the SDI metadata using standard interfaces provided by the
internal CSW catalogue, and benefit from advanced search capabilities provided
by a more full featured, RESTful API
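The monitoring goal above can be illustrated with a small sketch. The slides do not describe how HHypermap actually folds reliability into ranking, so the formula here (text score multiplied by the fraction of successful checks) and the `LayerHit` fields are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LayerHit:
    # Hypothetical record: field names are illustrative, not HHypermap's schema.
    name: str
    text_score: float   # relevance score returned by the search engine
    checks_ok: int      # number of successful monitoring checks
    checks_total: int   # number of monitoring checks performed


def reliability(hit: LayerHit) -> float:
    """Fraction of successful checks; 0.0 when the layer was never checked."""
    if hit.checks_total == 0:
        return 0.0
    return hit.checks_ok / hit.checks_total


def rank(hits):
    """Order hits by text score weighted by reliability (assumed formula)."""
    return sorted(hits, key=lambda h: h.text_score * reliability(h), reverse=True)


hits = [
    LayerHit("roads_a", text_score=2.0, checks_ok=5, checks_total=10),
    LayerHit("roads_b", text_score=1.5, checks_ok=9, checks_total=10),
]
ranked = [h.name for h in rank(hits)]
```

With these sample values the layer with the better text match but poorer uptime drops below the more reliable one.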
Catalogue Service for the Web
● The OGC Catalogue Service for the Web (CSW) standard specifies the interfaces
and bindings, as well as a framework for defining the application profiles required
to publish and access digital catalogues of metadata for geospatial data and
services
● Based on the Dublin Core metadata information model, CSW supports broad
interoperability around discovering geospatial data and services spatially,
non-spatially, temporally, and via keywords or freetext
● CSW supports application profiles which allow for information communities to
constrain and/or extend the CSW specification to satisfy specific discovery
requirements and to realize tighter coupling and integration of geospatial data
and services
Limitations of CSW
CSW provides numerous benefits to SDIs, but both the standard and its server
implementations could be enhanced by adding standard search engine features.
Some examples:
● Faceted search
● JSON representation (vs XML in CSW)
● Simplified query interface (CSW, being based on XML, can quickly become
complex)
● Text stemming (ability to detect words derived from a common root)
● Highly scalable and replicable architecture
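To make the "simplified query interface" point concrete, the sketch below builds the same free-text search twice: once as a CSW 2.0.2 GetRecords XML body, and once as a key-value query string of the kind a Solr-style JSON endpoint accepts. The base URL passed to `simple_search_url` is a placeholder, and the helper names are illustrative.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

CSW = "http://www.opengis.net/cat/csw/2.0.2"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("csw", CSW)
ET.register_namespace("ogc", OGC)


def csw_getrecords(term: str) -> str:
    """Build a CSW GetRecords request that matches `term` in any text field."""
    root = ET.Element(
        f"{{{CSW}}}GetRecords",
        {"service": "CSW", "version": "2.0.2", "resultType": "results"},
    )
    query = ET.SubElement(root, f"{{{CSW}}}Query", {"typeNames": "csw:Record"})
    ET.SubElement(query, f"{{{CSW}}}ElementSetName").text = "brief"
    constraint = ET.SubElement(query, f"{{{CSW}}}Constraint", {"version": "1.1.0"})
    flt = ET.SubElement(constraint, f"{{{OGC}}}Filter")
    like = ET.SubElement(
        flt,
        f"{{{OGC}}}PropertyIsLike",
        {"wildCard": "%", "singleChar": "_", "escapeChar": "\\"},
    )
    ET.SubElement(like, f"{{{OGC}}}PropertyName").text = "csw:AnyText"
    ET.SubElement(like, f"{{{OGC}}}Literal").text = f"%{term}%"
    return ET.tostring(root, encoding="unicode")


def simple_search_url(base_url: str, term: str) -> str:
    """Build the equivalent key-value query (q, rows, wt are Solr parameters)."""
    return base_url + "?" + urlencode({"q": term, "rows": 10, "wt": "json"})


xml_body = csw_getrecords("roads")
url = simple_search_url("http://localhost:8983/solr/layers/select", "roads")
```

The XML request needs two namespaces and five nested elements for a single keyword; the query string carries the same intent in one line.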
The Need for Search Engines in Spatial Data
Infrastructure
● Numerous types of web applications, such as CMSs, wikis, and data delivery
frameworks, all benefit from improved data discovery
● In the last few years, these applications have delegated the task of search
optimization to specific frameworks known as search engines
● Rather than implementing custom search logic, these platforms now often add a
search engine to the stack to improve search
● Apache Solr and Elasticsearch, two popular open source search engine
platforms, both based on Apache Lucene, can now be part of a typical CMS
stack to support complex search criteria, faceting, result highlighting, query
spellcheck, relevance tuning and more
● As with CMSs, SDI search can benefit dramatically when paired with these platforms
How a search engine works
Two distinct phases:
● Indexing: all of the documents (metadata, in the SDI context) that must be
searched are scanned, and a list of search terms (an index) is built. For each search
term, the index keeps track only of the identifiers of the documents that contain
the search term
● Searching: only the index is looked at, and a list of the documents containing the
given search term is quickly returned to the client. This indexed approach makes a
search engine extremely fast in outputting results
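The two phases can be sketched with a toy inverted index. This is a minimal illustration using whitespace tokenization and made-up metadata titles; engines such as Lucene store far richer structures (positions, frequencies, compressed postings).

```python
from collections import defaultdict


def build_index(docs):
    """Indexing phase: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Searching phase: consult only the index, never the documents."""
    return sorted(index.get(term.lower(), set()))


# Made-up metadata titles keyed by document id.
docs = {
    1: "Roads of Boston",
    2: "Boston census tracts",
    3: "Rivers of Italy",
}
index = build_index(docs)
boston_hits = search(index, "Boston")
```

A lookup is a single dictionary access regardless of how many documents were indexed, which is where the speed comes from.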
Benefits from search engine frameworks
● Very fast, thanks to the indexing mechanism
● Handling the ambiguities of natural languages, with stop words (words filtered out
during the processing of text), stemming (ability to detect words derived from a
common root), synonym detection, and controlled vocabularies such as thesauri and
taxonomies
● Phrase searches and proximity searches (search for a phrase containing two different
words separated by a specified number of words)
● Weighted results
● Handling regular expressions, wildcard search, and fuzzy search to provide results for a
given term and its common variations
● Support for boolean queries (AND, OR, NOT)
● Hit highlighting
● Highly scalable and replicable
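A few of these features can be imitated in plain Python to show what they do. The sketch below is deliberately crude: the stop-word list is tiny, `crude_stem` is a naive suffix stripper rather than a real stemming algorithm, and `difflib`'s similarity ratio stands in for the edit-distance matching that search engines typically use for fuzzy search.

```python
import difflib

STOP_WORDS = {"the", "of", "and", "a", "in"}  # illustrative, not a real list


def crude_stem(word):
    """Strip a few common English suffixes (a toy, not a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Lowercase, drop stop words, and stem the remaining tokens."""
    tokens = text.lower().split()
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def fuzzy_lookup(term, vocabulary, cutoff=0.8):
    """Return indexed terms that are close to a possibly misspelled term."""
    return difflib.get_close_matches(term, vocabulary, n=3, cutoff=cutoff)


terms = analyze("The mapping of roads and rivers")
suggestions = fuzzy_lookup("bostn", ["boston", "roads", "rivers"])
```

Because the same analysis runs at index time and at query time, a search for "road" matches a document that only mentions "roads".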
Faceted search
● Faceting is the arrangement of search results in
categories based on indexed terms
● This capability makes it possible, for example, to
provide an immediate indication of the number of
times that common keywords are contained in
different metadata documents
● A typical use case for SDI is with metadata
categories, keywords and regions
● Faceting without a search engine is generally
computationally expensive in normalized relational
structures (many queries in an RDBMS)
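As a minimal sketch of what a facet response contains, the snippet below counts how many metadata records fall under each category and keyword. The records and field names are made up for illustration; a search engine derives these counts from its index rather than by scanning records.

```python
from collections import Counter

# Made-up metadata records; field names are illustrative.
records = [
    {"title": "Boston roads", "category": "transportation", "keywords": ["roads", "boston"]},
    {"title": "Italy rivers", "category": "hydrology", "keywords": ["rivers", "italy"]},
    {"title": "Italy roads", "category": "transportation", "keywords": ["roads", "italy"]},
]


def facet_counts(records, field):
    """Count records per value of `field` (single-valued or list-valued)."""
    counts = Counter()
    for rec in records:
        value = rec.get(field, [])
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts


by_category = facet_counts(records, "category")
by_keyword = facet_counts(records, "keywords")
```

A user interface would render `by_category` as a list of clickable filters, each labeled with its count.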
Temporal faceting
● Search engines can also support
temporal and spatial faceting, two
features that are extremely useful for
browsing large collections of
geospatial metadata.
● Temporal faceting can display the
number of metadata documents in an
SDI by date range as a kind of
histogram
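The histogram idea can be sketched by bucketing document dates into fixed-width ranges, which is roughly what a range facet over a date field returns. The dates and the `gap_years` parameter below are illustrative assumptions.

```python
from collections import Counter
from datetime import date


def temporal_facets(dates, gap_years=10):
    """Count dates per bucket of `gap_years` years, keyed by bucket start year."""
    counts = Counter()
    for d in dates:
        start = (d.year // gap_years) * gap_years
        counts[start] += 1
    return sorted(counts.items())


# Made-up temporal extents for five layers.
layer_dates = [
    date(1851, 1, 1),
    date(1855, 6, 1),
    date(1902, 3, 15),
    date(1999, 12, 31),
    date(2004, 7, 4),
]
histogram = temporal_facets(layer_dates, gap_years=50)
```

Each `(start_year, count)` pair becomes one bar of the histogram; narrowing the gap gives finer bars as the user zooms into a time range.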
Spatial faceting
● Spatial faceting can provide a
spatial surface representing
the distribution of layers or
features across an area of
interest
● In this example a heatmap is
generated by spatial faceting
to show the distribution of
layers in the WorldMap SDI
for a given geographic region
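A heatmap of this kind boils down to counting items per cell of a grid laid over the area of interest. The sketch below counts layer center points in a small grid; it is a simplification, since a search engine computes such counts from indexed shapes rather than from raw points, and the coordinates are made up.

```python
def heatmap(points, bbox, rows, cols):
    """Count (x, y) points per grid cell; row 0 is the northernmost row."""
    min_x, min_y, max_x, max_y = bbox
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # ignore points outside the area of interest
        col = min(int((x - min_x) / (max_x - min_x) * cols), cols - 1)
        row = min(int((max_y - y) / (max_y - min_y) * rows), rows - 1)
        grid[row][col] += 1
    return grid


# Made-up layer centers as (longitude, latitude); the last one is out of range.
centers = [(-71.1, 42.4), (-71.0, 42.3), (12.4, 43.1), (150.0, -33.9), (200.0, 0.0)]
grid = heatmap(centers, bbox=(-180, -90, 180, 90), rows=2, cols=4)
```

The client only receives the grid of counts, so the response size depends on the grid resolution and not on the number of matching layers.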
HHypermap: an SDI search engine based on Free and
Open Source Software
● HHypermap is an application that manages OGC web services (such as WMS,
WMTS), and Esri REST endpoints
● It handles map service crawling and harvesting, and gathers uptime statistics
for remote services and layers
● The aim of HHypermap is to provide a more effective search experience to
WorldMap users and also for users outside WorldMap
● WorldMap is an open source mapping platform, based on GeoNode, developed
by the CGA to lower the barrier for scholars who wish to explore, visualize, edit
and publish geospatial information
HHypermap Architecture
Built on open source software: Celery, RabbitMQ, Django, Lucene (Solr,
Elasticsearch), MapProxy, Memcached, OWSLib, PostgreSQL, PostGIS, pycsw
Future Work
● While the CSW 3.0.0 standard provides improvements to address mass
market search/discovery, the benefits of search engine implementations
combined with broad interoperability of the CSW standard present
opportunities to integrate the CSW standard with search engine
methodologies
● The authors hope that such an approach will become formalized as a
CSW Application Profile or Best Practice in order to achieve maximum
benefit and adoption in SDI activities
● This will allow CSW implementations to make better use of search engine
methodologies for improving the user search experience in SDI
workflows
Conclusion
HHypermap aims to provide a FOSS solution using modern
approaches to realize a highly scalable, flexible and robust geospatial
registry and catalogue/search platform while achieving broad
interoperability via open standards
References
● Harvard University CGA: http://gis.harvard.edu/
● WorldMap: http://worldmap.harvard.edu/
● Harvard Hypermap public registry: http://hh.worldmap.harvard.edu/
● HHypermap code repository: https://github.com/cga-harvard/HHypermap

Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 

Implementing an Open Source Spatiotemporal Search Platform for Spatial Data Infrastructures

  • 1. Implementing an Open Source Spatiotemporal Search Platform for Spatial Data Infrastructures OGRS 2016, Perugia, Italy - 10/13/2016 Paolo Corti [1] , Benjamin Lewis [1] , Athanasios Tom Kralidis [2] , Ntabathia Jude Mwenda [1] [1] Harvard Center for Geographic Analysis [2] Open Source Geospatial Foundation
  • 2. HHypermap (Harvard Hypermap) ● With funding from the National Endowment for the Humanities, the Harvard Center for Geographic Analysis (CGA) developed HHypermap, a map services registry and search platform ● HHypermap was developed in the process of re-engineering the search component of CGA’s public domain SDI (WorldMap http://worldmap.harvard.edu), based on GeoNode ● It is built on an open source software stack
  • 3. A brief history WorldMap was developed on GeoNode 1.2 and released in 2012 as a public space for scholars and the public to upload and share spatial data. Within a year WorldMap had 12,000 datasets and 8,000 users. Finding data became difficult, and demand grew for the ability to bring in and save map service layers from other servers. The CGA proposed to NEH a system for building and maintaining a comprehensive registry of WMS and Esri REST map services that would plug into WorldMap. Being able to search for data by time and space as well as by keyword was a priority given the user base.
  • 4. Note on uptake Since the release of HHypermap on GitHub in April, a U.S. federal agency has adopted it and Boundless is using it within its flagship platform Boundless Exchange. The geo-visualization capabilities we developed (2D faceting in Lucene) are highly scalable and are being used to build a “Billion Object Platform” to enable interactive exploration of a billion spatio-temporal objects.
  • 5. Glossary ● Spatial Data Infrastructure (SDI) ● Catalogue Service for the Web (CSW) ● Search Engine
  • 6. Spatial Data Infrastructure (SDI) A Spatial Data Infrastructure (SDI) is a framework of geospatial data, metadata, users and tools intended to provide an efficient and flexible way to use spatial information
  • 7. Catalogue Service for the Web (CSW) ● One of the key software components of an SDI is the catalogue service which is needed to discover, query, and manage the metadata ● Catalogue services in an SDI are typically based on the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard which defines common interfaces for accessing the metadata information ● Notable implementations: pycsw, GeoNetwork
  • 8. Search Engine ● A search engine is a software system capable of supporting fast and reliable search ● It provides features such as full text search, natural language processing, weighted results, fuzzy tolerance results, faceting, hit highlighting ● Highly scalable and replicable architecture ● Notable implementations: Solr and Elasticsearch, both based on Apache Lucene
  • 9. Goals of HHypermap ● Provide a framework for building and maintaining a comprehensive registry of web map services ● Support modern search capabilities such as spatial and temporal faceting and instant previews via an open API ● Behind the scenes, HHypermap scalably harvests OGC and Esri service metadata from distributed servers, organizes that information, and pushes it to a search engine ● Monitor services and layers for reliability and usage in order to improve the ranking of results ● End users can search the SDI metadata using standard interfaces provided by the internal CSW catalogue, and benefit from advanced search capabilities provided by a more fully featured, RESTful API
  • 10. Catalogue Service for the Web ● The OGC Catalogue Service for the Web (CSW) standard specifies the interfaces and bindings, as well as a framework for defining the application profiles required to publish and access digital catalogues of metadata for geospatial data and services ● Based on the Dublin Core metadata information model, CSW supports broad interoperability around discovering geospatial data and services spatially, non-spatially, temporally, and via keywords or free text ● CSW supports application profiles, which allow information communities to constrain and/or extend the CSW specification to satisfy specific discovery requirements and to realize tighter coupling and integration of geospatial data and services
  • 11. Limitations of CSW CSW provides numerous benefits to SDIs, but both the standard and its server implementations could be enhanced by adding standard search engine features. Some examples: ● Faceted search ● JSON representation (vs XML in CSW) ● Simplified query interface (CSW, being based on XML, can quickly become complex) ● Text stemming (ability to detect words derived from a common root) ● Highly scalable and replicable architecture
  • 12. The Need for Search Engines in Spatial Data Infrastructure ● Many types of web applications, such as CMSs, wikis, and data delivery frameworks, benefit from improved data discovery ● In the last few years, these applications have delegated the task of search optimization to specific frameworks known as search engines ● Rather than implementing custom search logic, these platforms now often add a search engine to the stack to improve search ● Apache Solr and Elasticsearch, two popular open source search engine web platforms, both based on Apache Lucene, can now be part of a typical CMS stack to support complex search criteria, faceting, result highlighting, query spellcheck, relevance tuning and more ● As with CMSs, SDI search can benefit dramatically when paired with these platforms
  • 13. How a search engine works Two distinct phases: ● Indexing: all of the documents (metadata, in the SDI context) that must be searched are scanned, and a list of search terms (an index) is built. For each search term, the index keeps track only of the identifiers of the documents that contain the search term ● Searching: only the index is looked at, and a list of the documents containing the given search term is quickly returned to the client. This indexed approach makes a search engine extremely fast in outputting results
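The two phases can be illustrated with a minimal inverted index in plain Python. This is only a sketch of the general technique, not how Solr, Elasticsearch or HHypermap implement it; the metadata records and the function names `build_index` and `search` are invented for illustration.

```python
from collections import defaultdict

def build_index(documents):
    """Indexing phase: map each term to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, term):
    """Searching phase: only the index is consulted, never the documents."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical metadata titles, invented for this example
documents = {
    "layer-1": "Roads of Boston",
    "layer-2": "Population density of Boston",
    "layer-3": "Roads of Perugia",
}
index = build_index(documents)
print(search(index, "Boston"))
```

Because the index stores only document identifiers per term, a lookup costs roughly one dictionary access regardless of how long the original documents are.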
  • 14. Benefits from search engine frameworks ● Very fast, thanks to the indexing mechanism ● Handling the ambiguities of natural languages, with stop words (words filtered out during the processing of text), stemming (ability to detect words derived from a common root), synonym detection, and controlled vocabularies such as thesauri and taxonomies ● Phrase searches and proximity searches (search for a phrase containing two different words separated by a specified number of words) ● Weighted results ● Handling regular expressions, wildcard search, and fuzzy search to provide results for a given term and its common variations ● Support for boolean queries (AND, OR, NOT) ● Hit highlighting ● Highly scalable and replicable
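A few of these features (stop words, stemming, boolean queries) can be sketched on top of a toy index. The "stemmer" below merely strips a trailing "s" and stands in for the real analyzers a search engine provides; the stop word list, the records and all function names are assumptions made for the example.

```python
STOP_WORDS = {"a", "an", "and", "in", "of", "the"}

def analyze(text):
    """Lowercase, drop stop words, and apply a crude plural-stripping 'stemmer'."""
    terms = []
    for token in text.lower().split():
        if token in STOP_WORDS:
            continue
        if token.endswith("s") and len(token) > 3:
            token = token[:-1]  # naive stand-in for real stemming
        terms.append(token)
    return terms

def build_index(documents):
    """Map each analyzed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def postings(index, word):
    """Return the document ids for a single query word, after analysis."""
    terms = analyze(word)
    return index.get(terms[0], set()) if terms else set()

# Hypothetical metadata titles, invented for this example
documents = {
    1: "Roads of Boston",
    2: "The road network in Perugia",
    3: "Rivers of Boston",
}
index = build_index(documents)

# Boolean queries map directly onto set operations
road_and_boston = postings(index, "roads") & postings(index, "Boston")
road_or_boston = postings(index, "roads") | postings(index, "Boston")
boston_not_road = postings(index, "Boston") - postings(index, "roads")
```

Since "Roads" and "road" are reduced to the same term at both index and query time, a query for either form matches documents 1 and 2.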
  • 15. Faceted search ● Faceting is the arrangement of search results in categories based on indexed terms ● This capability makes it possible, for example, to provide an immediate indication of the number of times that common keywords are contained in different metadata documents ● A typical use case for SDI is with metadata categories, keywords and regions ● Faceting without a search engine is generally computationally expensive in normalized relational structures (it requires many queries in an RDBMS)
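As a rough illustration of what a facet is, the sketch below counts how many metadata records fall under each category and keyword. A search engine derives these counts from its index rather than by scanning records; the records, field names and the `facet_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical metadata records, invented for this example
records = [
    {"title": "Roads of Boston", "category": "transportation", "keywords": ["roads", "boston"]},
    {"title": "Rivers of Boston", "category": "hydrology", "keywords": ["rivers", "boston"]},
    {"title": "Roads of Perugia", "category": "transportation", "keywords": ["roads", "perugia"]},
]

def facet_counts(records, field):
    """Count the records per value of a field (single-valued or multi-valued)."""
    counts = Counter()
    for record in records:
        value = record.get(field, [])
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts

category_facet = facet_counts(records, "category")
keyword_facet = facet_counts(records, "keywords")
print(category_facet.most_common())
```

A search interface would typically show such counts next to each category or keyword so that users can narrow the result set with one click.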
  • 16. Temporal faceting ● Search engines can also support temporal and spatial faceting, two features that are extremely useful for browsing large collections of geospatial metadata ● Temporal faceting can display the number of metadata documents in an SDI by date range as a kind of histogram
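The histogram idea can be sketched by bucketing record dates into fixed-width ranges. This is a plain Python illustration, assuming each metadata document carries a single date; the dates, the ten-year gap and the `temporal_facets` name are invented for the example.

```python
from collections import Counter
from datetime import date

def temporal_facets(dates, gap_years=10):
    """Count dates per bucket of `gap_years` years, keyed by the bucket's start year."""
    counts = Counter()
    for d in dates:
        bucket_start = d.year - d.year % gap_years
        counts[bucket_start] += 1
    return dict(sorted(counts.items()))

# Hypothetical dates attached to four metadata documents
dates = [date(1895, 1, 1), date(1902, 6, 1), date(1908, 3, 15), date(1911, 1, 1)]
histogram = temporal_facets(dates)

# Render the facet as a simple text histogram
for start, count in histogram.items():
    print(f"{start}-{start + 9}: {'#' * count}")
```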
  • 17. Spatial faceting ● Spatial faceting can provide a spatial surface representing the distribution of layers or features across an area of interest ● In this example a heatmap is generated by spatial faceting to show the distribution of layers in the WorldMap SDI for a given geographic region
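A heatmap of this kind boils down to counting items per grid cell over the area of interest. The sketch below uses point locations as a stand-in for layers (real layers have extents, so this is a simplification), and the coordinates, grid size and `heatmap` function are assumptions made for illustration.

```python
def heatmap(points, bbox, rows, cols):
    """Count points per cell of a rows x cols grid laid over bbox.

    bbox is (min_x, min_y, max_x, max_y); row 0 is the northernmost row.
    """
    min_x, min_y, max_x, max_y = bbox
    cell_w = (max_x - min_x) / cols
    cell_h = (max_y - min_y) / rows
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # outside the area of interest
        # Clamp so points on the east or south edge fall into the last cell
        col = min(int((x - min_x) / cell_w), cols - 1)
        row = min(int((max_y - y) / cell_h), rows - 1)
        grid[row][col] += 1
    return grid

# Hypothetical layer centroids as (longitude, latitude)
points = [(-71.1, 42.4), (12.4, 43.1), (12.5, 41.9), (151.2, -33.9)]
grid = heatmap(points, bbox=(-180, -90, 180, 90), rows=2, cols=4)
for row in grid:
    print(row)
```

Each cell count can then be mapped to a colour ramp on the client to draw the density surface.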
  • 18. HHypermap: an SDI search engine based on Free and Open Source Software ● HHypermap is an application that manages OGC web services (such as WMS, WMTS), and Esri REST endpoints ● It performs map service crawling and harvesting, and gathers uptime statistics for remote services and layers ● The aim of HHypermap is to provide a more effective search experience to WorldMap users and to users outside WorldMap ● WorldMap is an open source mapping platform, based on GeoNode, developed by the CGA to lower the barrier for scholars who wish to explore, visualize, edit and publish geospatial information
  • 19. HHypermap Architecture Built on Open Source software: Celery, RabbitMQ, Django, Lucene (Solr, Elasticsearch), MapProxy, Memcached, OWSLib, PostgreSQL, PostGIS, pycsw
  • 20. Future Work ● While the CSW 3.0.0 standard provides improvements to address mass market search/discovery, the benefits of search engine implementations combined with broad interoperability of the CSW standard present opportunities to integrate the CSW standard with search engine methodologies ● The authors hope that such an approach will become formalized as a CSW Application Profile or Best Practice in order to achieve maximum benefit and adoption in SDI activities ● This will allow CSW implementations to make better use of search engine methodologies for improving the user search experience in SDI workflows
  • 21. Conclusion HHypermap aims to provide a FOSS solution using modern approaches to realize a highly scalable, flexible and robust geospatial registry and catalogue/search platform while achieving broad interoperability via open standards
  • 22. References ● Harvard University CGA: http://gis.harvard.edu/ ● WorldMap: http://worldmap.harvard.edu/ ● Harvard Hypermap public registry: http://hh.worldmap.harvard.edu/ ● HHypermap code repository: https://github.com/cga-harvard/HHypermap