Linked (Geo) Data - Adding a Spatial Dimension to the Web of Data
1. Linked (Geo) Data
Adding a Spatial Dimension to the Web of Data
http://linkedgeodata.org
Claus Stadler, Sören Auer
AKSW Research Group
University of Leipzig
Universität Leipzig ▪ Agile Knowledge Engineering and Semantic Web (AKSW) Authors: Sören Auer, Jens Lehmann, Claus Stadler Slide 1
2. Structure
● Linked Data
● LinkedGeoData
● Architecture
● Mapping to RDF/OWL
● Interlinking
● Applications
● Issues & Future work
● Open Governmental Geo Data
3. The emerging Web of Linked Data
5. LinkedGeoData
Conversion, interlinking and publishing of
OpenStreetMap.org data sets as RDF.
6. Motivation
● Ease information integration tasks that require spatial
knowledge, such as
● Offerings of bakeries next door
● Cinemas nearby and their programs
● Historical sights along a bicycle track
● Therefore use RDF/OWL in order to overcome structural and semantic
heterogeneity.
● LOD cloud contains data sets with spatial features
● e.g. GeoNames, DBpedia, US census, EuroStat
● But: they are restricted to popular or large entities like countries, famous
places etc.
● OSM offers buildings, roads, mailboxes, trash bins/recycling, ...
7. What do we offer?
● Very large RDF knowledge base derived from OpenStreetMap
● REST API, static & live SPARQL endpoints
● Update propagation of the added and removed triples
● 845.000 Interlinks with DBpedia, GeoNames, and FAO
● LGD-Browser
8. Architecture
9. OpenStreetMap - Datamodel
● Basic entities are:
● Nodes Latitude, Longitude
● Ways Sequence of nodes
● Relations Associations between any number of nodes, ways and relations.
● Each entity may be described with tags:
key-value pairs, such as (amenity, school)
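The data model above can be sketched as three small record types. This is an illustrative sketch only: the class and field names are assumptions rather than OSM's or LGD's actual schema, relation member roles are left out, and the coordinates in the example are made-up values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """A point with a latitude and a longitude."""
    id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """An ordered sequence of node ids."""
    id: int
    node_ids: List[int]
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relation:
    """An association between nodes, ways and other relations."""
    id: int
    # Each member is (member_type, member_id),
    # with member_type in {"node", "way", "relation"}.
    members: List[Tuple[str, int]]
    tags: Dict[str, str] = field(default_factory=dict)


# Hypothetical example entities; ids and coordinates are invented.
school = Node(id=7, lat=51.34, lon=12.37, tags={"amenity": "school"})
fence = Way(id=100, node_ids=[1, 2, 3, 1], tags={"barrier": "fence"})
site = Relation(id=200, members=[("node", 7), ("way", 100)])
```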
10. Example: Leipzig's Zoo
11. OSM entities → RDF
● Three types of mappings based on (osm-id, key, value)
● Text
– (5, name, Leipzig) → lgd:node5 rdfs:label "Leipzig"
– (5, name:de, Leipzig) → lgd:node5 rdfs:label "Leipzig"@de
● Datatype properties
– (6, seats, 4) → lgd:node6 lgdo:seats "4"^^xsd:integer
● Classes/object properties
– (7, amenity, school) → lgd:node7 a lgdo:School
– (8, amenity, rare value) → lgd:node8 a lgdo:Amenity
– (9, religion, hindi) → lgd:node9 lgdo:religion lgdo:Hindi
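The three mapping types can be sketched as a single dispatch function over (osm-id, key, value). This is a minimal sketch under stated assumptions: the lookup tables, the function name map_tag and the lgdo: prefix on the fallback class are hypothetical and only cover the examples from this slide, not LGD's real mapping configuration.

```python
# Hypothetical mapping tables covering only the slide's examples.
TEXT_KEYS = {"name": "rdfs:label"}                   # text mappings
DATATYPE_KEYS = {"seats": "xsd:integer"}             # datatype properties
CLASS_MAP = {("amenity", "school"): "lgdo:School"}   # specific classes
DEFAULT_CLASS = {"amenity": "lgdo:Amenity"}          # fallback for rare values
OBJECT_KEYS = {"religion"}                           # object properties


def map_tag(osm_id, key, value):
    """Return one triple (as a string) for an OSM tag, or None if unmapped."""
    subject = f"lgd:node{osm_id}"

    # Text: "name" or "name:<lang>" becomes a (language-tagged) label.
    base, _, lang = key.partition(":")
    if base in TEXT_KEYS:
        literal = f'"{value}"' + (f"@{lang}" if lang else "")
        return f"{subject} {TEXT_KEYS[base]} {literal}"

    # Datatype property: typed literal.
    if key in DATATYPE_KEYS:
        return f'{subject} lgdo:{key} "{value}"^^{DATATYPE_KEYS[key]}'

    # Class: a specific class if one is mapped, else the key's general class.
    if (key, value) in CLASS_MAP:
        return f"{subject} a {CLASS_MAP[(key, value)]}"
    if key in DEFAULT_CLASS:
        return f"{subject} a {DEFAULT_CLASS[key]}"

    # Object property: the value becomes a resource.
    if key in OBJECT_KEYS:
        return f"{subject} lgdo:{key} lgdo:{value.capitalize()}"

    return None


for tag in [(5, "name:de", "Leipzig"), (6, "seats", "4"), (7, "amenity", "school")]:
    print(map_tag(*tag))
```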
12. Ontology
● Text, class, and object property mappings manually
crafted
● Subclass relations inferred from class-mappings:
(..., amenity, school) → ... a School
(..., amenity, rare value) → ... a Amenity
School subClassOf Amenity
● Automatically mapped OSM keys with mostly
numerical values to datatype properties
● boolean, integer, float
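Both automatic steps on this slide can be illustrated with a short sketch. The slides do not say how "mostly numerical" is decided, so the 0.9 threshold, the accepted boolean spellings and the function names below are assumptions made for illustration, not LGD's actual procedure.

```python
def infer_subclasses(class_mappings, fallback_classes):
    """Derive subClassOf axioms from class mappings.

    class_mappings:   {(key, value): specific class}
    fallback_classes: {key: general class used for rare values}
    """
    axioms = set()
    for (key, _value), cls in class_mappings.items():
        parent = fallback_classes.get(key)
        if parent and parent != cls:
            axioms.add((cls, "rdfs:subClassOf", parent))
    return axioms


def _parses(parser, text):
    """True if parser(text) succeeds without a ValueError."""
    try:
        parser(text)
        return True
    except ValueError:
        return False


def guess_datatype(values, threshold=0.9):
    """Pick a datatype if at least `threshold` of the sample values fit it.

    The threshold is a hypothetical choice; candidates are tried from the
    most specific (boolean) to the most general (float).
    """
    if not values:
        return None
    checks = [
        ("xsd:boolean", lambda v: v.lower() in {"yes", "no", "true", "false"}),
        ("xsd:integer", lambda v: _parses(int, v)),
        ("xsd:float", lambda v: _parses(float, v)),
    ]
    for datatype, check in checks:
        if sum(check(v) for v in values) / len(values) >= threshold:
            return datatype
    return None


axioms = infer_subclasses({("amenity", "school"): "School"}, {"amenity": "Amenity"})
seats_type = guess_datatype(["4", "12", "7", "2"])
```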
13. Ontology
● Enriched classes and properties with multilingual labels
from TranslateWiki
● http://translatewiki.net
● Imported icons for 90 classes from the freely available
icon collection from the SJJB Management
● http://www.sjjb.co.uk/mapicons/
14. Ontology
15. Instance Data
16. Statistics (2011-Feb-23)
● 222.539.712 Triples
● 6.666.865 Ways
● 5.882.306 Nodes
● Among them
● 352.673 PlaceOfWorship
● 60.573 RailwayStation
● 59.468 Recycling
● 50.955 Town
● 30.099 Toilet
● 7.222 City
17. DBpedia Mapping
Given a point from DBpedia, GeoNames, …, query LGD
points with certain types within a maximum distance
Basic idea (performed with Silk):
● Find most similar type(s) in LGD for a
type in another knowledge base
● Compute spatial score
● Compute name similarity (rdfs:label)
● Combine both scores
● Depending on final score, either
automatically accept/reject links or
mark for manual verification.
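The scoring idea can be sketched in plain Python. This is not Silk's API or link specification language: the equal weighting, the linear distance decay, the thresholds and the example coordinates are all assumptions chosen for illustration, and type matching is assumed to have happened before candidates reach this step.

```python
import math
from difflib import SequenceMatcher


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def spatial_score(dist_km, max_km):
    """1.0 at distance zero, falling linearly to 0.0 at max_km and beyond."""
    return max(0.0, 1.0 - dist_km / max_km)


def name_score(a, b):
    """Case-insensitive string similarity of two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(a, b, max_km=1.0, accept=0.9, reject=0.5):
    """a and b are (label, lat, lon); returns 'accept', 'reject' or 'verify'."""
    dist = haversine_km(a[1], a[2], b[1], b[2])
    score = 0.5 * spatial_score(dist, max_km) + 0.5 * name_score(a[0], b[0])
    if score >= accept:
        return "accept"
    if score < reject:
        return "reject"
    return "verify"


# Hypothetical candidates; coordinates are approximate placeholders.
dbpedia_zoo = ("Zoo Leipzig", 51.35, 12.37)
lgd_zoo = ("Zoo Leipzig", 51.35, 12.37)
lgd_bakery = ("Bakery", 48.0, 11.0)
```

Here classify(dbpedia_zoo, lgd_zoo) accepts the link, while classify(dbpedia_zoo, lgd_bakery) rejects it; scores between the two thresholds are marked for manual verification.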
18. How to access the data?
● REST interface (based on PostGIS DB, full OSM dataset loaded, > 1 billion
triples)
● Supports limited queries (e.g. circular/rectangular area, filtering by labels)
● SPARQL endpoints (based on Virtuoso DB, subset of OSM dataset loaded,
~222 million triples)
● Static (http://linkedgeodata.org/sparql)
● Live (http://live.linkedgeodata.org/sparql)
● Downloads (same data as in the SPARQL endpoints,
http://downloads.linkedgeodata.org)
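A request against the static SPARQL endpoint could be built as follows. This sketch only constructs the query and the request URL and sends nothing over the network. The lgdo: namespace IRI and the helper names are assumptions; passing the query in a "query" parameter follows the standard SPARQL protocol.

```python
from urllib.parse import urlencode

ENDPOINT = "http://linkedgeodata.org/sparql"  # static endpoint from the slide


def build_query(class_name, limit=10):
    """SPARQL query listing instances of an LGD class with their labels.

    The lgdo: namespace IRI below is an assumption.
    """
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "PREFIX lgdo: <http://linkedgeodata.org/ontology/>\n"
        "SELECT ?s ?label WHERE {\n"
        f"  ?s a lgdo:{class_name} ;\n"
        "     rdfs:label ?label .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_url(query, endpoint=ENDPOINT):
    """URL for a SPARQL protocol GET request carrying the query."""
    return endpoint + "?" + urlencode({"query": query})


query = build_query("School", limit=5)
url = build_url(query)
print(url)
```

The resulting URL can be fetched with any HTTP client; the same query can be pointed at the live endpoint by passing its address as the endpoint argument.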
19. Applications: LinkedGeoData Browser
http://browser.linkedgeodata.org
20. Applications: Generic spatial data
browsing widgets
21. Applications: MovieGoer
● MovieGoer
– Scrapes websites with information
about the programs of cinemas in
Innsbruck and Munich
– Interlinked with LinkedGeoData in
order to obtain address information
– During development: missing
cinemas added to OSM; became
available for interlinking through
LGD-Live.
http://lokino.sti2.at/
22. More Applications
● Layar
Augmented reality browser for mobile phones, features
an LGD layer
http://layar.com
● Vicibit
Generates embeddable HTML code for displaying a
map of POIs of certain types in a specified area
http://vicibit.linkedgeodata.org
23. Issues & Future work
● Converting all OSM data to RDF not feasible:
> 1.000.000.000 nodes
→ only a subset of OSM's data available via the
SPARQL endpoints
● Virtuoso only supports point geometries
● Possible solution: Rewriting SPARQL queries to
the OSM SQL schema
● Work in progress, no estimate on a release yet
24. Issues and Future Work
● Investigating further applications of LGD, such as
using the LGD knowledge base for named entity
recognition/resolution
25. Open Governmental Geo Data
● Most governmental data has a spatial dimension
● transportation, statistics, finances, etc.
● Geo Data is a driver in the Open Government Data
realm
● Ordnance Survey, geodata.gov.gr, …
● It is still difficult to extract, publish, interlink, author and
visualize Governmental Linked Geo Data
● LOD2 Stack (CKAN integration, browsing widgets,
integrated comprehensive tool compilation)
26. Best practices
● Reuse of vocabularies
● vCard: address information, phone & fax
● GoodRelations: opening hours, product descriptions
● GeoOWL: latitude, longitude, polygons
– GeoSPARQL Ontology
● FOAF: based_near
● Dublin Core (dc): document metadata
● …
http://www.leipzig.de/de/buerger/jugend/betreuung/kitas/
27. Transportation
28. Comprehensive Knowledge Archive
Network (CKAN)
30. The End
Thanks for your
Attention!