Experiences in the
Development of
Geographical Ontologies
and Linked Data
OntoGeo Workshop, Toulouse, 18 November 2010
Oscar Corcho, Luis Manuel Vilches Blázquez, José Angel Ramos
Gargantilla {ocorcho,lmvilches,jramos}@fi.upm.es
Ontology Engineering Group, Departamento de Inteligencia Artificial,
Facultad de Informática, Universidad Politécnica de Madrid
Credits: Asunción Gómez-Pérez, María del Carmen Suárez de Figueroa, Boris Villazón,
Alex de León, Víctor Saquicela, Miguel Angel García, Juan Sequeda and many others
Work distributed under the license Creative Commons Attribution-
Noncommercial-Share Alike 3.0
• Why did we start developing Geographical
Ontologies?
• Methodological guidelines for ontology development
• The NeOn Methodology
• The development process for Hydrontology
• The development process for PhenomenOntology
• Why did we start developing Geographical Linked
Data?
• Methodological guidelines for Linked Data generation
• Ontology and Linked Data usage in
http://geo.linkeddata.es/
Structure of my Talk
CG
NGG
BCN200
BCN25
PhenomenOntology, hydrOntology
Our main goal: Data Integration
Step 1: Building
PhenomenOntology
Step 2: Mappings
between the catalogues
and the Ontology
• Great variety of sources
• Nearly 20 different producers in Spain (national and local
cartographic institutions with different interests)
• Various degrees of quality and structuring of
information
• Natural language ambiguity
• Synonymy, polysemy and hyperonymy
• Scale factor
Why ontologies? Geographical Information Context
Different producers have different vocabularies
• Great variety of sources
• Various degrees of quality and structuring of
information
• ICC has 49 types of features in total
• IGN has (only in the hydrographic domain) 40 types of
features
• Natural language ambiguity
• Synonymy, polysemy and hyperonymy
• Scale factor
Why ontologies? Geographical Information Context
Feature Catalogues
Base Cartográfica N. (BCN200)
Base Cartográfica N. (BCN25)
• Great variety of sources
• Various degrees of quality and structuring of
information
• Natural language ambiguity
• Synonymy: Different words with the same meaning
» riverside, river bank
• Polysemy: The same word with different meanings
» Bank: Financial institution
» Bank: To rely upon (trust)
• Hyperonymy: One word's meaning includes another's
» Bank and Morgan Bank
• Scale factor
Why ontologies? Geographical Information Context
• Great variety of sources
• Various degrees of quality and structuring of
information
• Natural language ambiguity
• Synonymy, polysemy and hyperonymy
• Scale factor
• E.g., one village may be represented as a point X,Y or as an
area XN,YN
• This can act as a filter for geographical information
• Different scales normally present different features
• Generalisation processes are normally a problem, due to
the difficulties in finding “feature overlaps” in different
feature catalogues
Why ontologies? Geographical Information Context
NeOn Scenarios (diagram)
• Core activities: Ontology Specification, Conceptualization, Formalization and Implementation (RDF(S), OWL, Flogic), numbered 1
• Knowledge resources: non-ontological resources (thesauri, dictionaries, glossaries, lexicons, taxonomies, classification schemas), ontological resources, ontology design patterns, alignments, and ontology repositories and registries
• Activities shown: Non Ontological Resource Reuse and Reengineering (2), Ontological Resource Reuse (3), Ontological Resource Reengineering (4), Ontology Aligning and Merging (5, 6), Ontology Design Pattern Reuse (7), Ontology Restructuring (pruning, extension, specialization, modularization) (8), Ontology Localization (9)
• Ontology support activities, applying to scenarios 1 to 9: Knowledge Acquisition (Elicitation); Documentation; Configuration Management; Evaluation (V&V); Assessment
NeOn Scenarios
1. Building ontology networks from scratch without reusing existing
resources.
2. Building ontology networks by reusing and reengineering non
ontological resources.
3. Building ontology networks by reusing ontologies or ontology
modules.
4. Building ontology networks by reusing and reengineering ontologies
or ontology modules.
5. Building ontology networks by reusing and merging ontologies or
ontology modules.
6. Building ontology networks by reusing, merging and reengineering
ontologies or ontology modules.
7. Building ontology networks by reusing ontology design patterns.
8. Building ontology networks by restructuring ontologies or ontology
modules.
9. Building ontology networks by localizing ontologies or ontology
modules.
NeOn Methodology
Process and activities covered:
• Ontology Specification
• Scheduling
• Non Ontological Resource Reuse
• Non Ontological Resource Reengineering
• Reuse General Ontologies
• Reuse Domain Ontologies
• Reuse Ontology Statements
• Reuse Ontology Design Patterns
All processes and activities are described with:
• A filling card
• A workflow
• Examples
Hydrontology Development
NeOn Methodology for Building Ontology Networks: Specification, Scheduling and Reuse
María del Carmen Suárez de Figueroa Baonza
• One of the INSPIRE aims is to harmonise
Geographical information sources to give support to
formulating, implementing and evaluating EU policies
(e.g., Environmental Management).
• Geographical Information Sources: Databases from
EU Member States at local, regional, national and
international levels.
INSPIRE as a context for hydrontology
INSPIRE - Annexes
Information Sources
GEMET
Feature Catalogues
BCN25
BCN200
EGM & ERM
CC.AA.
Nomenclátor Geográfico Nacional
Thesauri and Bibliography
WFD
Nomenclátor Conciso
Dictionaries and
Monographs
FTT ADL Getty
• Glossary of hydrOntology terms.
• Feature catalogues of the Numerical Cartographic Database
(1:25,000; 1:200,000; 1:1,000,000)
• Different feature catalogues from other local producers.
• EuroGlobalMap & EuroRegionalMap
• Water Framework Directive
• Alexandria Digital Library, Dewey
• Thesauri (UNESCO, GEMET, Getty Thesaurus of Geographic
Names, etc.)
• National Geographic Gazetteer
• Bibliography (Dictionary, Water, Law, etc.)
• This glossary contains more than 120 concepts
Criteria for structuring
• Abstract concepts from:
• Water Framework Directive
• Proposed by the EU Parliament and EU Council
• List of definitions of hydrographic phenomena
• Part of the model from:
• SDIGER Project
• INSPIRE pilot project
• Two river basins, two countries, two languages
• Several semantic criteria from:
• WordNet
• Encyclopaedia Britannica
• Diccionario de la Real Academia de la Lengua
• Wikipedia
• Several domain references
• Inheritance: From various actual catalogues
• Meetings with domain experts who belong to IGN-E
Ontology Development
Ontology network around hydrOntology (diagram)
• hydrOntology: hydrographical phenomena (rivers, lakes, etc.)
• SCOVO statistics ontology (scv:Dimension, scv:Item, scv:Dataset)
• W3C Time ontology: vocabulary for instants, intervals, durations, etc.
• W3C WGS84 Geo Positioning: an RDF vocabulary
• GML ontology: ontology for the OGC Geography Markup Language
• FAO Geopolitical ontology: names and international code systems for territories and groups
• Other resources shown: UNESCO Thesaurus, EGM / ERM, GeoNames, …
• Relations labelled in the diagram: hasStatisticalData, hasLat/Long, hasGeometry, hasLocation/isLocated
Modelling the hydrology domain
Upper level
Lower level
150+ classes, 47 object properties, 64 data properties and 256 axioms.
Phenomenontology Development
Knowledge Bases
Conciso Gazetteer
National Geographic Gazetteer
Numerical Cartographic Database (BCN200)
Numerical Cartographic Database (BCN25)
Knowledge Bases
• The National Geographic Gazetteer
has 14 item types and 460,000
toponyms (Spanish, Galician,
Basque, Catalan and Aranese).
• The Conciso Gazetteer, which
follows the recommendations of the
United Nations Conferences on the
Standardization of Geographical
Names, has 17 item types and
3,667 toponyms.
Conciso Gazetteer
• A gazetteer is a directory of instances of a
class or classes of features that contains
some information regarding position (ISO
19112)
National Geographic Gazetteer
Knowledge Bases
• BCN25 was designed as a product
derived from the National
Topographic Map, built to provide
cartographic information that
complies with the data
specifications required for use
inside GIS.
• BCN200 was developed by
digitising analogue provincial
maps.
• Information is structured in 8
topics (Administrative boundaries,
Relief, Hydrography, Vegetation
and so on)
• A feature catalogue presents the
abstraction of reality, represented in
one or more sets of geographic
data, as a defined classification of
phenomena (ISO 19110)
Numerical Cartographic
Database (BCN25)
Numerical Cartographic
Database (BCN200)
Catalogue columns:
- Group:
0- unfixed
1- road
...
- Code: 3 pairs of digits
XXYYZZ
060101
06 Transportation
01 Roads
01 Highway. Axis
- Name:
Highway. Axis
Highway under construction. Axis
...
BCN25 details
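The code column described above can be split mechanically. The following is a minimal sketch, assuming every code is exactly six digits and that the three pairs can be called topic, group and subgroup (the names used on the taxonomy-criteria slide); the function names are illustrative and do not come from the authors' tooling.

```python
from typing import List, NamedTuple


class FeatureCode(NamedTuple):
    topic: str
    group: str
    subgroup: str


def parse_feature_code(code: str) -> FeatureCode:
    """Split a six-digit catalogue code XXYYZZ into its three pairs."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected six digits, got {code!r}")
    return FeatureCode(code[0:2], code[2:4], code[4:6])


def level_prefixes(code: str) -> List[str]:
    """Return the code prefix that identifies each taxonomy level."""
    parsed = parse_feature_code(code)
    return [parsed.topic, parsed.topic + parsed.group, code]


# Example from the slide: 06 Transportation, 01 Roads, 01 Highway. Axis
highway_axis = parse_feature_code("060101")
```

Each prefix returned by `level_prefixes` could serve as the key of a candidate class, with the shorter prefix as its parent.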
Bottom-up process: PhenomenOntology
• Automatic ontology building from
BCN25/BTN25
• Automatic checking of linguistic differences (linsearch): plurals,
punctuation marks, capital letters and Spanish diacritics
• Curation process by domain experts from IGN-E
PhenomenOntology
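A check for linguistic differences of this kind can be approximated by reducing each feature name to a comparison key. This is only a sketch and not the linsearch implementation: the plural rule (dropping a final "s") is a deliberately crude assumption, and removing all combining marks also folds "ñ" into "n", which a real tool would probably want to avoid.

```python
import string
import unicodedata


def normalise_label(label: str) -> str:
    """Reduce a feature name to a key that ignores accents, case and punctuation."""
    decomposed = unicodedata.normalize("NFD", label)
    # Drop combining marks (category Mn), i.e. the accents split off by NFD.
    no_accents = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    no_punct = no_accents.translate(str.maketrans("", "", string.punctuation))
    key = " ".join(no_punct.casefold().split())
    # Crude plural handling (assumption): strip one trailing "s" from longer keys.
    return key[:-1] if key.endswith("s") and len(key) > 3 else key


# Spelling variants of one feature name, as cited later in the deck.
variants = ["Autovía", "AUTOVIA", "Autovia", "Autovía-"]
keys = {normalise_label(v) for v in variants}
```

All four variants collapse to a single key, so they can be reported to the domain expert as one candidate class.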
Criteria for taxonomy creation
• Group (Road, Hydrographic...)
• Code column
• Topic: first pair of digits (03 in 030501)
• Group: second pair of digits (05 in 030501)
• Subgroup: third pair of digits (01 in 030501)
• Common lexical parts
• Highway with 2 lines
• Highway with 3 lines
• Highway under construction
• Highway (superclass)
• Lexical heterogeneity in
feature names (“Autovía”,
“AUTOVIA”, “Autovia”,
“Autovía-”)
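The "common lexical parts" criterion can be sketched as grouping feature names by a shared leading word and proposing that word as a superclass. The rule used here (first word only, at least two members) is an assumption made for illustration, and "Reservoir" is a made-up extra name added to show a feature that gets no superclass.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_first_word(names: List[str]) -> Dict[str, List[str]]:
    """Group feature names under the first word they share."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        words = name.split()
        if words:
            groups[words[0]].append(name)
    # Only a head word shared by two or more names suggests a superclass.
    return {head: members for head, members in groups.items() if len(members) > 1}


names = [
    "Highway with 2 lines",
    "Highway with 3 lines",
    "Highway under construction",
    "Reservoir",  # hypothetical name, not from the slide
]
superclasses = group_by_first_word(names)
```

In practice the names would first be normalised, since the lexical heterogeneity noted above would otherwise split one group into several.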
Numerical Cartographic
Database (BCN25)
BCN25 → BTN25
Base Cartográfica N. (BCN25)
BCN25 → PhenomenOntology v3.5
03 ¿?
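- Componente de río (river component)
• Eje (axis)
• Margen (bank)
• Eje conexión (connecting axis)
- Régimen (regime)
• Permanente (permanent)
• No permanente (non-permanent)
- Categoría del río (river category)
• Desconocida (unknown)
• Primera (first)
• Segunda (second)
• Tercera (third)
• Cuarta (fourth)
- Componente del cauce
artificial (artificial watercourse component)
• Eje (axis)
• Margen (bank)
• Eje conexión (connecting axis)
- Situación (position)
• Desconocido (unknown)
• Subterráneo (underground)
• Superficial (surface)
• Elevado (elevated)
0301 Río (river) 0304 Cauce artificial (artificial watercourse)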
• Homogenising URIs and labels
• Exploiting “type” hierarchies
• Reducing unnecessary attributes
• Incorporating BTN25 definitions as rdfs:comments
Ontology curation
Homogenising URIs and labels
- Meaningless labels from the first level in the hierarchy
- All class and property names in lowercase
- Spaces and accents in URIs
Exploiting “type” hierarchies
- Attribute “type” normally corresponds to additional taxonomies
Reducing unnecessary/redundant attributes
Completing documentation
Some statistics (from BCN25 to BTN25)
PhenomenOntology 4.0
PhenomenOntology 3.6
• Generic ontology development methodologies can be
applied with some success
• Hydrontology took approximately 6 person-months in total
• Initially done by a domain expert after very little initial training
• Ontology debugging was extremely difficult and has provided
interesting results in this area
• Top-down vs bottom-up approaches
• A large curation process is still needed in bottom-up
approaches, which may argue against following them (research
ongoing on this)
• Bottom-up approaches yield more lightweight ontologies,
although these are easier to relate to the underlying catalogues
• Next steps: relating them to upper-level ontologies
(e.g., DOLCE) and modularising them to improve
reusability
Some conclusions in ontology development
What is the Web of Linked Data?
• An extension of the current
Web…
• … where information and services
are given well-defined and explicitly
represented meaning, …
• … so that it can be shared and used
by humans and machines, ...
• ... better enabling them to work in
cooperation
• How?
• Promoting information exchange by
tagging web content with machine
processable descriptions of its
meaning.
• And technologies and infrastructure
to do this
• And clear principles on how to
publish data
What is Linked Data?
• Linked Data is a term used to describe a
recommended best practice for exposing, sharing,
and connecting pieces of data, information, and
knowledge on the Semantic Web using URIs and
RDF.
• Part of the Semantic Web
• Exposing, sharing and connecting data
• Technologies: URIs and RDF (although others are also
important)
The four principles (Tim Berners-Lee, 2006)
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information,
using the standards (RDF*, SPARQL).
4. Include links to other URIs, so that they can
discover more things.
• http://www.w3.org/DesignIssues/LinkedData.html
http://www.ted.com/talks/tim_berners_lee_on_the_next_web.htm
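Principles 2 and 3 amount to an HTTP lookup in which the client states that it wants RDF. The sketch below only builds such a request and does not send it. The resource URI is hypothetical, modelled on the /page/ URL that appears later in the deck, and the choice of `application/rdf+xml` as the requested media type is an assumption.

```python
from urllib.request import Request

# Hypothetical resource URI, modelled on the /page/ URL shown later in the deck.
RESOURCE_URI = "http://geo.linkeddata.es/resource/Provincia/Granada"


def rdf_request(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Build (but do not send) an HTTP GET request that asks for RDF."""
    return Request(uri, headers={"Accept": media_type})


request = rdf_request(RESOURCE_URI)
# Passing `request` to urllib.request.urlopen would perform the actual lookup.
```

A server that follows the principles would answer such a request with an RDF description, or redirect to one, while a browser asking for HTML would get a readable page for the same URI.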
Linked Open Data evolution
• LOD clouds for 2007, 2008 and 2009
How should we publish data?
• Formats in which data is published nowadays…
• XML
• HTML
• DBs
• APIs
• CSV
• XLS
• …
• However, these formats have major limitations from a Web of
Data point of view
• Difficult to integrate
• Data items are not linked to each other the way Web
documents are.
How do we publish Linked Data?
1. Exposing relational databases or other similar formats
as Linked Data
• D2R
• Triplify
• R2O
• NOR2O
• Virtuoso
• Ultrawrap
• …
2. Using native RDF triplestores
• Sesame
• Jena
• Owlim
• Talis platform
• …
3. Incorporating it in the form of RDFa in CMSs like Drupal
How do we consume Linked Data?
• Linked Data browsers
• To explore things and datasets and to navigate between them.
• Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE),
OpenLink RDF Browser (OpenLink, UK), Zitgist RDF Browser
(Zitgist, USA), Disco Hyperdata Browser (FU Berlin, DE),
Fenfire (DERI, Ireland)
• Linked Data mashups
• Sites that mash up (that is, combine) Linked Data
• Revyu.com (KMI, UK), DBtune Slashfacet (Queen Mary, UK),
DBPedia Mobile (FU Berlin, DE), Semantic Web Pipes (DERI,
Ireland)
• Search engines
• To search for Linked Data.
• Falcons (IWS, China), Sindice (DERI, Ireland), MicroSearch
(Yahoo, Spain), Watson (Open University, UK), SWSE (DERI,
Ireland), Swoogle (UMBC, USA)
Listing on this slide by T. Heath, M. Hausenblas, C. Bizer, R. Cyganiak, O. Hartig
One additional motivation: Open Government
• Government and state administration should be
opened at all levels to effective public scrutiny and
oversight
• Objectives:
• Transparency
• Participation
• Collaboration
• Inclusion
• Cost reduction
• Interoperability
• Reusability
• Leadership
• Market & Value
• Some links:
• B. Obama - Transparency and Open
Government
• T. Berners-Lee - Raw data now!
• J. Manuel Alonso - ¿Qué es Open Data?
• Open Government Data
• 8 Principles of Open Government Data
Open Government. USA and UK
Linked Data Mashup (data.gov)
• Clean Air Status and Trends (CASTNET)
• http://data-gov.tw.rpi.edu/demo/exhibit/demo-8-castnet.php
GeoLinkedData
• It is an open initiative whose aim is to enrich the Web
of Data with Spanish geospatial data.
• This initiative has started off by publishing diverse
information sources, such as National Geographic
Institute of Spain (IGN-E) and National Statistics
Institute (INE)
• http://geo.linkeddata.es
Motivation
» 99.171 % English
» 0.019 % Spanish
Source: Billion Triples dataset at http://km.aifb.kit.edu/projects/btc-2010/
Thanks to Aidan and Richard
The Web of Data is mainly for
English speakers
Poor presence of Spanish
Related Work
Impact of geo.linkeddata.es
• Number of triples in Spanish (July 2010): 1,412,248
• Number of triples in Spanish (September 2010): 21,463,088
Language shares (%) before and after geo.linkeddata.es
Language   Before         After
en         99.1712875     94.18744941
es         0.019584648    5.044085342
ja         0.463849377    0.440538697
fr         0.05447229     0.051734793
de         0.034225134    0.032505155
pl         0.02532934     0.024056418
it         0.021982542    0.020877812
Process for Publishing Linked Data on the Web
1. Identification of the data sources
2. Vocabulary development
3. Generation of the RDF data
4. Publication of the RDF data
5. Data cleansing
6. Linking the RDF data
7. Enable effective discovery
1. Identification and selection of the data sources
Instituto Geográfico
Nacional
Basque
Catalan
Galician
Spanish
1. Identification and selection of the data sources
Instituto Nacional
de Estadística
Province
Year
2. Vocabulary development
http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#whichvocabs
2. Vocabulary development
• Features
• Lightweight:
• Taxonomies and a few properties
• Agreed (consensus) vocabularies
• To avoid mapping problems
• Multilingual
• Linked Data is multilingual
• The NeOn Methodology can help to
• Re-engineer non-ontological resources into ontologies
• Pros: use domain terminology already
agreed on by domain experts
• Remove from heavyweight ontologies those features
that you don’t need
• Reuse existing vocabularies
Vocabulary development: Specification
• Content requirements: Identify the set of questions
that the ontology should answer
• Which are the provinces of Spain?
• Where are the beaches?
• Where are the reservoirs?
• Identify the production index in Madrid
• Which city has the highest production index?
• Give me Madrid's latitude and altitude
• …
• Non-content requirements
• The ontology must be in the four official Spanish languages
2. Vocabulary development: HydrOntology
3. Generation of RDF
• From the data sources
• Geographic information (databases)
• Statistical information (spreadsheets)
• Geospatial information
• Different technologies for RDF generation
• Reengineering patterns
• R2O and ODEMapster
• Geometry generation
3. Generation of the RDF Data
INE → NOR2O
IGN → ODEMapster
IGN geospatial column → Geometry2RDF
3. Generation of the RDF Data
• Preliminaries
• Select appropriate URIs
• Difficulties
• Cumbersome URIs in Spanish
• http://geo.linkeddata.es/ontology/Río
• RDF allows non-ASCII (UTF-8 encoded) characters in URIs
• But Linked Data URIs have to work as URLs as well
• So non-US-ASCII characters have to be percent-encoded (%XX)
• http://geo.linkeddata.es/ontology/R%C3%ADo
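The encoding step above can be reproduced with the Python standard library. This is a small sketch: `term_uri` is an illustrative helper name, and the base URI is the one shown on the slide.

```python
from urllib.parse import quote, unquote

BASE = "http://geo.linkeddata.es/ontology/"


def term_uri(local_name: str) -> str:
    """Percent-encode the UTF-8 bytes of a local name and append it to BASE."""
    return BASE + quote(local_name, safe="")


uri = term_uri("Río")           # "í" is the two UTF-8 bytes C3 AD
readable = unquote(uri)         # decodes back to the form with the accent
```

Keeping the readable form for labels and the encoded form for identifiers avoids mixing the two spellings of the same URI in the published data.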
3. Generation of the RDF Data / instances
• NOR2O is a software library that implements the transformations
proposed by the Patterns for Re-engineering Non-Ontological
Resources (PR-NOR). Currently we have 16 PR-NORs.
• PR-NORs define a procedure that transforms the components of a
Non-Ontological Resource (NOR) into ontology elements.
http://ontologydesignpatterns.org/
NOR2O
· Classification schemes
· Thesauri
· Lexicons
NOR2O
FAO Water classification
· Classification scheme
NOR2O Modules
3. Generation of the RDF Data – NOR2O
Industry Production Index
Province
Year
NOR2O
3. Generation of the RDF Data – R2O & ODEMapster
• Creation and execution of R2O Mappings
• Check out at http://www.neon-toolkit.org/
3. Generation of the RDF Data
3. Generation of the RDF Data – Geometry2RDF
Oracle SDO_UTIL package
SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry))
AS Gml311Geometry
FROM "BCN200"."BCN200_0301L_RIO" c
WHERE c.Etiqueta='Arroyo'
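The query above returns geometries as GML 3.1.1. One way to turn such a geometry into RDF is sketched below: it reads the coordinates from a `gml:posList` and emits W3C WGS84 `geo:long` and `geo:lat` triples in N-Triples syntax. This is not the Geometry2RDF implementation. The GML fragment is hand-written rather than real Oracle output, the coordinates and point URIs are made up, and longitude-latitude axis order is assumed.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

GML_NS = "http://www.opengis.net/gml"
GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Hand-written sample, not real Oracle output; coordinates are made up.
SAMPLE_GML = (
    f'<gml:LineString xmlns:gml="{GML_NS}">'
    "<gml:posList>-3.70 40.41 -3.69 40.42</gml:posList>"
    "</gml:LineString>"
)


def read_positions(gml: str) -> List[Tuple[float, float]]:
    """Return coordinate pairs from every gml:posList in the fragment."""
    root = ET.fromstring(gml)
    pairs: List[Tuple[float, float]] = []
    for pos_list in root.iter(f"{{{GML_NS}}}posList"):
        values = [float(v) for v in (pos_list.text or "").split()]
        pairs.extend(zip(values[0::2], values[1::2]))
    return pairs


def to_ntriples(base_uri: str, positions: List[Tuple[float, float]]) -> List[str]:
    """Emit one geo:long and one geo:lat triple per position.

    Assumes longitude-latitude order; check the srsName in real data.
    """
    lines = []
    for i, (lon, lat) in enumerate(positions, start=1):
        point = f"<{base_uri}/point/{i}>"
        lines.append(f'{point} <{GEO_NS}long> "{lon}" .')
        lines.append(f'{point} <{GEO_NS}lat> "{lat}" .')
    return lines


triples = to_ntriples("http://example.org/geometry/1", read_positions(SAMPLE_GML))
```

A full conversion would also link each point to its parent geometry and to the feature it belongs to, which this sketch leaves out.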
3. Generation of the RDF Data – Geometry2RDF
3. Generation of the RDF Data – Geometry2RDF
3. Generation of the RDF data – RDF graphs
• IGN INE
• So far
• 7 RDF Named Graphs
BTN25 BCN200 IPI….
http://geo.linkeddata.es/dataset/IGN/BTN25 http://geo.linkeddata.es/dataset/IGN/BCN200 http://geo.linkeddata.es/dataset/INE/IPI
4. Publication of the RDF Data
SPARQL
Pubby
Linked Data, HTML
Virtuoso 6.1.0
Pubby 0.3
Including Provenance
Support
4. Publication of the RDF Data
4. Publication of the RDF Data - License
• Data Licenses
• Official license as published in the Spanish official journal
(BOE - Boletín Oficial del Estado)
• Creative Commons options
• GNU Free Documentation License
• Each dataset has its own specific license
• IGN
• INE
5. Data cleansing
• Lack of documentation of the IGN datasets
• Broken links: Spain, IGN resources
• Lack of documentation of the ontology
• Missing English and Spanish labels
• Building a Spanish ontology and importing
some concepts from another ontology (in
English):
• Importing the English ontology. Add
annotations like a Spanish label to them.
• Importing the English ontology, creating new
concepts and properties with a Spanish name
and map those to the English equivalents.
• Re-declaring the terms of the English ontology
that we need (using the same URI as in the
English ontology), and adding a Spanish label.
• Creating your own class and properties that
model the same things as the English
ontology.
6. Linking of the RDF Data
• Silk - A Link Discovery Framework for
the Web of Data
• First set of links: Provinces of Spain
• 86% accuracy
GeoLinkedData, DBPedia, Geonames
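Link discovery of this kind compares descriptions of resources in two datasets and proposes owl:sameAs links for close matches. The sketch below is not Silk and does not use its link specification language. It is a minimal label-similarity matcher built on the standard library, with made-up example.org URIs and an arbitrary threshold of 0.9.

```python
import unicodedata
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def key(label: str) -> str:
    """Lowercase a label and drop accents before comparison."""
    decomposed = unicodedata.normalize("NFD", label.casefold())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def candidate_links(
    source: Dict[str, str], target: Dict[str, str], threshold: float = 0.9
) -> List[Tuple[str, str, str]]:
    """Propose a sameAs triple for the best-scoring target of each source."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = SequenceMatcher(None, key(s_label), key(t_label)).ratio()
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, OWL_SAME_AS, best_uri))
    return links


# Made-up URIs for illustration only.
source = {"http://example.org/a/Granada": "Granada", "http://example.org/a/Leon": "León"}
target = {
    "http://example.org/b/Granada": "Granada",
    "http://example.org/b/Leon": "Leon",
    "http://example.org/b/Madrid": "Madrid",
}
links = candidate_links(source, target)
```

Label matching alone produces false positives when a province and its capital share a name, which is one reason generated links need to be evaluated, as the accuracy figure above suggests.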
6. Linking of the RDF Data
• http://geo.linkeddata.es/page/Provincia/Granada
7. Enable effective discovery
Provinces
Industry Production Index – Capital of Province
Rivers
Beaches
Future Work
• Generate more datasets from other domains, e.g.
universities in Spain.
• Identify more links to DBPedia and Geonames.
• Cover complex geometrical information, i.e. not only
Point and LineString-like data; we will also treat
information representation through polygons.
• Reusable ontologies available for the community
• Well-founded and well documented
• Now working on multilinguality/multiculturality issues
• Work continuing in understanding how to provide debugging
tools for domain experts.
• Reusable tools for geospatial Linked Data generation
• There is still a lack of understanding of how much
benefit we can get from Linked Geographical Data
• Benefits of linking seem to be clear
• But geo-processing is still unsolved in RDF, as well as
geometry representation
General conclusions
Innovative methods for data integration: Linked Data and NLP
 
BiographyNet: Linking the world of History
BiographyNet: Linking the world of HistoryBiographyNet: Linking the world of History
BiographyNet: Linking the world of History
 
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
 
TDWG VoMaG Vocabulary management workflow, 2013-10-31
TDWG VoMaG Vocabulary management workflow, 2013-10-31TDWG VoMaG Vocabulary management workflow, 2013-10-31
TDWG VoMaG Vocabulary management workflow, 2013-10-31
 
Jim Woolley - Name Registration: One Less Impediment to Taxonomy
Jim Woolley - Name Registration: One Less Impediment to TaxonomyJim Woolley - Name Registration: One Less Impediment to Taxonomy
Jim Woolley - Name Registration: One Less Impediment to Taxonomy
 
Lorna hughes 12 05-2013 NeDiMAH and ontology for DH
Lorna hughes 12 05-2013 NeDiMAH and ontology for DHLorna hughes 12 05-2013 NeDiMAH and ontology for DH
Lorna hughes 12 05-2013 NeDiMAH and ontology for DH
 
Foundations to Actions: Extending Innovations to Digital Libraries in Partner...
Foundations to Actions: Extending Innovations to Digital Libraries in Partner...Foundations to Actions: Extending Innovations to Digital Libraries in Partner...
Foundations to Actions: Extending Innovations to Digital Libraries in Partner...
 
Getting to the Repository of the Future Workshop
Getting to the Repository of the Future WorkshopGetting to the Repository of the Future Workshop
Getting to the Repository of the Future Workshop
 
Incentives, Integration, and Mediation: Sustainable Practices for Population ...
Incentives, Integration, and Mediation: Sustainable Practices for Population ...Incentives, Integration, and Mediation: Sustainable Practices for Population ...
Incentives, Integration, and Mediation: Sustainable Practices for Population ...
 
Intro-EOSC.pptx
Intro-EOSC.pptxIntro-EOSC.pptx
Intro-EOSC.pptx
 
AHRC Digital Transformations theme: the Story So Far
AHRC Digital Transformations theme: the Story So FarAHRC Digital Transformations theme: the Story So Far
AHRC Digital Transformations theme: the Story So Far
 
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
 


Experiences in the Development of Geographical Ontologies and Linked Data

  • 1. Experiences in the Development of Geographical Ontologies and Linked Data. OntoGeo Workshop, Toulouse, 18 November 2010. Oscar Corcho, Luis Manuel Vilches Blázquez, José Angel Ramos Gargantilla {ocorcho,lmvilches,jramos}@fi.upm.es. Ontology Engineering Group, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid. Credits: Asunción Gómez-Pérez, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Miguel Angel García, Juan Sequeda and many others. Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0
  • 2. Structure of my Talk • Why did we start developing Geographical Ontologies? • Methodological guidelines for ontology development • The NeOn Methodology • The development process for Hydrontology • The development process for PhenomenOntology • Why did we start developing Geographical Linked Data? • Methodological guidelines for Linked Data generation • Ontology and Linked Data usage in http://geo.linkeddata.es/
  • 4. Our main goal: Data Integration • Sources: CG, NGG, BCN200, BCN25 • Ontologies: PhenomenOntology, hydrOntology • Step 1: Building PhenomenOntology • Step 2: Mappings between the catalogues and the ontology
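
The two steps on this slide can be pictured with a minimal sketch: each catalogue keeps its own feature types, and a mapping table relates them to a shared ontology concept. The catalogue names come from the slide; the feature types, concept names and function names below are hypothetical and only illustrate the idea, not the mappings actually built by the authors.

```python
# Hypothetical mapping table: (catalogue, local feature type) -> ontology concept.
# Catalogue names come from the slide; feature types and concepts are invented.
MAPPINGS = {
    ("BCN25", "Río"): "River",
    ("BCN200", "Corriente fluvial"): "River",
    ("NGG", "Embalse"): "Reservoir",
}


def to_concept(catalogue, feature_type):
    """Return the shared ontology concept for a catalogue-specific type, or None."""
    return MAPPINGS.get((catalogue, feature_type))


def integrate(records):
    """Group (catalogue, feature_type, name) records under their ontology concept.

    Records whose feature type has no mapping are skipped.
    """
    grouped = {}
    for catalogue, feature_type, name in records:
        concept = to_concept(catalogue, feature_type)
        if concept is not None:
            grouped.setdefault(concept, []).append((catalogue, name))
    return grouped
```

With such a table, records that use different local vocabularies end up under the same concept, which is the integration effect the mappings are meant to achieve.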
  • 5. Why ontologies? Geographical Information Context • Great variety of sources • Nearly 20 different producers in Spain (national and local cartographic institutions with different interests) • Various degrees of quality and structuring of information • Natural language ambiguity • Synonymy, polysemy and hyperonymy • Scale factor
  • 6. Different producers have different vocabularies
  • 7. Why ontologies? Geographical Information Context • Great variety of sources • Various degrees of quality and structuring of information • ICC has 49 types of features in total • IGN has 40 types of features in the hydrographic domain alone • Natural language ambiguity • Synonymy, polysemy and hyperonymy • Scale factor
  • 8. Feature Catalogues Base Cartográfica N. (BCN200) Base Cartográfica N. (BCN25)
  • 9. Why ontologies? Geographical Information Context • Great variety of sources • Various degrees of quality and structuring of information • Natural language ambiguity • Synonymy: different words with the same meaning » riverside, river bank • Polysemy: the same word with different meanings » bank: financial institution » bank: rely upon (trust) • Hyperonymy: the meaning of one word includes that of another » bank and Morgan Bank • Scale factor
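
As a small illustration of the ambiguity described on this slide, the sketch below flags synonymy (several terms sharing one meaning) and polysemy (one term carrying several meanings) in a list of term usages. The terms follow the slide's examples; the meaning labels, the data structure and the function names are assumptions made for the example.

```python
from collections import defaultdict

# (term, meaning) pairs loosely following the slide's examples.
# The meaning labels are assumptions written for this illustration.
USAGES = [
    ("riverside", "land alongside a river"),
    ("river bank", "land alongside a river"),
    ("bank", "financial institution"),
    ("bank", "rely upon"),
]


def synonyms(usages):
    """Return meanings expressed by more than one term (synonymy)."""
    by_meaning = defaultdict(set)
    for term, meaning in usages:
        by_meaning[meaning].add(term)
    return {m: sorted(terms) for m, terms in by_meaning.items() if len(terms) > 1}


def polysemes(usages):
    """Return terms that carry more than one meaning (polysemy)."""
    by_term = defaultdict(set)
    for term, meaning in usages:
        by_term[term].add(meaning)
    return {t: sorted(ms) for t, ms in by_term.items() if len(ms) > 1}
```

An ontology plays the role of the "meaning" column here: once every producer's term is tied to an explicit concept, both kinds of ambiguity become visible and can be resolved.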
  • 10. Why ontologies? Geographical Information Context • Great variety of sources • Various degrees of quality and structuring of information • Natural language ambiguity • Synonymy, polysemy and hyperonymy • Scale factor • E.g., one village may be represented as a point X,Y or as an area XN,YN • This can act as a filter for geographical information • Different scales normally present different features • Generalisation processes are normally a problem, due to the difficulties in finding “feature overlaps” in different feature catalogues
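
The point-versus-area example can be sketched as follows: the same village is returned as an area at detailed scales and collapsed to a single point at coarser ones. The `threshold` value, the type names and the vertex-average centroid are assumptions chosen for illustration, not a description of any producer's generalisation rules.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]


@dataclass
class Point:
    xy: Coord


@dataclass
class Area:
    ring: List[Coord]


def centroid(ring):
    """Average of the vertices; a simplification, not a true polygon centroid."""
    n = len(ring)
    return (sum(x for x, _ in ring) / n, sum(y for _, y in ring) / n)


def represent(ring, scale_denominator, threshold=100_000):
    """Return an Area for detailed scales and a Point for coarser ones.

    scale_denominator is the N in a 1:N map scale; threshold is a hypothetical
    cut-off chosen only for this example.
    """
    if scale_denominator <= threshold:
        return Area(ring)
    return Point(centroid(ring))
```

For instance, `represent(ring, 25_000)` keeps the outline while `represent(ring, 200_000)` yields a single point, so two catalogues at different scales can describe one village with different geometry types.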
  • 12. [Figure: NeOn Scenarios. Core activities: O. Specification, O. Conceptualization, O. Formalization, O. Implementation (RDF(S), OWL, Flogic) (1). Knowledge resources: Non Ontological Resources (thesauri, dictionaries, glossaries, lexicons, taxonomies, classification schemas) with Non Ontological Resource Reuse and Reengineering (2); Ontological Resources in O. Repositories and Registries with Ontological Resource Reuse (3) and Reengineering (4); O. Aligning, O. Merging and Alignments; O. Design Patterns with Ontology Design Pattern Reuse (7). Ontology Restructuring: Pruning, Extension, Specialization, Modularization (8). O. Localization (9). Ontology Support Activities for scenarios 1 to 9: Knowledge Acquisition (Elicitation); Documentation; Configuration Management; Evaluation (V&V); Assessment.]
  • 13. NeOn Scenarios 1. Building ontology networks from scratch without reusing existing resources. 2. Building ontology networks by reusing and reengineering non ontological resources. 3. Building ontology networks by reusing ontologies or ontology modules. 4. Building ontology networks by reusing and reengineering ontologies or ontology modules. 5. Building ontology networks by reusing and merging ontologies or ontology modules. 6. Building ontology networks by reusing, merging and reengineering ontologies or ontology modules. 7. Building ontology networks by reusing ontology design patterns. 8. Building ontology networks by restructuring ontologies or ontology modules. 9. Building ontology networks by localizing ontologies or ontology modules.
  • 14. NeOn Methodology. Processes and activities covered: • Ontology Specification • Scheduling • Non Ontological Resource Reuse • Non Ontological Resource Reengineering • Reuse General Ontologies • Reuse Domain Ontologies • Reuse Ontology Statements • Reuse Ontology Design Patterns. All processes and activities are described with: • A filling card • A workflow • Examples
  • 16. Hydrontology Development NeOn Methodology for Building Ontology Networks: Specification, Scheduling and Reuse María del Carmen Suárez de Figueroa Baonza
  • 17. INSPIRE as a context for hydrOntology • One of the aims of INSPIRE is to harmonise geographical information sources in order to support the formulation, implementation and evaluation of EU policies (e.g., environmental management). • Geographical information sources: databases from EU Member States at local, regional, national and international levels. Luis Manuel Vilches Blázquez
  • 18. INSPIRE - Annexes Luis Manuel Vilches Blázquez
  • 19. Information Sources [diagram labels: GEMET; Feature Catalogues; BCN25; BCN200; EGM & ERM; CC.AA.; Nomenclátor Geográfico Nacional; Thesauri and Bibliography; WFD; Nomenclátor Conciso; Dictionaries and Monographs; FTT; ADL; Getty] Luis Manuel Vilches Blázquez
  • 20. • Glossary of hydrOntology terms, built from: • Feature catalogues of the Numerical Cartographic Database (1:25,000; 1:200,000; 1:1,000,000) • Feature catalogues from other local producers • EuroGlobalMap & EuroRegionalMap • Water Framework Directive • Alexandria Digital Library, Dewey • Thesauri (UNESCO, GEMET, Getty Thesaurus of Geographic Names, etc.) • National Geographic Gazetteer • Bibliography (dictionaries, water, law, etc.) • This glossary contains more than 120 concepts
  • 21. Criteria for structuring • Abstract concepts from: • Water Framework Directive • Proposed by the EU Parliament and EU Council • List of definitions of hydrographic phenomena • Part of the model from: • SDIGER Project • INSPIRE pilot project • Two river basins, two countries, two languages • Several semantic criteria from: • WordNet • Encyclopaedia Britannica • Diccionario de la Real Academia de la Lengua • Wikipedia • Several domain references • Inheritance: from various existing catalogues • Meetings with domain experts from IGN-E
  • 22. Ontology Development: Ontology Specification (diagram of the ontology network). hydrOntology: hydrographical phenomena (rivers, lakes, etc.). FAO Geopolitical ontology: names and international code systems for territories and groups. WGS84 Geo Positioning (W3C vocabulary): an RDF vocabulary. GML: ontology for the OGC Geography Markup Language (GML specification). SCOVO (statistics ontology): scv:Dimension, scv:Item, scv:Dataset. W3C Time (time ontology): vocabulary for instants, intervals, durations, etc. Other sources shown: Thesaurus UNESCO, EGM / ERM, GeoNames, … Relations shown: hasStatisticalData, hasLat/Long, hasGeometry, hasLocation/isLocated
  • 23. Modelling the hydrology domain. Upper level / Lower level. 150+ classes, 47 object properties, 64 data properties and 256 axioms.
  • 25. Phenomenontology Development NeOn Methodology for Building Ontology Networks: Specification, Scheduling and Reuse María del Carmen Suárez de Figueroa Baonza
  • 26. Knowledge Bases: Conciso Gazetteer; National Geographic Gazetteer; Numerical Cartographic Database (BCN200); Numerical Cartographic Database (BCN25)
  • 27. Knowledge Bases • The National Geographic Gazetteer has 14 item types and 460,000 toponyms (Spanish, Galician, Basque, Catalan, and Aranese). • The Conciso Gazetteer, which follows the recommendations of the United Nations Conferences on the standardisation of geographical names, has 17 item types and 3,667 toponyms. • A gazetteer is a directory of instances of a class or classes of features that contains some information regarding position (ISO 19112).
  • 28. Knowledge Bases • BCN25 was designed as a product derived from the National Topographic Map, and was built to provide cartographic information that complies with the data specifications required for use inside GIS. • BCN200 was developed by digitising analogue provincial maps. • Information is structured in 8 topics (Administrative boundaries, Relief, Hydrography, Vegetation and so on). • A feature catalogue presents the abstraction of reality, represented in one or more sets of geographic data, as a defined classification of phenomena (ISO 19110).
  • 29. BCN25 details. Catalogue columns: - Group: 0 = unfixed, 1 = road, ... - Code: three pairs of digits (XXYYZZ), e.g. 060101 = 06 Transportation, 01 Roads, 01 Highway. Axis - Name: e.g. "Highway. Axis", "Highway under construction. Axis", ...
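The six-digit code column can be split mechanically into its three digit pairs. The following is a minimal sketch, assuming the pairs are read as topic, group and subgroup (the names used on the taxonomy-criteria slide); `split_feature_code` and `ancestor_codes` are hypothetical helper names, not part of any IGN-E tool.

```python
def split_feature_code(code: str) -> tuple[str, str, str]:
    """Split a six-digit catalogue code XXYYZZ into its three digit pairs."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected six digits, got {code!r}")
    return code[0:2], code[2:4], code[4:6]


def ancestor_codes(code: str) -> list[str]:
    """Return the code prefixes identifying each level, from topic down to the full code."""
    topic, group, subgroup = split_feature_code(code)
    return [topic, topic + group, topic + group + subgroup]


if __name__ == "__main__":
    # 060101 is the example code from the slide (Transportation / Roads / Highway. Axis).
    print(split_feature_code("060101"))
    print(ancestor_codes("060101"))
```

The prefixes returned by `ancestor_codes` could serve as keys for the superclass chain of a feature when a taxonomy is derived from the catalogue.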
  • 30. Bottom-up process: PhenomenOntology • Automatic ontology building from BCN25/BTN25 • Automatic checking of linguistic differences (linsearch): plurals, punctuation marks, capital letters and Spanish orthographic signs • Curation process by IGN-E domain experts
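The kind of check listed above (capital letters, punctuation marks, accents) can be illustrated with a small normalisation routine. This is only a sketch of the general idea and not the linsearch tool itself, whose implementation is not described in the slides; the function names are hypothetical, plural handling is left out, and stripping accents also folds "ñ" into "n", which may be too aggressive for real catalogue data.

```python
import unicodedata


def normalise_label(label: str) -> str:
    """Fold case, strip accent marks and drop punctuation so that variants compare equal."""
    decomposed = unicodedata.normalize("NFD", label)
    kept = [
        ch
        for ch in decomposed
        if not unicodedata.combining(ch)  # drop combining accent marks
        and not unicodedata.category(ch).startswith("P")  # drop punctuation
    ]
    # Collapse runs of whitespace left behind by removed characters.
    return " ".join("".join(kept).casefold().split())


def group_variants(labels):
    """Group the original labels by their normalised form."""
    groups = {}
    for label in labels:
        groups.setdefault(normalise_label(label), []).append(label)
    return groups


if __name__ == "__main__":
    # Feature-name variants of the kind found in the BCN25 catalogue.
    print(group_variants(["Autovía", "AUTOVIA", "Autovia", "Autovía-"]))
```

Groups with more than one member are candidates for a single class with one preferred label, to be confirmed by a domain expert.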
  • 31. Criteria for taxonomy creation • Group (Road, Hydrographic...) • Code column • Topic: first pair of digits (03 in 030501) • Group: second pair (05 in 030501) • Subgroup: third pair (01 in 030501) • Common lexical parts • Highway with 2 lines • Highway with 3 lines • Highway under construction • Highway (superclass) • Lexical heterogeneity in feature names (“Autovía”, “AUTOVIA”, “Autovia”, “Autovía-”) Numerical Cartographic Database (BCN25)
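The "common lexical parts" criterion can be sketched as finding the leading words shared by a set of feature names and proposing them as a superclass. This is an illustrative assumption about how such a criterion might be automated, not the procedure actually used to build PhenomenOntology; `common_head` is a hypothetical name.

```python
def common_head(names):
    """Return the longest run of leading words shared by all names ('' if none)."""
    token_lists = [name.split() for name in names]
    if not token_lists:
        return ""
    head = []
    # zip stops at the shortest name, so the head never exceeds any name's length.
    for words in zip(*token_lists):
        if len(set(words)) != 1:
            break
        head.append(words[0])
    return " ".join(head)


if __name__ == "__main__":
    names = [
        "Highway with 2 lines",
        "Highway with 3 lines",
        "Highway under construction",
    ]
    print(common_head(names))
```

In practice the names would first be normalised (case, accents, punctuation) so that purely lexical variants do not prevent a shared head from being detected.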
  • 32. BCN25 → BTN25 Base Cartográfica N. (BCN25)
  • 33. BCN25 → PhenomenOntology v3.5 03 ¿? - Componente de río • Eje • Margen • Eje conexión - Régimen • Permanente • No permanente - Categoría del río • Desconocida • Primera • Segunda • Tercera • Cuarta - Componente del cauce artificial • Eje • Margen • Eje conexión - Situación • Desconocido • Subterráneo • Superficial • Elevado 0301 Río 0304 Cauce artificial
  • 34. • Homogenising URIs and labels • Exploiting “type” hierarchies • Reducing unnecessary attributes • Incorporating BTN25 definitions as rdfs:comment annotations Ontology curation
  • 35. Homogenising URIs and labels - Meaningless labels from the first level in the hierarchy
  • 36. Homogenising URIs and labels - All class and property names in lowercase
  • 37. Homogenising URIs and labels - Spaces and accents in URIs
  • 38. Exploiting “type” hierarchies. The attribute “type” normally corresponds to additional taxonomies
  • 39. Reducing unnecessary/redundant attributes
  • 41. Some statistics (from BCN25 to BTN25): PhenomenOntology 4.0, PhenomenOntology 3.6
  • 43. • Generic ontology development methodologies can be applied with some success • hydrOntology took approximately 6 person-months in total • Initially done by a domain expert after very basic initial training • Ontology debugging was extremely difficult and has provided interesting results in this area • Top-down vs bottom-up approaches • A large curation process is still needed in bottom-up approaches, which may argue against following them (research on this is ongoing) • Bottom-up approaches yield more lightweight ontologies, although these are easier to relate to the underlying catalogues • Next steps: relating them to upper-level ontologies (e.g., DOLCE) and modularising to improve reusability Some conclusions in ontology development
  • 45. What is the Web of Linked Data? • An extension of the current Web… • … where information and services are given well-defined and explicitly represented meaning, … • … so that it can be shared and used by humans and machines, ... • ... better enabling them to work in cooperation • How? • Promoting information exchange by tagging web content with machine-processable descriptions of its meaning • With technologies and infrastructure to do this • With clear principles on how to publish data
  • 46. What is Linked Data? • Linked Data is a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF. • Part of the Semantic Web • Exposing, sharing and connecting data • Technologies: URIs and RDF (although others are also important)
  • 47. The four principles (Tim Berners-Lee, 2006) 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL). 4. Include links to other URIs, so that they can discover more things. • http://www.w3.org/DesignIssues/LinkedData.html • http://www.ted.com/talks/tim_berners_lee_on_the_next_web.htm
  • 48. Linked Open Data evolution: 2007, 2008, 2009
  • 50. Linked Open Data Evolution
  • 51. How should we publish data? • Formats in which data is published nowadays… • XML • HTML • DBs • APIs • CSV • XLS • … • However, these have major limitations from a Web of Data point of view • Difficult to integrate • Data items are not linked to each other the way Web documents are.
  • 52. How do we publish Linked Data? 1. Exposing relational databases or other similar formats as Linked Data • D2R • Triplify • R2O • NOR2O • Virtuoso • Ultrawrap • … 2. Using native RDF triplestores • Sesame • Jena • Owlim • Talis platform • … 3. Incorporating it in the form of RDFa in CMSs like Drupal
  • 53. How do we consume Linked Data? • Linked Data browsers • To explore things and datasets and to navigate between them. • Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE), OpenLink RDF Browser (OpenLink, UK), Zitgist RDF Browser (Zitgist, USA), Disco Hyperdata Browser (FU Berlin, DE), Fenfire (DERI, Ireland) • Linked Data mashups • Sites that mash up (i.e. combine) Linked Data • Revyu.com (KMI, UK), DBtune Slashfacet (Queen Mary, UK), DBpedia Mobile (FU Berlin, DE), Semantic Web Pipes (DERI, Ireland) • Search engines • To search for Linked Data. • Falcons (IWS, China), Sindice (DERI, Ireland), MicroSearch (Yahoo, Spain), Watson (Open University, UK), SWSE (DERI, Ireland), Swoogle (UMBC, USA) Listing on this slide by T. Heath, M. Hausenblas, C. Bizer, R. Cyganiak, O. Hartig
  • 54. One additional motivation: Open Government • Government and state administration should be open at all levels to effective public scrutiny and oversight • Objectives: • Transparency • Participation • Collaboration • Inclusion • Cost reduction • Interoperability • Reusability • Leadership • Market & Value • Some links: • B. Obama - Transparency and Open Government • T. Berners-Lee - Raw data now! • J. Manuel Alonso - ¿Qué es Open Data? • Open Government Data • 8 Principles of Open Government Data
  • 55. Open Government. USA and UK
  • 56. Linked Data Mashup (data.gov) • Clean Air Status and Trends (CASTNET) • http://data-gov.tw.rpi.edu/demo/exhibit/demo-8-castnet.php
  • 58. GeoLinkedData • It is an open initiative whose aim is to enrich the Web of Data with Spanish geospatial data. • This initiative has started off by publishing data from diverse information sources, such as the National Geographic Institute of Spain (IGN-E) and the National Statistics Institute (INE) • http://geo.linkeddata.es
  • 59. Motivation: the Web of Data is mainly for English speakers; poor presence of Spanish » 99.171 % English » 0.019 % Spanish. Source: Billion Triples dataset at http://km.aifb.kit.edu/projects/btc-2010/ (thanks to Aidan and Richard)
  • 61. Impact of geo.linkeddata.es • Number of triples in Spanish (July 2010): 1,412,248 • Number of triples in Spanish (September 2010): 21,463,088 • Language shares (%) before geo.linkeddata.es: en 99.1712875; ja 0.463849377; fr 0.05447229; de 0.034225134; pl 0.02532934; it 0.021982542; es 0.019584648 • Language shares (%) after geo.linkeddata.es: en 94.18744941; es 5.044085342; ja 0.440538697; fr 0.051734793; de 0.032505155; pl 0.024056418; it 0.020877812
  • 62. Process for Publishing Linked Data on the Web: Identification of the data sources; Vocabulary development; Generation of the RDF data; Publication of the RDF data; Linking the RDF data; Data cleansing; Enable effective discovery
  • 63. 1. Identification and selection of the data sources: Instituto Geográfico Nacional. Basque, Catalan, Galician, Spanish
  • 64. 1. Identification and selection of the data sources: Instituto Nacional de Estadística. Province, Year
  • 65. 2. Vocabulary development http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#whichvocabs
  • 66. 2. Vocabulary development • Features • Lightweight: • Taxonomies and a few properties • Agreed (consensus) vocabularies • To avoid mapping problems • Multilingual • Linked Data is multilingual • The NeOn methodology can help to • Re-engineer non-ontological resources into ontologies • Pros: use domain terminology already agreed on by domain experts • Remove from heavyweight ontologies those features that you don’t need • Reuse existing vocabularies
  • 67. Vocabulary development: Specification • Content requirements: identify the set of questions that the ontology should answer • Which are the provinces of Spain? • Where are the beaches? • Where are the reservoirs? • Identify the production index in Madrid • Which city has the highest production index? • Give me the latitude and altitude of Madrid • … • Non-content requirements • The ontology must be in the four official Spanish languages
  • 68. 2. Vocabulary development: HydrOntology
  • 69. 3. Generation of RDF • From the data sources • Geographic information (databases) • Statistical information (spreadsheets) • Geospatial information • Different technologies for RDF generation • Reengineering patterns • R2O and ODEMapster • Geometry generation
  • 70. 3. Generation of the RDF Data: INE → NOR2O; IGN → ODEMapster; IGN geospatial column → Geometry2RDF
  • 71. 3. Generation of the RDF Data • Preliminaries • Select appropriate URIs • Difficulties • Cumbersome URIs in Spanish • http://geo.linkeddata.es/ontology/Río • RDF allows UTF-8 characters in URIs • But Linked Data URIs have to be URLs as well • So non-US-ASCII characters have to be percent-encoded • http://geo.linkeddata.es/ontology/R%C3%ADo
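The percent-encoding step described on this slide can be reproduced with Python's standard library. This is a minimal sketch: the base URI is the one shown on the slide, while `ontology_uri` is a hypothetical helper name and not part of the GeoLinkedData tooling.

```python
from urllib.parse import quote, unquote

BASE = "http://geo.linkeddata.es/ontology/"


def ontology_uri(local_name: str) -> str:
    """Percent-encode the UTF-8 bytes of a local name and append it to BASE."""
    # safe="" makes quote() encode every reserved character, including "/".
    return BASE + quote(local_name, safe="")


if __name__ == "__main__":
    uri = ontology_uri("Río")
    print(uri)                       # http://geo.linkeddata.es/ontology/R%C3%ADo
    print(unquote(uri[len(BASE):]))  # Río
```

The same call also takes care of spaces in labels, which are encoded as `%20`.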
  • 72. 3. Generation of the RDF Data / instances • NOR2O is a software library that implements the transformations proposed by the Patterns for Re-engineering Non-Ontological Resources (PR-NOR). Currently we have 16 PR-NORs. • PR-NORs define a procedure that transforms the components of a Non-Ontological Resource (NOR) into ontology elements. http://ontologydesignpatterns.org/ • Resource types handled by NOR2O: classification schemes, thesauri, lexicons • Example: FAO Water classification (a classification scheme)
  • 74. 3. Generation of the RDF Data – NOR2O. Industry Production Index: Province, Year
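To make the idea of re-engineering a classification scheme concrete, the sketch below turns (item, parent) rows into class declarations and rdfs:subClassOf statements. It is not NOR2O and does not reproduce any specific PR-NOR; it assumes the scheme is stored as an adjacency list and that its hierarchy is to be read as subclass-of. The sample rows and the `http://example.org/scheme/` base are invented for illustration.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"


def scheme_to_triples(rows, base):
    """Turn (item, parent) rows into class declarations and subClassOf triples."""
    def uri(name):
        # Replace spaces and percent-encode anything outside the unreserved set.
        return base + quote(name.replace(" ", "_"), safe="")

    triples = []
    for item, parent in rows:
        triples.append((uri(item), RDF_TYPE, OWL_CLASS))
        if parent is not None:
            triples.append((uri(item), RDFS_SUBCLASS, uri(parent)))
    return triples


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples text."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    # Invented sample rows: (item, parent); None marks a top-level item.
    rows = [("Water body", None), ("River", "Water body")]
    print(to_ntriples(scheme_to_triples(rows, "http://example.org/scheme/")))
```

Whether a scheme's hierarchy really denotes subclass-of (rather than, say, part-of) is a modelling decision that has to be taken per resource.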
  • 75. 3. Generation of the RDF Data – R2O & ODEMapster • Creation and execution of R2O Mappings • Check out at http://www.neon-toolkit.org/
  • 76. 3. Generation of the RDF Data
  • 77. 3. Generation of the RDF Data – Geometry2RDF. Oracle SDO_UTIL package: SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry)) AS Gml311Geometry FROM "BCN200"."BCN200_0301L_RIO" c WHERE c.Etiqueta='Arroyo'
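Once the database returns a GML fragment, its coordinates still have to be turned into RDF. The sketch below shows one way this could look, using only the standard library; it is not the Geometry2RDF implementation. It assumes a 2D `gml:posList` in longitude-latitude order, uses the W3C WGS84 `lat`/`long` properties, and invents both the `/point/N` URI pattern and the `http://example.org/` feature URI.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"


def linestring_points(gml: str):
    """Extract (x, y) pairs from the first gml:posList found in the fragment."""
    root = ET.fromstring(gml)
    pos_list = next(root.iter(f"{{{GML_NS}}}posList"), None)
    if pos_list is None or not pos_list.text:
        return []
    values = [float(v) for v in pos_list.text.split()]
    # Assumption: two values per position (no elevation).
    return list(zip(values[0::2], values[1::2]))


def points_to_ntriples(feature_uri: str, points):
    """Emit one WGS84 lat/long pair per point as N-Triples lines."""
    lines = []
    for i, (lon, lat) in enumerate(points, start=1):
        point_uri = f"{feature_uri}/point/{i}"  # hypothetical URI pattern
        lines.append(f'<{point_uri}> <{WGS84}lat> "{lat}" .')
        lines.append(f'<{point_uri}> <{WGS84}long> "{lon}" .')
    return lines


if __name__ == "__main__":
    sample = (
        '<gml:LineString xmlns:gml="http://www.opengis.net/gml">'
        "<gml:posList>-3.7 40.4 -3.6 40.5</gml:posList>"
        "</gml:LineString>"
    )
    for line in points_to_ntriples("http://example.org/river/1", linestring_points(sample)):
        print(line)
```

A complete conversion would also link each point to its feature and preserve point order; both are omitted here to keep the sketch short.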
  • 78. 3. Generation of the RDF Data – Geometry2RDF
  • 79. 3. Generation of the RDF Data – Geometry2RDF
  • 80. 3. Generation of the RDF data – RDF graphs • IGN, INE • So far: 7 RDF named graphs (BTN25, BCN200, IPI, …) • http://geo.linkeddata.es/dataset/IGN/BTN25 • http://geo.linkeddata.es/dataset/IGN/BCN200 • http://geo.linkeddata.es/dataset/INE/IPI
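Keeping each source in its own named graph amounts to attaching a graph URI to every triple. A minimal sketch in N-Quads form follows; the graph URI is one of those listed on the slide, while the subject, predicate and object URIs are invented placeholders and `to_nquads` is a hypothetical helper that only handles URI objects.

```python
def to_nquads(triples, graph_uri):
    """Serialise (subject, predicate, object) URI triples as N-Quads lines."""
    return [f"<{s}> <{p}> <{o}> <{graph_uri}> ." for s, p, o in triples]


if __name__ == "__main__":
    graph = "http://geo.linkeddata.es/dataset/IGN/BCN200"
    triples = [
        # Placeholder triple; real data would use the published resource URIs.
        ("http://example.org/river/1",
         "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
         "http://example.org/ontology/River"),
    ]
    for line in to_nquads(triples, graph):
        print(line)
```

With the graph recorded per statement, a SPARQL query can restrict itself to one source (for example only BCN200) through a `GRAPH` clause.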
  • 81. 4. Publication of the RDF Data: Virtuoso 6.1.0 (SPARQL) and Pubby 0.3 (Linked Data, HTML), including provenance support
  • 82. 4. Publication of the RDF Data
  • 83. 4. Publication of the RDF Data - License • Data Licenses • Official license as published in the Spanish official journal (BOE - Boletín Oficial del Estado) • Creative Commons options • GNU Free Documentation License • Each dataset has its own specific license • IGN • INE
  • 84. 5. Data cleansing • Lack of documentation of the IGN datasets • Broken links: Spain, IGN resources • Lack of documentation of the ontology • Missing English and Spanish labels • Options for building a Spanish ontology that imports some concepts from another ontology (in English): • Import the English ontology and add annotations, such as a Spanish label, to its terms. • Import the English ontology, create new concepts and properties with Spanish names, and map those to the English equivalents. • Re-declare the terms of the English ontology that we need (using the same URIs as in the English ontology) and add a Spanish label. • Create your own classes and properties that model the same things as the English ontology.
  • 85. 6. Linking of the RDF Data • Silk - A Link Discovery Framework for the Web of Data • First set of links: Provinces of Spain • 86% accuracy • Datasets: GeoLinkedData, DBpedia, GeoNames
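The principle behind link discovery can be illustrated with a label-similarity matcher. The sketch below is not Silk and does not use its link specification language; it swaps in Python's `difflib` for the similarity measure, and the URIs, labels and the 0.9 threshold are all invented for illustration.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two labels, in [0, 1]."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def candidate_links(source, target, threshold=0.9):
    """For each source resource, propose its best-matching target if similar enough."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, OWL_SAME_AS, best_uri))
    return links


if __name__ == "__main__":
    # Invented sample data: URI -> label.
    source = {"http://example.org/a/Granada": "Granada",
              "http://example.org/a/Madrid": "Madrid"}
    target = {"http://example.org/b/granada": "granada",
              "http://example.org/b/Sevilla": "Sevilla"}
    for link in candidate_links(source, target):
        print(link)
```

Label matching alone produces false positives, for example a province and a city with the same name, so candidate links still need validation before they are published.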
  • 86. 6. Linking of the RDF Data • http://geo.linkeddata.es/page/Provincia/Granada
  • 87. 7. Enable effective discovery
  • 90. Industry Production Index – Capital of Province
  • 93. Future Work • Generate more datasets from other domains, e.g. universities in Spain. • Identify more links to DBPedia and Geonames. • Cover complex geometrical information, i.e. not only Point and LineString-like data; we will also treat information representation through polygons.
  • 95. • Reusable ontologies available for the community • Well-founded and well-documented • Now working on multilinguality/multiculturality issues • Work continues on understanding how to provide debugging tools for domain experts • Reusable tools for geospatial Linked Data generation • There is still a lack of understanding of how much benefit we can get from Linked Geographical Data • Benefits of linking seem to be clear • But geo-processing is still unsolved in RDF, as is geometry representation General conclusions
  • 96. Experiences in the Development of Geographical Ontologies and Linked Data OntoGeo Workshop, Toulouse, 18 November 2010 Oscar Corcho, Luis Manuel Vilches Blázquez, José Angel Ramos Gargantilla {ocorcho,lmvilches,jramos}@fi.upm.es Ontology Engineering Group, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid Credits: Asunción Gómez-Pérez, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Miguel Angel García, Juan Sequeda and many others Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0