Discovering Concept Coverings in
Ontologies of Linked Data Sources
Best Paper: Research Track – The 11th International Semantic
Web Conference, 2012
Rahul Parundekar, Craig A. Knoblock and Jose-Luis Ambite
{parundek,knoblock}@usc.edu, ambite@isi.edu

University of Southern California
MOTIVATION
Data Integration on the Web
Goal: Integrate and Query Data across the Web
Two challenges in data integration:
• Object-level (aka record linkage): determine when objects at different
sites are the same real-world object
• Linked Data movement addresses object-level integration
• objects have unique URIs,
• equivalent objects linked with owl:sameAs statements

• Schema-level (aka schema/ontology mapping): align the
semantics of different sources
• Define mappings to relate source schemas to common schema
• Use schema mappings to query across multiple sources

Our work: Exploit linked data to automatically
discover expressive schema/ontology mappings
The Web of Linked Data
integrates data at the object level

Example (Geospatial Domain): "Los Angeles" and "City of Los Angeles"
Equivalent instances in different sources
are connected with owl:sameAs
[Figure: Source 1 and Source 2. Ontology Level: Populated Place, City. Instance Level: Los Angeles owl:sameAs City of Los Angeles.]
Links are absent at the ontology level
Only 15 out of the 190 ontologies are connected
[Figure: Source 1 and Source 2. Ontology Level: Populated Place, City (NO LINKS!!). Instance Level: Los Angeles owl:sameAs City of Los Angeles.]
Alignments are necessary for
interoperability of the sources
We need to find links at the Ontology Level
[Figure: Source 1 and Source 2. Ontology Level: Populated Place, City (NO LINKS!!). Instance Level: Los Angeles owl:sameAs City of Los Angeles.]
How can we find ontology alignments?
[Figure: Source 1 and Source 2. Ontology Level: Populated Place = City. Instance Level: Los Angeles owl:sameAs City of Los Angeles.]
DISCOVERING ALIGNMENTS IN
ONTOLOGIES OF LINKED DATA
Use an extensional approach to align concepts

Represents set of instances belonging to ClassA
Represents set of instances belonging to ClassB
ClassA is disjoint from ClassB

ClassA is equivalent to ClassB

ClassA is subset of ClassB

ClassB is subset of ClassA
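The four cases above can be sketched with ordinary set operations. This is a minimal illustration on exact sets, not the authors' implementation; the function name, the sample instance names and the extra "partial overlap" case are assumptions added for the example.

```python
def extensional_relation(class_a: set, class_b: set) -> str:
    """Name the set relation between the instance sets of two classes."""
    if class_a == class_b:
        return "equivalent"
    if class_a.isdisjoint(class_b):
        return "disjoint"
    if class_a < class_b:  # proper subset
        return "ClassA subset of ClassB"
    if class_a > class_b:  # proper superset
        return "ClassB subset of ClassA"
    return "partial overlap"  # not one of the four cases on the slide


# Hypothetical instance sets, for illustration only.
cities = {"Los Angeles", "New York", "Dublin"}
places = {"Los Angeles", "New York", "Dublin", "Lake Tahoe"}
print(extensional_relation(cities, places))
```

Real linked data is noisy, so exact equality and containment are rarely observed; the threshold-based comparison later in the deck relaxes these tests.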
Align concepts when supported by evidence at
the instance level
[Figure: Source 1 and Source 2. Ontology Level: Populated Place = City. Instance Level (linked pairs): New York / NYC, City of Los Angeles / Los Angeles, City of Dublin / Dublin, Kuala Lumpur / Kolumpo, … and more.]
However, ontologies of many sources are
rudimentary
                  DBpedia Ontology                       GeoNames Ontology
Description       Semantic Web version of Wikipedia.     Geographical Database
                  For example: Places, People, Music,
                  Topics, etc.
# of Instances    3.77 million                           8 million
                                                         (9 feature classes, 645 feature codes)
# of Classes      359 (well-defined hierarchy)           1 (rdf:type=Feature)
# of Properties   1775                                   29
                  Rich, Descriptive Ontology             Impoverished Ontology

Finding Alignments is Non-Trivial
Create NEW concepts by restricting values of
properties

Set of all instances in
GeoNames

Set of all instances with
featureClass=P

Atomic Restriction Classes*

Set of all instances in
DBpedia

Set of all instances with
rdf:type=PopulatedPlace

* Value Restrictions in OWL-DL
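As a rough sketch of the idea, an atomic restriction class can be modelled as the set of instances whose property has a given value. The data layout (one value per property), the function name and the sample URIs below are assumptions made for illustration; they are not taken from the original system.

```python
from typing import Dict, Set

# Hypothetical in-memory layout: instance URI -> {property: value}.
Instances = Dict[str, Dict[str, str]]


def restriction_class(instances: Instances, prop: str, value: str) -> Set[str]:
    """Return the instances whose property `prop` has the value `value`."""
    return {uri for uri, props in instances.items() if props.get(prop) == value}


# Toy GeoNames-like records, invented for this example.
geonames: Instances = {
    "gn:LosAngeles": {"featureClass": "P", "countryCode": "US"},
    "gn:Dublin": {"featureClass": "P", "countryCode": "IE"},
    "gn:LakeTahoe": {"featureClass": "H", "countryCode": "US"},
}

populated = restriction_class(geonames, "featureClass", "P")
print(sorted(populated))
```

In real RDF data a property such as rdf:type can carry several values per instance, so a fuller version would store a set of values per property.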
Comparing Linked Instances Across
Ontologies

$$P = \frac{|\mathit{Img}(r_1) \cap r_2|}{|\mathit{Img}(r_1)|} \qquad R = \frac{|\mathit{Img}(r_1) \cap r_2|}{|r_2|}$$
Align Atomic Restriction Classes by
comparing their overlap
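r1: featureClass=P
r2: rdf:type=PopulatedPlace
featureClass=P = rdf:type=PopulatedPlace when both ratios exceed 0.9 (relaxed from the ideal value of 1):
$$\frac{|\mathit{Img}(r_1) \cap r_2|}{|\mathit{Img}(r_1)|} > 0.9 \qquad \frac{|\mathit{Img}(r_1) \cap r_2|}{|r_2|} > 0.9$$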
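The overlap test can be sketched as follows, assuming Img(r1) is the set of instances in the second source reached from r1 through owl:sameAs links. The function names, the dictionary used for the links and the subset labels are illustrative assumptions; the slides only state the equivalence condition explicitly.

```python
from typing import Dict, Set, Tuple


def precision_recall(r1: Set[str], r2: Set[str],
                     same_as: Dict[str, str]) -> Tuple[float, float, int]:
    """Compute P, R and the overlap size |Img(r1) ∩ r2|."""
    img_r1 = {same_as[x] for x in r1 if x in same_as}  # image of r1 in source 2
    overlap = len(img_r1 & r2)
    p = overlap / len(img_r1) if img_r1 else 0.0
    r = overlap / len(r2) if r2 else 0.0
    return p, r, overlap


def relation(p: float, r: float, threshold: float = 0.9) -> str:
    """Classify the alignment; the subset cases are an assumption of this sketch."""
    if p > threshold and r > threshold:
        return "equivalent"
    if p > threshold:
        return "r1 subset of r2"
    if r > threshold:
        return "r2 subset of r1"
    return "no alignment"


# Toy data: 20 linked pairs gn:i -> db:i, invented for the example.
same_as = {f"gn:{i}": f"db:{i}" for i in range(20)}
r1 = {f"gn:{i}" for i in range(10)}
r2_same = {f"db:{i}" for i in range(10)}
r2_larger = {f"db:{i}" for i in range(20)}

p, r, n = precision_recall(r1, r2_same, same_as)
print(relation(p, r))
p2, r2_recall, n2 = precision_recall(r1, r2_larger, same_as)
print(relation(p2, r2_recall))
```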
Create specialized concepts using
conjunctive restriction classes
For Example: Creating the concept for “Schools in the US”

• Set of all instances in GeoNames
• Set of all instances with countryCode=US, i.e. features in the US
• Set of all instances with featureCode=S.SCH, i.e. Schools
• Set of all instances with featureCode=S.SCH & countryCode=US, i.e. Schools in the US (Conjunctive Restriction Class)
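A conjunctive restriction class can be sketched as the set of instances that satisfy every property=value constraint at once, which equals the intersection of the corresponding atomic classes. The record layout, function name and sample URIs are hypothetical and only serve the illustration.

```python
from typing import Dict, Set


def conjunctive_restriction(instances: Dict[str, Dict[str, str]],
                            constraints: Dict[str, str]) -> Set[str]:
    """Return the instances that satisfy all property=value constraints."""
    return {
        uri for uri, props in instances.items()
        if all(props.get(p) == v for p, v in constraints.items())
    }


# Toy GeoNames-like records, invented for this example.
geonames = {
    "gn:school_us": {"featureCode": "S.SCH", "countryCode": "US"},
    "gn:school_ie": {"featureCode": "S.SCH", "countryCode": "IE"},
    "gn:univ_us": {"featureCode": "S.UNIV", "countryCode": "US"},
}

schools = conjunctive_restriction(geonames, {"featureCode": "S.SCH"})
in_us = conjunctive_restriction(geonames, {"countryCode": "US"})
schools_in_us = conjunctive_restriction(
    geonames, {"featureCode": "S.SCH", "countryCode": "US"}
)
print(sorted(schools_in_us))
```

Because each added constraint can only shrink the set, a top-down exploration can stop refining a class once its extension becomes too small to support an alignment.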
An ordered top-down exploration algorithm to
align Atomic & Conjunctive Restriction Classes
Detect rich alignments even when ontologies are
rudimentary
GeoNames-DBpedia: # Alignments Found with Atomic and Conjunctive Restriction Classes

Relationship   # Alignments
Equivalent     31
Subset         2193

Can we find more
meaningful alignments?
CONSIDER THREE OF THE
SUBSET RELATIONS
PRODUCED…
1) Schools in GeoNames are Educational Institutions in DBpedia
featureCode=S.SCH ⊂ rdf:type=EducationalInstitution
2) Colleges in GeoNames are Educational Institutions in DBpedia
featureCode=S.SCHC ⊂ rdf:type=EducationalInstitution
3) Universities in GeoNames are Educational Institutions in DBpedia
featureCode=S.UNIV ⊂ rdf:type=EducationalInstitution
Using featureCode property as a hint,
create a Union of concepts
[Figure: featureCode=S.SCH, featureCode=S.SCHC and featureCode=S.UNIV shown within rdf:type=EducationalInstitution]
Union: featureCode=S.SCH ∪ featureCode=S.SCHC ∪ featureCode=S.UNIV
Detect a Concept Covering by
extensional comparison
featureCode=S.SCH ∪ featureCode=S.SCHC ∪ featureCode=S.UNIV = rdf:type=EducationalInstitution
Compare the overlap of the extension
sets to determine equivalence
U_S: featureCode={S.SCH, S.SCHC, S.UNIV}
U_L: rdf:type=EducationalInstitution, |U_L| = 404 Educational Institutions
U_A = U_S ∩ U_L
$$\frac{|U_A|}{|U_S|} > 0.9 \quad \text{by definition (ideally } = 1\text{)}$$
$$\frac{|U_A|}{|U_L|} = \frac{396}{404} = 0.98 > 0.9 \quad \text{(ideally } = 1\text{)}$$
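The covering test can be sketched as below. Only the totals 396 and 404 come from the slide; the split of the 396 instances across the three smaller classes, the integer identifiers and the function name are invented for illustration.

```python
from typing import Dict, List, Set


def concept_covering(smaller_classes: List[Set[int]], larger: Set[int],
                     threshold: float = 0.9) -> Dict[str, object]:
    """Check whether the union of smaller classes covers the larger class."""
    u_s = set().union(*smaller_classes)  # U_S: union of the smaller classes
    u_a = u_s & larger                   # U_A = U_S ∩ U_L
    p = len(u_a) / len(u_s) if u_s else 0.0
    r = len(u_a) / len(larger) if larger else 0.0
    return {
        "P": p,
        "R": r,
        "is_covering": p > threshold and r > threshold,
        "uncovered": larger - u_s,       # instances left for outlier inspection
    }


# Synthetic stand-ins: 404 educational institutions, 396 of them covered.
educational_institutions = set(range(404))
schools = set(range(0, 200))         # arbitrary split, not from the slides
colleges = set(range(200, 300))
universities = set(range(300, 396))

result = concept_covering([schools, colleges, universities],
                          educational_institutions)
print(round(result["R"], 2), result["is_covering"], len(result["uncovered"]))
```

The `uncovered` set corresponds to the instances of the larger class that the union misses, which is the natural starting point for outlier inspection.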
DETECTING OUTLIERS
Example: Am I in Spain … or Italy?

• We align dbpedia:country=dbpedia:Spain with
geonames:countryCode=ES
• 3917 out of 3918 instances in GeoNames agree
with this
• ONE instance had its country code as Italy.
• Because this instance contradicts
overwhelming evidence, we can flag it as an
outlier
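One way to sketch this check is to look at the values that linked instances take for the aligned property and flag the few that disagree with an overwhelming majority. The function name, the 0.9 agreement threshold and the instance identifiers are assumptions for this example; only the counts 3917 and 3918 come from the slide.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def flag_outliers(values: Dict[str, str],
                  threshold: float = 0.9) -> Tuple[Optional[str], List[str]]:
    """Return the majority value and the instances that contradict it.

    Instances are flagged only when the majority share exceeds `threshold`.
    """
    if not values:
        return None, []
    majority, count = Counter(values.values()).most_common(1)[0]
    if count / len(values) <= threshold:
        return majority, []  # evidence is not overwhelming; flag nothing
    return majority, sorted(k for k, v in values.items() if v != majority)


# Synthetic stand-in for the 3918 instances linked to dbpedia:country=dbpedia:Spain.
country_codes = {f"db:spanish_place_{i}": "ES" for i in range(3917)}
country_codes["db:odd_place"] = "IT"  # hypothetical identifier for the one mismatch

majority, outliers = flag_outliers(country_codes)
print(majority, outliers)
```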
Concept Covering of Educational Institutions:
What are the other 8 instances?
featureCode={S.SCH, S.SCHC, S.UNIV} = rdf:type=EducationalInstitution
$$\frac{|U_A|}{|U_L|} = \frac{396}{404}$$

• 1 with featureCode=S.HSP (Hospitals)
  • S.HSP has 31 instances in total, which is why Hospitals are not a subset
• 3 with featureCode=S.BLDG (Buildings)
• 1 with featureCode=S.EST (Establishment)
• 1 with featureCode=S.LIBR (Library)
• 1 with featureCode=S.MUS (Museum)
• 1 doesn’t have a featureCode property
RESULTS
Example alignments of
Atomic Restriction Classes
Restriction Class from GeoNames   Restriction Class from DBpedia   Rel   P (%)   R (%)   |Img(r1) ∩ r2|

Alignments with concepts that were not explicit – e.g. Concept of Places
featureClass=P                    rdf:type=PopulatedPlace          =     99.6    90.5    70658

Alignments with geographical regions like countries, administrative divisions, etc.
countryCode=ES                    country=Spain                    =     94.5    99.9    3917

Find the actual relationship between concepts as opposed to the perceived one
Example alignments of
Conjunctive Restriction Classes
Restriction Class from GeoNames             Restriction Class from DBpedia                    Rel   P (%)   R (%)   |Img(r1) ∩ r2|

Find alignments with conjunctive restriction classes
e.g. Concepts of ‘Places in the US’ are equal
featureClass=P & countryCode=US             rdf:type=PopulatedPlace & country=United_States   =     97.2    96.7    26061

Find alignments with conjunctive restriction classes that have related properties
e.g. Places in North Dakota have 701 area code for phone numbers
featureClass=P & parentADM1=North_Dakota    areaCode=701                                      =     98.1    96.5    361

In some cases the meaning of a concept shifts slightly
e.g. Populated Places in Senegal are aligned to Towns rather than PopulatedPlaces
featureClass=P & countryCode=SN             rdf:type=Town & country=Senegal                   =     92.6    100     25
Example Alignments of
Disjunctive Restriction Classes
Columns: Larger Restriction Class, Union of Smaller Restriction Classes, Rel, R (%), Overlap, Outliers

Find concept coverings with disjunctive restriction classes – e.g. Educational
Institution concept in DBpedia covers concepts of Schools, Colleges and Universities
Larger Restriction Class: rdf:type=dbpedia:Educational_Institution
Union of Smaller Restriction Classes: geonames:featureCode={S.SCH, S.SCHC, S.UNIV}
Rel: =   R: 98.0   Overlap: 396/404   Outliers: S.BLDG, S.HSP, S.MUS, etc.

System can flag outliers that may need to be corrected
Larger Restriction Class: rdf:type=dbpedia:Airport
Union of Smaller Restriction Classes: geonames:featureCode={S.AIRB, S.AIRP}
Rel: =   R: 99.2   Overlap: 1981/1996   Outliers: S.AIRF, S.FRMT, S.SCH, T.HLL, etc.

System is able to find all terms used for the country Netherlands
Larger Restriction Class: geonames:countryCode=NL
Union of Smaller Restriction Classes: dbpedia:country={dbpedia:The_Netherlands, dbpedia:Flag_of_the_Netherlands.svg, dbpedia:Netherlands}
Rel: =   R: 98.0   Overlap: 1939/1978   Outliers: dbpedia:Kingdom_of_the_Netherlands
Related Work
• Other Ontology Alignment efforts in the Web of
Linked Data
• BLOOMS, BLOOMS+ [Jain et al. ISWC 2010, 2011]
• Linked Open Data ontologies aligned with central
ontology called ‘Proton’ using structural similarity
• Agreement Maker [Cruz et al. 2011]
• Similarity Metrics on labels of classes
• Statistical schema induction [Völker et al. ISWC 2011]
• Mines association rules from intermediate ‘transaction
data sets’ and converts them to OWL 2 axioms.

• Formalization of Ontology Mappings [Atencia et al.
ISWC 2012]
• A related work that provides a formalization of weighted
ontology mappings
Conclusion and Future Work
• Conclusion
• Our approach is able to find alignments
• Automatically, across any two linked sources
• Even in the case of a rudimentary ontology
• Types of Alignments
• Atomic Restriction Classes
• Conjunctive Restriction Classes
• Disjunctive Restriction Classes i.e. Concept Coverings

• And detect Outliers that help identify inconsistencies in the
data

• Future work
• Add support for negation
• Build complete descriptions of sources
• Use algorithms to negotiate meaning between agents on the fly
Any questions?

THANK YOU

References for Additional Detail:
Rahul Parundekar, Craig A. Knoblock, and Jose Luis Ambite. Linking and building ontologies of linked data. The Semantic Web, ISWC 2010.
Rahul Parundekar, Craig A. Knoblock, and Jose Luis Ambite. Discovering concept coverings in ontologies of linked data sources. The Semantic Web, ISWC 2012.
Rahul Parundekar, Craig A. Knoblock, and Jose Luis Ambite. Discovering alignments in ontologies of linked data. IJCAI-2013.

Discovering Alignments in Ontologies of Linked Data

  • 1. Discovering Concept Coverings in Ontologies of Linked Data Sources Best Paper: Research Track – The 11th International Semantic Web Conference, 2012 Rahul Parundekar, Craig A. Knoblock and Jose-Luis Ambite {parundek,knoblock}@usc.edu, ambite@isi.edu University of Southern California
  • 3. Data Integration on the Web Goal: Integrate and Query Data across the Web Two challenges in data integration: • Object-level (aka record linkage): When objects at different sites are the same real-world object • Linked Data movement addresses object-level integration • objects have unique URIs, • equivalent objects linked with owl:sameAs statements • Schema-level (aka schema/ontology mapping): align the semantics of different sources • Define mappings to relate source schemas to common schema • Use schema mappings to query across multiple sources Our work: Exploit linked data to automatically discover expressive schema/ontology mappings
  • 4. The Web of Linked Data integrates data at the object level Example: Geospatial Domain Los Angeles City of Los Angeles
  • 5. Equivalent instances in different sources are connected with owl:sameAs Source 1 Source 2 Ontology Level Populated Place City Instance Level owl:sameAs Los Angeles City of Los Angeles
  • 6. Links are absent at the ontology level Source 1 Source 2 Ontology Level Populated Place Only 15 out of the NO LINKS!! 190 ontologies are connected City Instance Level owl:sameAs Los Angeles City of Los Angeles
  • 7. Alignments are necessary for interoperability of the sources Source 1 Source 2 Ontology Level Populated Place We need to find NO LINKS!! links at the Ontology Level City Instance Level owl:sameAs Los Angeles City of Los Angeles
  • 8. How can we find ontology alignments? Source 1 Source 2 Ontology Level = Populated Place City Instance Level owl:sameAs Los Angeles City of Los Angeles
  • 10. Use an extensional approach to align concepts Represents set of instances belonging to ClassA Represents set of instances belonging to ClassB ClassA is disjoint from ClassB ClassA is equivalent to ClassB ClassA is subset of ClassB ClassB is subset of ClassA
  • 11. Align concepts when supported by evidence at the instance level Source 1 Source 2 Ontology Level = Populated Place City Instance Level New York NYC City of Los Angeles Los Angeles City of Dublin Dublin Kuala Lumpur … and more Kolumpo … and more
  • 12. However, ontologies of many sources are rudimentary DBpedia Ontology Description GeoNames Ontology # of Properties 3.77 million 8 million 9 feature classes, 645 feature codes 359 1 (Well-definedhierarchy) # ofClasses Geographical Database For Example: Places, People, Music, Topics, etc. # of Instances Semantic Web version of Wikipedia (rdf:type=Feature) 1775 29 Rich, Descriptive Ontology Impoverished Ontology Finding Alignments is Non-Trivial
  • 13. Create NEW concepts by restricting values of properties Set of all instances in GeoNames Set of all instances with featureClass=P Atomic Restriction Classes* Set of all instances in DBpedia Set of all instances with rdf:type=PopulatedPlace * Value Restrictions in OWL-DL
  • 14. Comparing Linked Instances Across Ontologies: P = |Img(r1) ∩ r2| / |Img(r1)|; R = |Img(r1) ∩ r2| / |r2|
  • 15. Align Atomic Restriction Classes by comparing their overlap: r1 (featureClass=P) = r2 (rdf:type=PopulatedPlace) when |Img(r1) ∩ r2| / |Img(r1)| > 0.9 and |Img(r1) ∩ r2| / |r2| > 0.9 (thresholds relaxed from the ideal value of 1)
  • 16. Create specialized concepts using conjunctive restriction classes. For example, creating the concept for “Schools in the US”: the set of all instances in GeoNames; the set of all instances with countryCode=US, i.e. features in the US; the set of all instances with featureCode=S.SCH, i.e. Schools; and their intersection, the set of all instances with featureCode=S.SCH & countryCode=US, i.e. Schools in the US (Conjunctive Restriction Classes)
  • 17. An ordered top-down exploration algorithm to align Atomic & Conjunctive Restriction Classes
  • 18. Detect rich alignments even when ontologies are rudimentary. GeoNames-DBpedia alignments found with Atomic and Conjunctive Restriction Classes: Equivalent: 31; Subset: 2193. Can we find more meaningful alignments?
  • 19. CONSIDER THREE OF THE SUBSET RELATIONS PRODUCED…
  • 20. 1) Schools in GeoNames are Educational Institutions in DBpedia: featureCode=S.SCH ⊂ rdf:type=EducationalInstitution
  • 21. 2) Colleges in GeoNames are Educational Institutions in DBpedia: featureCode=S.SCHC ⊂ rdf:type=EducationalInstitution (alongside featureCode=S.SCH)
  • 22. 3) Universities in GeoNames are Educational Institutions in DBpedia: featureCode=S.UNIV ⊂ rdf:type=EducationalInstitution (alongside featureCode=S.SCH and featureCode=S.SCHC)
  • 23. Using the featureCode property as a hint, create a Union of concepts: featureCode=S.SCH ∪ featureCode=S.SCHC ∪ featureCode=S.UNIV, compared with rdf:type=EducationalInstitution
  • 24. Detect a Concept Covering by extensional comparison: rdf:type=EducationalInstitution = featureCode=S.SCH ∪ featureCode=S.SCHC ∪ featureCode=S.UNIV
  • 25. Compare the overlap of the extension sets to determine equivalence: U_S = featureCode={S.SCH, S.SCHC, S.UNIV}; U_L = rdf:type=EducationalInstitution, |U_L| = 404 Educational Institutions; U_A = U_S ∩ U_L. Ideally both ratios equal 1; in practice, |U_A| / |U_S| > 0.9 by definition, and |U_A| / |U_L| = 396/404 = 0.98 > 0.9
  • 27. Example: Am I in Spain … or Italy? • We align dbpedia:country=dbpedia:Spain with geonames:countryCode=ES • 3917 out of 3918 instances in GeoNames agree with this • ONE instance had its country code as Italy. • Because this instance contradicts overwhelming evidence, we can flag it as an outlier
  • 28. Concept Covering of Educational Institutions: What are the other 8 instances? featureCode={S.SCH, S.SCHC, S.UNIV} vs. rdf:type=EducationalInstitution: |U_A| / |U_L| = 396/404 • 1 with featureCode=S.HSP (Hospitals) • There are 31 instances with S.HSP, because of which Hospitals are not subsets • 3 with featureCode=S.BLDG (Buildings) • 1 with featureCode=S.EST (Establishment) • 1 with featureCode=S.LIBR (Library) • 1 with featureCode=S.MUS (Museum) • 1 doesn’t have a featureCode property
  • 30. Example alignments of Atomic Restriction Classes (GeoNames restriction class, relation, DBpedia restriction class, then P, R and |Img(r1) ∩ r2|). Alignments with concepts that were not explicit, e.g. the concept of Places: featureClass=P = rdf:type=PopulatedPlace (P 99.6, R 90.5, |Img(r1) ∩ r2| = 70658). Alignments with geographical regions like countries, administrative divisions, etc.: countryCode=ES = country=Spain (P 94.5, R 99.9, |Img(r1) ∩ r2| = 3917). Find the actual relationship between concepts as opposed to the perceived one
  • 31. Example alignments of Conjunctive Restriction Classes (GeoNames restriction class, relation, DBpedia restriction class, then P, R and |Img(r1) ∩ r2|). Find alignments with conjunctive restriction classes, e.g. the concepts of ‘Places in the US’ are equal: featureClass=P & countryCode=US = rdf:type=PopulatedPlace & country=United_States (P 97.2, R 96.7, |Img(r1) ∩ r2| = 26061). Find alignments with conjunctive restriction classes that have related properties, e.g. Places in North Dakota have the 701 area code for phone numbers: featureClass=P & parentADM1=North_Dakota = areaCode=701 (P 98.1, R 96.5, |Img(r1) ∩ r2| = 361). In some cases the meaning of a concept shifts slightly, e.g. Populated Places in Senegal are aligned to Towns rather than PopulatedPlaces: featureClass=P & countryCode=SN = rdf:type=Town & country=Senegal (P 92.6, R 100, |Img(r1) ∩ r2| = 25)
  • 32. Example Alignments of Disjunctive Restriction Classes (larger restriction class, relation, union of smaller restriction classes, then R, overlap and outliers). Find concept coverings with disjunctive restriction classes, e.g. the Educational Institution concept in DBpedia covers the concepts of Schools, Colleges and Universities: rdf:type=dbpedia:Educational_Institution = geonames:featureCode={S.SCH, S.SCHC, S.UNIV} (R 98.0, overlap 396/404, outliers: S.BLDG, S.HSP, S.MUS, etc.). System can flag outliers that may need to be corrected: rdf:type=dbpedia:Airport = geonames:featureCode={S.AIRB, S.AIRP} (R 99.2, overlap 1981/1996, outliers: S.AIRF, S.FRMT, S.SCH, T.HLL, etc.). System is able to find all terms used for the country Netherlands: geonames:countryCode=NL = dbpedia:country={dbpedia:The_Netherlands, dbpedia:Flag_of_the_Netherlands.svg, dbpedia:Netherlands} (R 98.0, overlap 1939/1978, outliers: dbpedia:Kingdom_of_the_Netherlands)
  • 33. Related Work • Other ontology alignment efforts in the Web of Linked Data • BLOOMS, BLOOMS+ [Jain et al. ISWC 2010, 2011] • Linked Open Data ontologies aligned with a central ontology called ‘Proton’ using structural similarity • AgreementMaker [Cruz et al. 2011] • Similarity metrics on labels of classes • Statistical schema induction [Völker et al. ISWC 2011] • Mines association rules from intermediate ‘transaction data sets’ to derive OWL 2 axioms • Formalization of Ontology Mappings [Atencia et al. ISWC 2012] • Provides a formalization of weighted ontology mappings
  • 34. Conclusion and Future Work • Conclusion • Our approach is able to find alignments • Automatically, across any two linked sources • Even in the case of a rudimentary ontology • Types of Alignments • Atomic Restriction Classes • Conjunctive Restriction Classes • Disjunctive Restriction Classes i.e. Concept Coverings • And detect Outliers that help identify inconsistencies in the data • Future work • Add support for negation • Build complete descriptions of sources • Use algorithms to negotiate meaning between agents on the fly
  • 35. Any questions? THANK YOU. References for Additional Detail: Rahul Parundekar, Craig A. Knoblock, and Jose Luis Ambite. Linking and building ontologies of linked data. The Semantic Web, ISWC 2010. Rahul Parundekar, Craig A. Knoblock, and Jose Luis Ambite. Discovering concept coverings in ontologies of linked data sources. The Semantic Web, ISWC 2012. Rahul Parundekar, Craig A. Knoblock, and Jose Luis Ambite. Discovering alignments in ontologies of linked data. IJCAI-2013.
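
The comparison of restriction classes described on slides 13 to 16 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy instances, the helper names (`restriction_class`, `image`, `align`) and the single-valued property dictionaries are all assumptions made here. Only the equivalence test (both ratios above 0.9) is taken from the slides; the "subset", "superset", "disjoint" and "overlap" labels are a plausible reading of the relations listed on slide 10.

```python
from typing import Dict, Set, Tuple

Instances = Dict[str, Dict[str, str]]  # instance URI -> {property: value}

# Toy data invented for this sketch; only the property names come from the slides.
geonames: Instances = {
    "g1": {"featureClass": "P", "countryCode": "US"},
    "g2": {"featureClass": "P", "countryCode": "US"},
    "g3": {"featureClass": "P", "countryCode": "ES"},
    "g4": {"featureClass": "S", "countryCode": "US"},
}
dbpedia: Instances = {
    "d1": {"rdf:type": "PopulatedPlace"},
    "d2": {"rdf:type": "PopulatedPlace"},
    "d3": {"rdf:type": "PopulatedPlace"},
    "d4": {"rdf:type": "School"},
}
# Hypothetical owl:sameAs links from GeoNames instances to DBpedia instances.
same_as: Dict[str, str] = {"g1": "d1", "g2": "d2", "g3": "d3", "g4": "d4"}


def restriction_class(instances: Instances, constraints: Dict[str, str]) -> Set[str]:
    """Instances matching every property=value constraint.

    One constraint yields an atomic restriction class, several yield a
    conjunctive one (the intersection of the atomic classes).
    """
    return {
        uri
        for uri, props in instances.items()
        if all(props.get(prop) == value for prop, value in constraints.items())
    }


def image(r1: Set[str], links: Dict[str, str]) -> Set[str]:
    """Img(r1): the instances in the other source linked to members of r1."""
    return {links[uri] for uri in r1 if uri in links}


def align(r1: Set[str], r2: Set[str], links: Dict[str, str],
          threshold: float = 0.9) -> Tuple[float, float, str]:
    """Return (P, R, relation) with P and R defined as on slide 14."""
    img = image(r1, links)
    overlap = len(img & r2)
    p = overlap / len(img) if img else 0.0
    r = overlap / len(r2) if r2 else 0.0
    if p > threshold and r > threshold:
        relation = "equivalent"
    elif p > threshold:
        relation = "subset"      # assumed: almost all of Img(r1) lies inside r2
    elif r > threshold:
        relation = "superset"    # assumed: almost all of r2 lies inside Img(r1)
    elif overlap == 0:
        relation = "disjoint"
    else:
        relation = "overlap"
    return p, r, relation


places = restriction_class(geonames, {"featureClass": "P"})
populated = restriction_class(dbpedia, {"rdf:type": "PopulatedPlace"})
atomic = align(places, populated, same_as)

us_places = restriction_class(geonames, {"featureClass": "P", "countryCode": "US"})
conjunctive = align(us_places, populated, same_as)

print(atomic)
print(conjunctive)
```

On this toy data the atomic class featureClass=P comes out equivalent to rdf:type=PopulatedPlace, while the conjunctive class featureClass=P & countryCode=US is only a subset of it. Real linked data has multi-valued properties such as rdf:type, which this sketch ignores for brevity.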

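The concept-covering check from slides 23 to 28 can be illustrated in the same spirit. The sketch below assumes that all extension sets have already been mapped into one identifier space through owl:sameAs; the function name `concept_covering`, the synthetic instance identifiers and the split of instances across the three feature codes are invented here. Only the totals (396 covered instances out of 404, leaving 8 outliers) come from the slides.

```python
from typing import Dict, Set, Tuple


def concept_covering(larger: Set[str], smaller: Dict[str, Set[str]],
                     threshold: float = 0.9) -> Tuple[float, float, bool, Set[str]]:
    """Compare a larger class U_L with the union U_S of smaller classes.

    Returns |U_A|/|U_S|, |U_A|/|U_L|, whether both exceed the threshold,
    and the instances of U_L not covered by U_S (candidate outliers).
    """
    u_s: Set[str] = set().union(*smaller.values())
    u_a = u_s & larger                      # U_A = U_S ∩ U_L
    p = len(u_a) / len(u_s) if u_s else 0.0
    r = len(u_a) / len(larger) if larger else 0.0
    outliers = larger - u_s
    return p, r, (p > threshold and r > threshold), outliers


# Synthetic identifiers: 404 educational institutions, 396 of them covered.
u_l = {f"e{i}" for i in range(404)}
smaller_classes = {
    "featureCode=S.SCH": {f"e{i}" for i in range(0, 300)},     # split is invented
    "featureCode=S.SCHC": {f"e{i}" for i in range(300, 350)},
    "featureCode=S.UNIV": {f"e{i}" for i in range(350, 396)},
}

p, r, is_covering, outliers = concept_covering(u_l, smaller_classes)
print(round(p, 2), round(r, 2), is_covering, len(outliers))
```

Returning the uncovered instances alongside the ratios mirrors the way the slides treat the remaining 8 instances: as outliers to inspect rather than as evidence against the covering.
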
Editor's notes

  1. Instances are linked across multiple sources. Equivalent instances in the different domains are connected with owl:sameAs. Different sources have different schemas.
  2. We need to find links at the Ontology Level
  3. Replacing with conjunctive
  4. Put slide before 25
  5. Put short citations in
  6. Conclusion should contain conjunctive and disjunctive. Create an impactful conclusion. - We are able to create ontology where no class exists.