KIT – University of the State of Baden-Wuerttemberg and
National Research Center of the Helmholtz Association
INSTITUTE FOR APPLIED INFORMATICS AND FORMAL DESCRIPTION METHODS
www.kit.edu
Deriving Human-Readable Labels
from SPARQL Queries
Basil Ell, Denny Vrandečić, and Elena Simperl
7th International Conference on Semantic Systems, Graz
7 September 2011
KIT – Karlsruhe Institute of Technology
Institute for Applied Informatics and Formal Description Methods
2 31.03.2014 Basil Ell – Deriving Human-Readable Labels from SPARQL queries
Outline
Motivation
Human-readability of the LOD cloud
Method
Evaluation
Conclusions
Introduction
Entities are identified by URIs, such as
http://de.dbpedia.org/resource/Graz
http://rdf.freebase.com/ns/m.043j22x
Human-readable names can be provided, e.g. using the property rdfs:label:
dbpedia:Austria rdfs:label "Österreich"@de
Motivation – Why are labels necessary?
Scenario: linked data browsing
Is this meaningful to human users? [SIGMA]
Human-Readability of the LOD Cloud
BTC2010 Corpus [BTC2010]
3,167,799,445 ntriples
159,177,123 distinct subjects
137,156,213 (86.17%) have no value for any of the properties rdfs:label, rdfs:comment, dc:title, and foaf:name.
61.8% of the analyzed non-information resources have no label (considering 36 labeling properties) [Ell et al. 2011]
Main Idea
Can we automatically derive labels for entities by analyzing SPARQL queries?
In the following query, the variable name "station" can be used as a label for http://dbpedia.org/ontology/RadioStation
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?station WHERE {
  ?station rdf:type dbo:RadioStation .
}
Analyzed data set
USEWOD2011 corpus [USEWOD2011]
Contains log files from DBpedia and Semantic Web Dog Food (SWDF)
Distinct parsable SPARQL SELECT queries:
1,212,932 (DBpedia)
195,641 (SWDF)
Classification of variable names
Class: Description
short: String length up to 2 characters. Common: s, p, o, x.
stop: Known non-short strings that cannot be used as labels, e.g. subject, instance, uri.
lang: A non-stop string that belongs to a natural language or that consists of separated words of a natural language, e.g. Artist and RadioStation. Checked for the languages {de, en, es, fr, it} using the [Corpex] web service. (The Corpex dataset consists of all words and their frequencies as extracted and counted from instances of Wikipedia in multiple languages. [Vrandecic et al. 2011])
nolang: Variable names that are neither short, nor stop, nor lang.
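The four classes can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: STOP_NAMES holds only the three stop strings named on the slide, and KNOWN_WORDS is a hypothetical stand-in for the Corpex web service, which is not called here.

```python
import re

# Hypothetical stand-ins: the slide names only "subject", "instance" and "uri"
# as stop strings, and checks words against the Corpex web service.
STOP_NAMES = {"subject", "instance", "uri"}
KNOWN_WORDS = {"artist", "radio", "station", "population", "place"}


def split_words(name):
    """Split a variable name on separators and camel-case boundaries (ASCII only)."""
    words = []
    for part in re.split(r"[_\-\s]+", name):
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def classify_variable_name(name):
    """Return 'short', 'stop', 'lang' or 'nolang' for a variable name."""
    if len(name) <= 2:
        return "short"
    if name.lower() in STOP_NAMES:
        return "stop"
    words = split_words(name)
    # Assumption: a name counts as 'lang' if every constituent is a known word.
    if words and all(w in KNOWN_WORDS for w in words):
        return "lang"
    return "nolang"
```

With these stand-in sets, classify_variable_name("RadioStation") yields "lang" and classify_variable_name("uri") yields "stop".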
Classification of triple patterns
Triple pattern classes P = {RRV, RVR, VRL, ...}
R is a resource, V is a variable, L is a literal
Ignoring features such as UNION, OPTIONAL etc.
SELECT ... WHERE {
...
dbpedia:Karlsruhe dbo:populationTotal ?population .
...
}
RRV pattern
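A triple pattern can be mapped to its class with a few string checks. The sketch below assumes terms are already tokenized as strings; the function names are illustrative, and blank nodes and the keyword a are simply treated as resources.

```python
def term_kind(term):
    """Map one term of a triple pattern to R (resource), V (variable) or L (literal)."""
    if term.startswith("?") or term.startswith("$"):
        return "V"
    if term.startswith('"') or term.startswith("'"):
        return "L"
    # Shorthand numeric and boolean literals
    if term in ("true", "false") or term[0].isdigit() or term[0] in "+-":
        return "L"
    # Everything else: IRIs in angle brackets and prefixed names
    return "R"


def triple_pattern_class(subject, predicate, obj):
    """Return the three-letter class of a triple pattern, e.g. 'RRV'."""
    return term_kind(subject) + term_kind(predicate) + term_kind(obj)
```

For the pattern above, triple_pattern_class("dbpedia:Karlsruhe", "dbo:populationTotal", "?population") returns "RRV".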
Classification of triple patterns (2)
[Figure: triple pattern classes (DBpedia)]
DBpedia – top query patterns (pruned, n >= 5000)
[Figure: graph pattern classes visualized as a hypergraph. n: number of instances; TP: name of triple pattern.]
Example: 8312 queries consist of one VVL triple and three VRV triples.
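One way to read the figure: a graph pattern class is the multiset of triple pattern classes in a query, and n counts the queries in that class. The sketch below follows that reading; the input format and function names are assumptions, not taken from the slides.

```python
from collections import Counter


def graph_pattern_class(triple_classes):
    """Order-independent key for a query, e.g. (('VRV', 3), ('VVL', 1))."""
    return tuple(sorted(Counter(triple_classes).items()))


def count_graph_pattern_classes(queries, min_count=1):
    """Count queries per graph pattern class and prune classes below min_count.

    queries: iterable of lists of triple pattern classes, one list per query.
    """
    counts = Counter(graph_pattern_class(q) for q in queries)
    return {cls: n for cls, n in counts.items() if n >= min_count}
```

Pruning with min_count=5000 would correspond to the threshold shown for DBpedia.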
SWDF – top query patterns (pruned, n >= 1000)
[Figure: graph pattern classes visualized as a hypergraph. n: number of instances; TP: name of triple pattern.]
Derivation pattern 1: 1 x RRV
(31.75% of all DBpedia queries)
Assumption: V' is a human-readable label for property R2 iff local_name(R2) = V and lang(V).
V' can be derived from V by substituting separators and splitting camel-cased words into constituents.
Example:
R1: <http://dbpedia.org/page/NASA>
R2: <http://dbpedia.org/property/agencyName>
V: ?agencyName
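Derivation pattern 1 can be sketched as below. The helper names are illustrative, is_lang is a caller-supplied predicate standing in for the lang check, and lowercasing the result is an assumption that the slides do not state.

```python
import re


def local_name(uri):
    """Part of a URI after the last '#' or '/'."""
    return re.split(r"[#/]", uri.rstrip("#/"))[-1]


def to_label(name):
    """Substitute separators and split camel-cased words into constituents."""
    spaced = re.sub(r"[_\-.]+", " ", name)
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    # Assumption: labels are lowercased; the slides do not specify the casing.
    return " ".join(spaced.split()).lower()


def derive_property_label(r2_uri, variable, is_lang):
    """Return a label for property R2 from an RRV pattern, or None if none applies."""
    name = variable.lstrip("?$")
    if local_name(r2_uri) == name and is_lang(name):
        return to_label(name)
    return None
```

For the example above, derive_property_label("http://dbpedia.org/property/agencyName", "?agencyName", is_lang) gives "agency name" whenever is_lang accepts agencyName.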
Derivation pattern 2: Any graph with VRR
(22.32% of all DBpedia queries)
Assumption: V' is a human-readable label for class R2 iff lang(V) and R1 = rdf:type
Example: ?place rdf:type dbo:Location
VRR pattern:
V: ?paper
R1: <http://data.semanticweb.org/ns/swc/ontology#isPartOf>
R2: <http://data.semanticweb.org/conference/www/2009/proceedings>
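Derivation pattern 2 can be sketched in the same spirit. The input format (triples of strings with full URIs and variables written as ?name) and the function name are assumptions; is_lang again stands in for the lang check.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def derive_class_labels(triple_patterns, is_lang):
    """Collect (class URI, candidate label) pairs from VRR patterns with R1 = rdf:type.

    triple_patterns: iterable of (subject, predicate, object) strings with
    full URIs (no prefixes) and variables written as '?name'.
    """
    pairs = set()
    for s, p, o in triple_patterns:
        # VRR: variable subject, resource predicate, resource object
        is_vrr = (
            s.startswith("?")
            and not p.startswith("?")
            and not o.startswith(("?", '"'))
        )
        if is_vrr and p == RDF_TYPE:
            name = s[1:]
            if is_lang(name):
                pairs.add((o, name))
    return pairs
```

A pattern such as ?paper swc:isPartOf <...proceedings> is of class VRR but yields no pair here, because its predicate is not rdf:type.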
Evaluation – 1 x RRV
1,366,363 triples of class RRV
549,093 cases: local_name(R2) = V
817,269 cases: local_name(R2) ≠ V
226 pairs (URI, guessed label)
54.5% correct: sufficiently similar to existing labels
14% correct: manual evaluation
9.1% correct within a given context (e.g. location for dbo:residence)
22.4% wrong (e.g. contained for dbprop:creator)
68%
Evaluation – Any graph with VRR
80,455 triples of class VRR
549,093 cases: local_name(R2) = V
60 distinct URIs, 36 labels
25% correct: sufficiently similar to existing labels
39.975% correct: manual evaluation
35.025% wrong (e.g. scientist for dbo:SoccerPlayer)
64.975%
Conclusions
Approach for automatically deriving labels
Acceptable precision: most derived labels matched the already existing labels (atypical datasets)
Labels derived from variable names are less specific
Labels were derived for terminological entities (properties and classes), not for instances.
References & Acknowledgements
[BTC2010] http://km.aifb.kit.edu/projects/btc-2010/
[Ell et al. 2011] Labels in the Web of Data, ISWC2011, to appear.
[SIGMA] http://sig.ma/search?q=Sidney+Bechet
[USEWOD2011] http://data.semanticweb.org/usewod/2011/challenge.html
[Corpex] http://km.aifb.kit.edu/sites/corpex/
[Vrandecic et al. 2011]
Part of this work has been carried out in the framework of the German Research Foundation (DFG) project entitled "Entwicklung einer Virtuellen Forschungsumgebung für die Historische Bildungsforschung mit Semantischer Wiki-Technologie - Semantic MediaWiki for Collaborative Corpora Analysis" (INST 5580/1-1), in the domain of "Scientific Library Services and Information Systems" (LIS).
THANK YOU FOR YOUR ATTENTION
BACKUP SLIDES
Triple pattern classes (SWDF)
Basil Ell – Deriving Human-Readable Labels from SPARQL queries
KIT – Karlsruhe Institute of Technology
Institute for Applied Informatics and Formal Description Methods
22 31.03.2014 Basil Ell – Deriving Human-Readable Labels from SPARQL queries

Weitere ähnliche Inhalte

Was ist angesagt?

3. Stack - Data Structures using C++ by Varsha Patil
3. Stack - Data Structures using C++ by Varsha Patil3. Stack - Data Structures using C++ by Varsha Patil
3. Stack - Data Structures using C++ by Varsha Patilwidespreadpromotion
 
1. Fundamental Concept - Data Structures using C++ by Varsha Patil
1. Fundamental Concept - Data Structures using C++ by Varsha Patil1. Fundamental Concept - Data Structures using C++ by Varsha Patil
1. Fundamental Concept - Data Structures using C++ by Varsha Patilwidespreadpromotion
 
9. Searching & Sorting - Data Structures using C++ by Varsha Patil
9. Searching & Sorting - Data Structures using C++ by Varsha Patil9. Searching & Sorting - Data Structures using C++ by Varsha Patil
9. Searching & Sorting - Data Structures using C++ by Varsha Patilwidespreadpromotion
 
Positional Data Organization and Compression in Web Inverted Indexes
Positional Data Organization and Compression in Web Inverted IndexesPositional Data Organization and Compression in Web Inverted Indexes
Positional Data Organization and Compression in Web Inverted IndexesLeonidas Akritidis
 
5. Queue - Data Structures using C++ by Varsha Patil
5. Queue - Data Structures using C++ by Varsha Patil5. Queue - Data Structures using C++ by Varsha Patil
5. Queue - Data Structures using C++ by Varsha Patilwidespreadpromotion
 
Stacks in algorithems & data structure
Stacks in algorithems & data structureStacks in algorithems & data structure
Stacks in algorithems & data structurefaran nawaz
 

Was ist angesagt? (8)

3. Stack - Data Structures using C++ by Varsha Patil
3. Stack - Data Structures using C++ by Varsha Patil3. Stack - Data Structures using C++ by Varsha Patil
3. Stack - Data Structures using C++ by Varsha Patil
 
1. Fundamental Concept - Data Structures using C++ by Varsha Patil
1. Fundamental Concept - Data Structures using C++ by Varsha Patil1. Fundamental Concept - Data Structures using C++ by Varsha Patil
1. Fundamental Concept - Data Structures using C++ by Varsha Patil
 
9. Searching & Sorting - Data Structures using C++ by Varsha Patil
9. Searching & Sorting - Data Structures using C++ by Varsha Patil9. Searching & Sorting - Data Structures using C++ by Varsha Patil
9. Searching & Sorting - Data Structures using C++ by Varsha Patil
 
Chado introduction
Chado introductionChado introduction
Chado introduction
 
Positional Data Organization and Compression in Web Inverted Indexes
Positional Data Organization and Compression in Web Inverted IndexesPositional Data Organization and Compression in Web Inverted Indexes
Positional Data Organization and Compression in Web Inverted Indexes
 
5. Queue - Data Structures using C++ by Varsha Patil
5. Queue - Data Structures using C++ by Varsha Patil5. Queue - Data Structures using C++ by Varsha Patil
5. Queue - Data Structures using C++ by Varsha Patil
 
Chado-XML
Chado-XMLChado-XML
Chado-XML
 
Stacks in algorithems & data structure
Stacks in algorithems & data structureStacks in algorithems & data structure
Stacks in algorithems & data structure
 

Ähnlich wie Deriving human readable labels from sparql queries

Labels in the web of data
Labels in the web of dataLabels in the web of data
Labels in the web of dataBasil Ell
 
Sem facet paper
Sem facet paperSem facet paper
Sem facet paperDBOnto
 
SemFacet paper
SemFacet paperSemFacet paper
SemFacet paperDBOnto
 
Modelling and Querying Lists in RDF. A Pragmatic Study
Modelling and Querying Lists in RDF. A Pragmatic StudyModelling and Querying Lists in RDF. A Pragmatic Study
Modelling and Querying Lists in RDF. A Pragmatic StudyAlbert Meroño-Peñuela
 
SPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine QueriesSPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine QueriesBasil Ell
 
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)Beat Signer
 
Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...
Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...
Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...Stuart Chalk
 
Metadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data RepositoriesMetadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data Repositoriesandrea huang
 
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...semanticsconference
 
Interconnecting Belgian national and regional address data using EC ISA "Loca...
Interconnecting Belgian national and regional address data using EC ISA "Loca...Interconnecting Belgian national and regional address data using EC ISA "Loca...
Interconnecting Belgian national and regional address data using EC ISA "Loca...PeterWinstanley1
 
Searching Heterogenous E Learning Resources
Searching Heterogenous E Learning ResourcesSearching Heterogenous E Learning Resources
Searching Heterogenous E Learning Resourcesimranlatif
 
ACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
ACS 248th Paper 146 VIVO/ScientistsDB Integration into EurekaACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
ACS 248th Paper 146 VIVO/ScientistsDB Integration into EurekaStuart Chalk
 
Knowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents EnvironmentKnowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents EnvironmentManjulaPatel
 
Multimedia Data Navigation and the Semantic Web (SemTech 2006)
Multimedia Data Navigation and the Semantic Web (SemTech 2006)Multimedia Data Navigation and the Semantic Web (SemTech 2006)
Multimedia Data Navigation and the Semantic Web (SemTech 2006)Bradley Allen
 
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Nikolaos Konstantinou
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionRonald Ashri
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use CasesCarole Goble
 
SPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queriesSPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queriesBasil Ell
 
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge GraphsOBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphsdgarijo
 

Ähnlich wie Deriving human readable labels from sparql queries (20)

Labels in the web of data
Labels in the web of dataLabels in the web of data
Labels in the web of data
 
Sem facet paper
Sem facet paperSem facet paper
Sem facet paper
 
SemFacet paper
SemFacet paperSemFacet paper
SemFacet paper
 
Modelling and Querying Lists in RDF. A Pragmatic Study
Modelling and Querying Lists in RDF. A Pragmatic StudyModelling and Querying Lists in RDF. A Pragmatic Study
Modelling and Querying Lists in RDF. A Pragmatic Study
 
SPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine QueriesSPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine Queries
 
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
 
Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...
Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...
Toward Semantic Representation of Science in Electronic Laboratory Notebooks ...
 
Metadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data RepositoriesMetadata as Linked Data for Research Data Repositories
Metadata as Linked Data for Research Data Repositories
 
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
 
Interconnecting Belgian national and regional address data using EC ISA "Loca...
Interconnecting Belgian national and regional address data using EC ISA "Loca...Interconnecting Belgian national and regional address data using EC ISA "Loca...
Interconnecting Belgian national and regional address data using EC ISA "Loca...
 
Searching Heterogenous E Learning Resources
Searching Heterogenous E Learning ResourcesSearching Heterogenous E Learning Resources
Searching Heterogenous E Learning Resources
 
ACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
ACS 248th Paper 146 VIVO/ScientistsDB Integration into EurekaACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
ACS 248th Paper 146 VIVO/ScientistsDB Integration into Eureka
 
Knowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents EnvironmentKnowledge Discovery in an Agents Environment
Knowledge Discovery in an Agents Environment
 
Wi presentation
Wi presentationWi presentation
Wi presentation
 
Multimedia Data Navigation and the Semantic Web (SemTech 2006)
Multimedia Data Navigation and the Semantic Web (SemTech 2006)Multimedia Data Navigation and the Semantic Web (SemTech 2006)
Multimedia Data Navigation and the Semantic Web (SemTech 2006)
 
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use Cases
 
SPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queriesSPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queries
 
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge GraphsOBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs
 

Kürzlich hochgeladen

定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一Fs
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Lucknow
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Excelmac1
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Dana Luther
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一Fs
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMartaLoveguard
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Sonam Pathan
 

Kürzlich hochgeladen (20)

定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...
 
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
 
Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptx
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhi
 
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝
 
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 Documentation
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 

Deriving human readable labels from sparql queries

  • 1. KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association INSTITUTE FOR APPLIED INFORMATICS AND FORMAL DESCRIPTION METHODS www.kit.edu Deriving Human-Readable Labels from SPARQL Queries Basil Ell, Denny Vrandečić, and Elena Simperl 7th International Conference on Semantic Systems, Graz 7 September 2011
  • 2. KIT – Karlsruhe Institute of Technology Institute for Applied Informatics and Formal Description Methods 2 31.03.2014 Basil Ell – Deriving Human-Readable Labels from SPARQL queries Outline Motivation Human-readability of the LOD cloud Method Evaluation Conclusions
  • 3. KIT – Karlsruhe Institute of Technology Institute for Applied Informatics and Formal Description Methods 3 31.03.2014 Introduction Entities are identified by URIs, such as http://de.dbpedia.org/resource/Graz http://rdf.freebase.com/ns/m.043j22x Human-readable names can be provided e.g. using the property rdfs:label dbpedia:Austria rdfs:label "Österreich"@de Basil Ell – Deriving Human-Readable Labels from SPARQL queries
  • 4. KIT – Karlsruhe Institute of Technology Institute for Applied Informatics and Formal Description Methods 4 31.03.2014 Motivation – Why are labels necessary? Scenario: linked data browsing Basil Ell – Deriving Human-Readable Labels from SPARQL queries [SIGMA] Is this meaningful to human users?
  • 5. KIT – Karlsruhe Institute of Technology Institute for Applied Informatics and Formal Description Methods 5 31.03.2014 Human-Readability of the LOD Cloud BTC2010 Corpus [BTC2010] 3,167,799,445 ntriples 159,177,123 distinct subjects 137,156,213 (86.17%) have no value for any of the properties rdfs:label, rdfs:comment, dc:title, and foaf:name. 61.8% of the analyzed non-information resources have no label (regarding 36 labeling properties) [Ell et al. 2011] Basil Ell – Deriving Human-Readable Labels from SPARQL queries
  • 6. KIT – Karlsruhe Institute of Technology Institute for Applied Informatics and Formal Description Methods 6 31.03.2014 Main Idea Can we automatically derive labels for entities by analyzing SPARQL queries? station can be used as a label for http://dbpedia.org/ontology/RadioStation Basil Ell – Deriving Human-Readable Labels from SPARQL queries PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX dbo: <http://dbpedia.org/ontology/> SELECT ?stationWHERE { ?station rdf:type dbo:RadioStation }
  • 7. KIT – Karlsruhe Institute of Technology Institute for Applied Informatics and Formal Description Methods 7 31.03.2014 Analyzed data set USEWOD2011 corpus[USEWOD2011] Contains log files from DBpedia and SWDF distinct parsable SPARQL SELECT queries: 1,212,932 (DBpedia) 195,641 (SWDF) Basil Ell – Deriving Human-Readable Labels from SPARQL queries Semantic Web Dog Food (SWDF)
  • 8. Classification of variable names (class: description)
      short: String length up to 2 characters. Common: s, p, o, x.
      stop: Known non-short strings that cannot be used as labels, e.g. subject, instance, uri.
      lang: A non-stop string that belongs to a natural language or that consists of separated words of a natural language, e.g. Artist and RadioStation. Checked for the languages {de, en, es, fr, it} using the [Corpex] web service. (The Corpex dataset consists of all words and their frequencies as extracted and counted from instances of Wikipedia in multiple languages [Vrandecic et al. 2011].)
      nolang: Variable names that are neither short, nor stop, nor lang.
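The four classes above can be sketched as a small classifier. This is a minimal illustration, not the authors' implementation: the stop list contains only the three examples from the slide, and the Corpex web service lookup is replaced by a tiny hard-coded word set (`KNOWN_WORDS`), which is purely an assumption for the sake of a self-contained example.

```python
import re

# Assumption: only the three stop strings named on the slide.
STOP_WORDS = {"subject", "instance", "uri"}
# Assumption: a toy stand-in for the Corpex word-frequency lookup.
KNOWN_WORDS = {"artist", "radio", "station", "population", "place"}


def split_words(name: str) -> list[str]:
    """Split a variable name on separators and camel-case boundaries."""
    spaced = re.sub(r"[_\-\s]+", " ", name)
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    return [word.lower() for word in spaced.split()]


def classify_variable(name: str) -> str:
    """Return 'short', 'stop', 'lang', or 'nolang' for a variable name."""
    if len(name) <= 2:
        return "short"
    if name.lower() in STOP_WORDS:
        return "stop"
    words = split_words(name)
    if words and all(word in KNOWN_WORDS for word in words):
        return "lang"
    return "nolang"


for example in ["s", "uri", "Artist", "RadioStation", "xyzzy"]:
    print(example, "->", classify_variable(example))
```

The order of the checks mirrors the definitions: stop is only tested for non-short strings, and lang only for non-stop strings.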
  • 9. Classification of triple patterns: Triple pattern classes P = {RRV, RVR, VRL, ...}, where R is a resource, V is a variable, and L is a literal. Features such as UNION, OPTIONAL etc. are ignored. Example of an RRV pattern: SELECT ... WHERE { ... dbpedia:Karlsruhe dbo:populationTotal ?population . ... }
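A triple pattern's class can be computed by mapping each of its three terms to R, V, or L. The sketch below assumes terms are already tokenized strings and uses simple prefix checks (variables start with `?` or `$`, literals with a quote, digit, or sign); a real SPARQL parser would be needed for full coverage.

```python
def term_class(term: str) -> str:
    """Map one term of a triple pattern to 'V', 'L', or 'R'."""
    first = term[0]
    if first in "?$":
        return "V"  # variable
    if first in "\"'" or first.isdigit() or first in "+-":
        return "L"  # quoted or numeric literal (simplified check)
    return "R"      # everything else is treated as a resource


def pattern_class(subject: str, predicate: str, obj: str) -> str:
    """Return the three-letter class of a triple pattern, e.g. 'RRV'."""
    return "".join(term_class(term) for term in (subject, predicate, obj))


print(pattern_class("dbpedia:Karlsruhe", "dbo:populationTotal", "?population"))
```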
  • 10. Classification of triple patterns (2): DBpedia
  • 11. DBpedia – top query patterns (pruned, n >= 5000). Graph pattern classes visualized as hypergraph; n: number of instances, TP: name of triple pattern. Example: 8312 queries consist of one VVL triple and three VRV triples.
  • 12. SWDF – top query patterns (pruned, n >= 1000). Graph pattern classes visualized as hypergraph; n: number of instances, TP: name of triple pattern.
  • 13. Derivation pattern 1: 1 x RRV (31.75% of all DBpedia queries). Example: <http://dbpedia.org/page/NASA> (R1) <http://dbpedia.org/property/agencyName> (R2) ?agencyName (V). Assumption: V' is a human-readable label for property R2 iff local_name(R2) = V and lang(V). V' can be derived from V by substituting separators and splitting camel-cased words into constituents.
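Derivation pattern 1, as stated on this slide, can be sketched as follows. The helper names are illustrative, the `is_lang` predicate is a placeholder for the lang check (it accepts everything by default), and lowercasing the derived label is an assumption not stated on the slide.

```python
import re


def local_name(uri: str) -> str:
    """Return the part of a URI after the last '#' or '/'."""
    return re.split(r"[#/]", uri.strip("<>"))[-1]


def to_label(variable_name: str) -> str:
    """Derive V' from V: replace separators and split camel-cased words."""
    spaced = re.sub(r"[_\-]+", " ", variable_name)
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    return spaced.lower().strip()  # lowercasing is an assumption


def derive_property_label(r2_uri, variable, is_lang=lambda name: True):
    """Return a label for property R2, or None if the condition fails."""
    name = variable.lstrip("?$")
    if local_name(r2_uri) == name and is_lang(name):
        return to_label(name)
    return None


print(derive_property_label("<http://dbpedia.org/property/agencyName>", "?agencyName"))
```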
  • 14. Derivation pattern 2: Any graph with VRR (22.32% of all DBpedia queries). Triple layout: ?paper (V) <http://data.semanticweb.org/ns/swc/ontology#isPartOf> (R1) <http://data.semanticweb.org/conference/www/2009/proceedings> (R2). Assumption: V' is a human-readable label for class R2 iff lang(V) and R1 = rdf:type. Example: ?place rdf:type dbo:Location
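Derivation pattern 2 can be sketched in the same spirit: scan the triple patterns of a query, keep the VRR ones whose predicate is rdf:type, and collect the variable name as a candidate class label. Triples are assumed to be given as tuples of expanded terms, and `is_lang` is again a placeholder predicate supplied by the caller.

```python
import re
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def is_resource(term: str) -> bool:
    """Simplified check: not a variable and not a quoted literal."""
    return not term.startswith(("?", "$", '"', "'"))


def to_label(variable: str) -> str:
    """Derive V' from V: replace separators and split camel-cased words."""
    name = re.sub(r"[_\-]+", " ", variable.lstrip("?$"))
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return name.lower().strip()  # lowercasing is an assumption


def derive_class_labels(triples, is_lang):
    """Map each class URI (R2) to the set of labels derived from VRR triples."""
    labels = defaultdict(set)
    for subject, predicate, obj in triples:
        is_vrr = subject.startswith("?") and is_resource(predicate) and is_resource(obj)
        if is_vrr and predicate == RDF_TYPE and is_lang(subject.lstrip("?")):
            labels[obj].add(to_label(subject))
    return dict(labels)


triples = [
    ("?place", RDF_TYPE, "http://dbpedia.org/ontology/Location"),
    ("?paper", "http://data.semanticweb.org/ns/swc/ontology#isPartOf",
     "http://data.semanticweb.org/conference/www/2009/proceedings"),
]
result = derive_class_labels(triples, is_lang=lambda name: True)
print(result)
```

Only the first triple yields a label; the second is a VRR pattern too, but its predicate is not rdf:type, so the condition from the slide does not hold.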
  • 15. Evaluation – 1 x RRV: 1,366,363 triples of class RRV; 549,093 cases: local_name(R2) = V; 817,269 cases: local_name(R2) ≠ V. 226 pairs (URI, guessed label): 54.5% correct: sufficiently similar to existing labels; 14% correct: manual evaluation; 9.1% correct within a given context ("location" for dbo:residence); 22.4% wrong ("contained" for dbprop:creator). 68%
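The slide does not say how "sufficiently similar to existing labels" was measured. As one possible illustration only, the sketch below compares a guessed label against existing labels with Python's `difflib.SequenceMatcher`; both the choice of measure and the threshold of 0.8 are assumptions, not the evaluation procedure used in the talk.

```python
from difflib import SequenceMatcher

# Assumption: similarity threshold chosen for illustration only.
THRESHOLD = 0.8


def is_similar(guessed: str, existing_labels: list[str]) -> bool:
    """True if the guessed label is close to at least one existing label."""
    guess = guessed.lower()
    return any(
        SequenceMatcher(None, guess, label.lower()).ratio() >= THRESHOLD
        for label in existing_labels
    )


print(is_similar("agency name", ["Agency Name"]))
print(is_similar("location", ["residence"]))
```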
  • 16. Evaluation – Any graph with VRR: 80,455 triples of class RRV; 549,093 cases: local_name(R2) = V. 60 distinct URIs, 36 labels: 25% correct: sufficiently similar to existing labels; 39.975% correct: manual evaluation; 35.025% wrong ("scientist" for dbo:SoccerPlayer). 64.975%
  • 17. Conclusions: Approach for automatically deriving labels. Acceptable precision: most derived labels matched the already existing labels (atypical datasets). Derived variable names less specific. Derived labels for terminological entities (properties and classes), not for instances.
  • 18. References & Acknowledgements
      [BTC2010] http://km.aifb.kit.edu/projects/btc-2010/
      [Ell et al. 2011] Labels in the Web of Data, ISWC2011, to appear.
      [SIGMA] http://sig.ma/search?q=Sidney+Bechet
      [USEWOD2011] http://data.semanticweb.org/usewod/2011/challenge.html
      [Corpex] http://km.aifb.kit.edu/sites/corpex/
      [Vrandecic et al. 2011]
      Part of this work has been carried out in the framework of the German Research Foundation (DFG) project entitled "Entwicklung einer Virtuellen Forschungsumgebung für die Historische Bildungsforschung mit Semantischer Wiki-Technologie - Semantic MediaWiki for Collaborative Corpora Analysis" (INST 5580/1-1), in the domain of "Scientific Library Services and Information Systems" (LIS).
  • 19. THANK YOU FOR YOUR ATTENTION
  • 20. BACKUP SLIDES
  • 21. Triple pattern classes (SWDF)