DBpedia Tutorial 09.02.2015 http://dbpedia.org
Creating Knowledge out of Interlinked Data
Markus Ackermann, Markus Freudenberg
WG Agile Knowledge and Semantic Web
Universität Leipzig
DBpedia
Extraction of Knowledge
from Wikipedia
Wikipedia
Wikipedia coverage of the London bombings on July 7, 2005
–the first Wikipedia entry appeared after just 18 minutes
–2500 users produced a 14-page article in only 12 hours
–far more detailed than any other news source
[Tapscott & Williams 2006]
Wikipedia
Wikipedia articles:
–4.7 million articles; 780 new articles per day
–are highly topical
–contain only few errors, which can easily be corrected
–often cover very specific content
→ Wikipedia is the knowledge compendium of humanity.
Semantic Web
–a Web 3.0 technology
–a way of linking data between systems or entities
–allows for rich, self-describing interrelations of data available across the globe
–opens up the web of data to artificial intelligence processes
–encourages companies, organisations and individuals to publish their data freely, in an open standard format
–encourages businesses to use data already available on the web (data give/take)
Linked Data
The means of populating the Semantic Web is Linked Data.
(introduced by Tim Berners-Lee)
Four simple rules:
–Use URIs as names for things.
–Use HTTP URIs so that people can look up those names.
–When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
–Include links to other URIs, so that they can discover more things.
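A minimal sketch of rules two and three in Python: it prepares an HTTP lookup for a resource URI and uses the Accept header to ask for an RDF serialisation. The example URI and the text/turtle media type are assumptions chosen for illustration; the request is only built, not sent, so the snippet runs offline.

```python
from urllib.request import Request

def build_lookup_request(uri: str, mime_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF about `uri`."""
    # The Accept header states which representation the client prefers
    # (content negotiation); a web browser would ask for text/html instead.
    return Request(uri, headers={"Accept": mime_type}, method="GET")

# Assumed example URI, following the http://dbpedia.org/resource/<thing> pattern.
request = build_lookup_request("http://dbpedia.org/resource/Leipzig")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual lookup.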
5 ★ Linked Open Data
Benefits of using Linked Data
Consumer View
- link to data from any other place on the Web
- discover more related data while consuming
data
- reuse parts of the data
- reuse existing tools and libraries
- combine data safely with other data
- query data over different repositories
Publisher View
- make your data discoverable
- increase the value of your data (by linking it)
- have fine-granular control over the
data items and optimise their access
- design data to fit your domain knowledge
What's DBpedia?
– DBpedia is a community effort to extract structured
information from Wikipedia and to make this information
available on the Web.
– DBpedia allows you to ask sophisticated queries against
Wikipedia, and to link other data sets on the Web to Wikipedia
data.
– Common goal with Wikidata, but a different approach
What's DBpedia?
–DBpedia project was started in 2006
–has been a key factor for the success of the
Linked Open Data initiative
– serves as an interlinking hub for other data
sets
–DBpedia provides a testbed serving real data
spanning various domains
–available in more than 120 language editions
Where is Wikipedia
information useful?
“Which films starred John Cleese without any other members of Monty Python?”
“What do Dublin and Leipzig have in common?”
“Which software products are developed by an organisation founded in California?”
“Which populated places in Germany are below sea level?”
Where is Wikipedia
information useful?
● as a terminology and concept repository and fact source for Entity Linking and Disambiguation:
The series follows the adventures of a space-faring crew on board
the starship USS Enterprise (NCC-1701-D), the fifth Federation
vessel to bear the name and registry and the seventh starship by
that name.
The Enterprise is commanded by Captain Jean-Luc Picard and is
staffed by first officer Commander William Riker, operations
manager Data, security chief Tasha Yar, ship's counselor
Deanna Troi, chief medical officer Dr. Beverly Crusher, conn
officer Lieutenant Geordi La Forge, and junior officer Lieutenant
Worf.
⇒ no company, no aircraft carrier, no satellite
⇒ correlate the mentions with the concept starship
⇒ Star Trek rank, contemporary or past military or
law enforcement
Why search engines aren't
always enough
“Which films starred John Cleese without any other members of Monty Python?”
What is needed to do better?
● ontological representation of entities and facts
“An ontology is a specification of a conceptualization.” (Gruber, 1993)
⇒ formal description of concepts and relationships
● well-defined taxonomy of entity types
● assertions about entities and their relations
A British Comedy is a kind of Comedy. A Comedy is a kind of Film.
⇒ A British Comedy is a kind of Film.
Clockwise is a British Comedy. John Cleese stars in Clockwise.
⇒ John Cleese stars in a Film.
● thoroughly specified, machine-actionable, but flexible formalism for representation
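The inference steps in the Clockwise example can be sketched in a few lines of Python. The dictionaries and function names are hypothetical, and the sketch assumes each class has at most one direct superclass, which is simpler than a real ontology.

```python
# Toy facts from the slide; class and entity names are plain strings here.
SUBCLASS_OF = {"BritishComedy": "Comedy", "Comedy": "Film"}
INSTANCE_OF = {"Clockwise": "BritishComedy"}
STARS_IN = [("John Cleese", "Clockwise")]

def superclasses(cls: str) -> list:
    """Return all ancestors of `cls` by following subclass links upwards."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result

def types_of(entity: str) -> list:
    """Asserted type plus every type inferred through the class hierarchy."""
    asserted = INSTANCE_OF[entity]
    return [asserted] + superclasses(asserted)

def stars_in_a(actor: str, cls: str) -> bool:
    """True if the actor stars in at least one entity of type `cls`."""
    return any(a == actor and cls in types_of(work) for a, work in STARS_IN)

print(superclasses("BritishComedy"))      # ['Comedy', 'Film']
print(stars_in_a("John Cleese", "Film"))  # True
```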
A brief introduction to RDF
Resource Description Framework (W3C Standard)
● flexible language and data model for representation of information
● based on (S,P,O) triples denoting simple assertions
S – subject, P – property, O – object
S ∈ I ∪ B, P ∈ I, O ∈ I ∪ B ∪ L
I – URIs/IRIs; B – blank nodes; L – literals
● URIs/IRIs of named entities are:
  ● unambiguous, but non-unique identifiers of a resource
  ● often dereferenceable (in the Semantic Web)
● the aggregate of triple assertions constitutes a directed graph with typed edges
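To make the triple model concrete, here is a small Python sketch that stores triples as tuples and matches simple patterns against them. The IRIs follow the DBpedia naming pattern but are used as illustrative assumptions, and modelling literals as tagged pairs is a choice of this sketch rather than a feature of any RDF library.

```python
from collections import defaultdict

DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Each triple is (subject, property, object). IRIs are plain strings;
# a literal is modelled as a ("literal", value) pair for this sketch.
triples = [
    (DBR + "Clockwise_(film)", DBO + "starring", DBR + "John_Cleese"),
    (DBR + "Clockwise_(film)", RDFS_LABEL, ("literal", "Clockwise")),
    (DBR + "John_Cleese", RDFS_LABEL, ("literal", "John Cleese")),
]

# The set of triples forms a directed graph: subject --property--> object.
outgoing = defaultdict(list)
for s, p, o in triples:
    outgoing[s].append((p, o))

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

films_with_cleese = [s for s, _, _ in
                     match(p=DBO + "starring", o=DBR + "John_Cleese")]
print(films_with_cleese)
```

Leaving a position as None plays the role of a variable, which is the same idea a query language for RDF builds on.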
DBpedia -
motivation and use cases
an RDF view of structured Wikipedia information
enables:
● sophisticated queries
⇒ cross-referencing facts of entities
⇒ filtering of entities based on their types and fact assertions
● combining facts from Wikipedia with machine-actionable knowledge from other structured datasets (geodata, yellow pages, WordNet, ...)
Another take on
Question Answering
“Which films starred John Cleese without any other members of Monty Python?”
DBpedia -
contents and datasets
● Wikipedia article ⇔ DBpedia resource
http://en.wikipedia.org/wiki/Monty_Python
⇔ http://dbpedia.org/resource/Monty_Python
● mapping-based types and facts governed by the DBpedia Ontology
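The article-to-resource correspondence can be written as a small Python helper. The function name is hypothetical, and the sketch only handles English Wikipedia article URLs of the form shown on the slide.

```python
from urllib.parse import urlsplit

def wikipedia_to_dbpedia(article_url: str) -> str:
    """Map an English Wikipedia article URL to the matching DBpedia resource IRI."""
    parts = urlsplit(article_url)
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith("/wiki/"):
        raise ValueError(f"not an English Wikipedia article URL: {article_url}")
    title = parts.path[len("/wiki/"):]  # the page title, e.g. "Monty_Python"
    return "http://dbpedia.org/resource/" + title

print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Monty_Python"))
# http://dbpedia.org/resource/Monty_Python
```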
DBpedia -
contents and datasets
● 4.58 million entities and 583 million triples (English DBpedia 2014)
131.2 million fact assertions (derived from infoboxes)
168.5 million triples representing Wikipedia structure
57.1 million links to external datasets
● DBpedia resources are categorised in several ways:
  ● by Wikipedia categories (represented in SKOS)
  ● by YAGO classification
  ● by links to WordNet synsets
  ● by assignment of classes from the DBpedia Ontology
● provenance metadata
⇒ From which part of which Wikipedia page was a triple derived?
Mappings Wiki
a community effort to:
–develop an ontology schema
–provide mappings from Wikipedia infobox properties to this ontology
→ creating an alignment between Wikipedia and DBpedia
→ eliminating name variations in properties and classes
→ big boost for precision
DBpedia Ontology
cross-domain ontology
–maintained and extended by the community in the
DBpedia Mappings Wiki
–manually created based on the most commonly used
infoboxes
–currently covers 685 classes which form a subsumption
hierarchy and are described by 2,795 different
properties
–subsumption hierarchy with a maximal depth of 5
DBpedia Ontology Extract
Wikipedia articles
– Wikipedia articles consist mostly of free text
– also comprise various types of structured
information
– including: infobox templates, categorisation
information, images, geo-coordinates, links to
external web pages, disambiguation pages,
redirects between pages, other language links
Structure in Wikipedia
Title
Abstract
Infoboxes
Geo-coordinates
Categories
Images
Links
– other language versions
– other Wikipedia pages
– To the Web
– Redirects
– Disambiguations
{{Infobox Korean settlement
| title = Busan Metropolitan City
| img = Busan.jpg
| imgcaption = A view of the [[Geumjeong]] district in Busan
| hangul = 부산광역시
...
| area_km2 = 763.46
| pop = 3635389
| popyear = 2006
| mayor = Hur Nam-sik
| divs = 15 wards (Gu), 1 county (Gun)
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}
dbp:Busan dbp:title "Busan Metropolitan City"
dbp:Busan dbp:hangul "부산광역시"@Hang
dbp:Busan dbp:area_km2 "763.46"^^xsd:float
dbp:Busan dbp:pop "3635389"^^xsd:int
dbp:Busan dbp:region dbp:Yeongnam
dbp:Busan dbp:dialect dbp:Gyeongsang
...
infobox encoding
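The step from infobox markup to triples can be approximated in Python as follows. This is a rough sketch and not the code of the DBpedia extraction framework: the function name, the regular expressions and the datatype guesses are assumptions, and only a shortened version of the Busan infobox is used as input.

```python
import re

INFOBOX = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
}}"""

def extract_raw_triples(subject: str, wikitext: str) -> list:
    """Turn `| key = value` infobox lines into (subject, property, object) triples."""
    triples = []
    for line in wikitext.splitlines():
        m = re.match(r"\|\s*(\w+)\s*=\s*(.+?)\s*$", line)
        if not m:
            continue  # template header, closing braces, or empty parameter
        key, value = m.groups()
        link = re.fullmatch(r"\[\[([^\]|]+)\]\]", value)
        if link:                                   # wiki link -> resource
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):        # integer literal
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):   # decimal literal
            obj = f'"{value}"^^xsd:float'
        else:                                      # plain string literal
            obj = f'"{value}"'
        triples.append((subject, "dbp:" + key, obj))
    return triples

for triple in extract_raw_triples("dbp:Busan", INFOBOX):
    print(*triple)
```

Property names are taken over unchanged from the infobox keys, which is why this kind of raw extraction inherits every naming variation used by template authors.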
Heterogeneity in infoboxes
Björk (Musician)
Occupation = Musician, Actor
Born = 21.12.1965, Reykjavík
Brown (Prime Minister)
office = Prime Minister of the UK
birth_date = 20.4.1951
birth_place = Govan
Romero (Actor)
occupation = Actor, Editor
birthdate = 4.2.1940
birthplace = New York
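A mapping table is one way to reconcile such naming variations. The Python sketch below is illustrative only: dbo:birthDate, dbo:birthPlace and dbo:occupation are DBpedia Ontology properties, but the table, the infobox kind labels and the special case for the combined Born value are assumptions made for this example.

```python
# Hypothetical mapping table: (infobox kind, raw key) -> ontology property.
MAPPINGS = {
    ("Musician", "Occupation"): "dbo:occupation",
    ("Prime Minister", "birth_date"): "dbo:birthDate",
    ("Prime Minister", "birth_place"): "dbo:birthPlace",
    ("Actor", "birthdate"): "dbo:birthDate",
    ("Actor", "birthplace"): "dbo:birthPlace",
    ("Actor", "occupation"): "dbo:occupation",
}

def normalise(kind: str, raw: dict) -> dict:
    """Rewrite raw infobox keys to unified ontology properties."""
    unified = {}
    for key, value in raw.items():
        if (kind, key) == ("Musician", "Born"):
            # This infobox packs date and place into a single value.
            date, place = [part.strip() for part in value.split(",", 1)]
            unified["dbo:birthDate"] = date
            unified["dbo:birthPlace"] = place
        elif (kind, key) in MAPPINGS:
            unified[MAPPINGS[(kind, key)]] = value
        else:
            unified["dbp:" + key] = value  # unmapped keys stay raw
    return unified

bjork = normalise("Musician", {"Occupation": "Musician, Actor",
                               "Born": "21.12.1965, Reykjavík"})
brown = normalise("Prime Minister", {"office": "Prime Minister of the UK",
                                     "birth_date": "20.4.1951",
                                     "birth_place": "Govan"})
romero = normalise("Actor", {"occupation": "Actor, Editor",
                             "birthdate": "4.2.1940",
                             "birthplace": "New York"})

# After normalisation, one property name works across all three entities.
print([p["dbo:birthDate"] for p in (bjork, brown, romero)])
# ['21.12.1965', '20.4.1951', '4.2.1940']
```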
DBpedia Extraction
Framework
DIEF - DBpedia Information Extraction Framework
–extracts structured information from Wikipedia and
turns it into a rich knowledge base
–Mapping-Based Infobox Extraction, Raw Infobox
Extraction, Feature Extraction, Statistical Extraction
–Hosted on GitHub
–Written in Scala & Java
DBpedia Live
–Wikipedia articles are continuously revised at a very high rate
–English Wikipedia, in June 2013, had approximately 3.3 million edits per month (≈ 77 edits per minute)
–DBpedia Live was developed to keep DBpedia in synchronization with Wikipedia
–works on a continuous stream of updates from Wikipedia and processes that stream on the fly
Need for validation
● over 3 million violations
Accessing DBpedia - Browsing
● official DBpedia mirror http://dbpedia.org
⇒ runs on Virtuoso
⇒ point & click browsing via DBpedia VAD
⇒ faceted search with Virtuoso Facets
Accessing DBpedia - SPARQL
● official SPARQL endpoint http://dbpedia.org/sparql
⇒ subject to a fair use policy (limited query runtime)
⇒ iSPARQL frontend (interactive query building)
⇒ Snorql frontend
⇒ query with any SPARQL compliant tool or API
Querying RDF with SPARQL
● SPARQL Protocol and RDF Query Language
⇒ graph patterns as sets of triples (with variables)
⇒ successful matches of graph patterns generate bindings in (sub-)query solutions
● different result types for queries
SELECT ⇒ bindings, ASK ⇒ true/false, CONSTRUCT ⇒ new graph
● combinators and modifiers for basic graph patterns
⇒ UNION, FILTER, MINUS, FILTER (NOT) EXISTS
● result set modifiers
LIMIT, OFFSET, DISTINCT, ORDER BY
● numerous operators and functions for resource and literal values
● many additions in the 1.1 revision:
grouping & aggregates, property path expressions, sub-queries
SPARQL Query Example
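As one possible formulation of the John Cleese question, the Python sketch below holds a SPARQL 1.1 query string and mirrors its logic on a tiny hand-made cast list. It is not claimed to be the query used in the tutorial: dbo:starring and dbo:Film are DBpedia Ontology terms, but listing the other Monty Python members explicitly via VALUES is an assumption of this sketch, and the cast data is abbreviated toy data rather than a DBpedia extract.

```python
# SPARQL 1.1 query text; it uses FILTER NOT EXISTS, DISTINCT, ORDER BY and LIMIT.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT DISTINCT ?film WHERE {
  ?film a dbo:Film ;
        dbo:starring dbr:John_Cleese .
  FILTER NOT EXISTS {
    VALUES ?python { dbr:Graham_Chapman dbr:Terry_Gilliam dbr:Eric_Idle
                     dbr:Terry_Jones dbr:Michael_Palin }
    ?film dbo:starring ?python .
  }
}
ORDER BY ?film
LIMIT 100
"""

# The same logic over a tiny, abbreviated cast list (toy data).
OTHER_PYTHONS = {"Graham_Chapman", "Terry_Gilliam", "Eric_Idle",
                 "Terry_Jones", "Michael_Palin"}
STARRING = {
    "Clockwise_(film)": {"John_Cleese"},
    "A_Fish_Called_Wanda": {"John_Cleese", "Michael_Palin"},
    "Monty_Python's_Life_of_Brian": {"John_Cleese", "Eric_Idle", "Michael_Palin"},
}

def cleese_without_pythons(starring: dict) -> list:
    """Films starring John Cleese whose cast shares nobody with OTHER_PYTHONS."""
    return sorted(film for film, cast in starring.items()
                  if "John_Cleese" in cast and not (cast & OTHER_PYTHONS))

print(cleese_without_pythons(STARRING))  # ['Clockwise_(film)']
```

The set intersection in the helper plays the role of the FILTER NOT EXISTS block: a film is kept only if no other member appears in its cast.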
SPARQL Tooling
● FlintSparqlEditor: JavaScript SPARQL editor
  ● syntax highlighting, code assistance
  ● auto-completion for properties and classes (for small datasets)
● Protégé: full-fledged ontology editor
  ● good for getting an overview of the ontologies backing datasets
  ● two SPARQL plug-ins (one supporting entailment)
● curl or your favourite simple REST client
  ● allows for simple testing of queries from any text editor with SPARQL syntax support (e.g. Emacs, Vim, Sublime Text)
$ curl -H 'Accept: application/json' --data-urlencode \
    "query=$(cat query.sparql)" http://dbpedia.org/sparql
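The curl call above can also be prepared with Python's standard library. The sketch below builds an equivalent POST request with a form-encoded query field; the function name and the placeholder query are assumptions, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"

def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Prepare a POST with a form-encoded `query` field, like the curl call."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(endpoint, data=body,
                   headers={"Accept": "application/json"}, method="POST")

# Placeholder query; in practice this would be the content of query.sparql.
request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it, at which point the endpoint's fair use policy applies.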
DBpedia for Entity Linking and
Disambiguation
● DBpedia Spotlight
  ● web service to detect, disambiguate and link mentions of DBpedia resources in input text
  ● uses two NLP datasets derived from DBpedia
    ⇒ topic signatures - tf-idf weighted term vectors
    ⇒ lexicalisations - alternative names for entities and concepts
● several other entity detection and linking services targeting DBpedia entities:
AlchemyAPI, Ontos Semantic API, OpenCalais, Zemanta
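To illustrate what a tf-idf weighted topic signature is, the Python sketch below builds such vectors for three candidate meanings of "Enterprise" and picks the one closest to a mention's context by cosine similarity. This is a generic textbook construction, not DBpedia Spotlight's implementation; the context texts are invented and the candidate names are illustrative assumptions.

```python
import math
from collections import Counter

# Toy context documents per candidate resource (invented for illustration).
CONTEXTS = {
    "USS_Enterprise_(NCC-1701-D)": "starship federation captain picard crew starship",
    "USS_Enterprise_(CVN-65)": "aircraft carrier navy nuclear crew ship",
    "Enterprise_Products": "company pipeline energy texas company",
}

def tfidf_vectors(docs: dict) -> dict:
    """Build one tf-idf weighted term vector per document."""
    counts = {name: Counter(text.split()) for name, text in docs.items()}
    df = Counter(term for c in counts.values() for term in c)  # document frequency
    n = len(docs)
    return {name: {t: tf * math.log(n / df[t]) for t, tf in c.items()}
            for name, c in counts.items()}

def length(v: dict) -> float:
    """Euclidean length of a sparse vector stored as a dict."""
    return math.sqrt(sum(w * w for w in v.values()))

def cosine(u: dict, v: dict) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = length(u) * length(v)
    return dot / norm if norm else 0.0

def disambiguate(mention_context: str, signatures: dict) -> str:
    """Pick the candidate whose signature is most similar to the context."""
    query = dict(Counter(mention_context.split()))
    return max(signatures, key=lambda name: cosine(query, signatures[name]))

signatures = tfidf_vectors(CONTEXTS)
print(disambiguate("the starship is commanded by captain picard", signatures))
```

Terms that occur in several signatures, such as "crew" here, receive a lower weight than terms that are specific to one candidate.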
Linking DBpedia
● community-curated links to various major and minor external datasets:
target dataset   predicate          out-link count
Freebase         owl:sameAs              3 600 000
YAGO2            rdf:type               18 100 000
UMBEL            rdf:type                  896 400
WordNet          dbp:wordnet_type          467 100
OpenCyc          owl:sameAs                 27 100
LinkedGeoData    owl:sameAs                103 600
GeoNames         owl:sameAs                 86 500
● Linked Data Web analysis with Sinditech measured 3 960 212 in-links to DBpedia (lower bound)
statistics from (Lehmann et al. 2012)
Linking DBpedia -
use cases for Linked DBpedia Data
● correlate the accumulated funding per year from the EU to member countries (from FTS) with the gross domestic product of these countries (DBpedia)
● correlate an above-average share of metropolitan area used for parks or other natural recreational areas with towns and cities being led by environmentalists (LinkedGeoData & DBpedia)
● is there a town with no more than 15000 inhabitants in the area around Leipzig containing a church with Catholic denomination, childcare, a primary school and a grammar school, not currently led by a politician from the conservative party?
DBpedia internationalised
● non-English versions of DBpedia offer
  ● coverage of more entities
  ● more detailed or up-to-date information for entities associated with the particular countries
● international mapping community helps in the provision of localized DBpedia datasets for 125 languages
⇒ own IRI recipe http://<langcode>.dbpedia.org/resource/<thing>
● 15 DBpedia chapters: autonomous management of mappings, organisation of local community, hosting of datasets and services
● also canonicalized datasets: facts derived from localized Wikipedias, but only statements for resources also present in the English DBpedia
⇒ usage of default http://dbpedia.org/resource/ namespace
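The two IRI recipes can be captured in a short Python helper. The function name is hypothetical, and treating the language code "en" as the trigger for the default namespace is an assumption of this sketch.

```python
def resource_iri(thing: str, langcode: str = "en") -> str:
    """Build a DBpedia resource IRI following the recipes on the slide."""
    name = thing.replace(" ", "_")  # page titles use underscores, not spaces
    if langcode == "en":
        # English (and canonicalized) data uses the default namespace.
        return f"http://dbpedia.org/resource/{name}"
    return f"http://{langcode}.dbpedia.org/resource/{name}"

print(resource_iri("Leipzig", "de"))   # http://de.dbpedia.org/resource/Leipzig
print(resource_iri("Monty Python"))    # http://dbpedia.org/resource/Monty_Python
```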
Related Work: Freebase
Similarities:
–extracts structured data from Wikipedia
–makes it available in RDF
–provides dumps of the extracted data
–provides APIs and endpoints to access the data
Related Work: Freebase
Differences:
Freebase
- uses several sources → higher coverage
- can be directly edited by users
- mainly run by Google (discontinued)
DBpedia
- RDF representation of Wikipedia
- hub on the Web of Data
- can only be edited indirectly, by modifying the content of Wikipedia
- ongoing community effort
Related Work: Wikidata
– initiated by Wikimedia Germany e.V. in 2012
– free knowledge base about the world that can be read and edited by humans and machines alike
– can offer a variety of statements from different sources
and dates
– does not offer the truth about things:
• (-) Berlin has a population of 3.5 million
• (+) Wikidata contains the statement about Berlin’s
population being 3.5 million as of 2011 according to
the German statistical office
– aim is to provide a single point of truth for facts in
Wikipedia across different language versions
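The difference between a bare fact and a sourced statement, as in the Berlin example, can be pictured with a small Python data structure. The class and field names are assumptions of this sketch and do not reproduce Wikidata's actual data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    """A claim together with its context, rather than a bare fact."""
    subject: str
    prop: str
    value: float
    as_of: int    # year the value refers to
    source: str   # who made the claim

berlin_population = Statement(
    subject="Berlin", prop="population", value=3.5e6,
    as_of=2011, source="German statistical office",
)

def describe(s: Statement) -> str:
    """Render the statement as a sourced, dated claim."""
    return (f"{s.subject} has a {s.prop} of {s.value:,.0f} "
            f"as of {s.as_of} according to the {s.source}")

print(describe(berlin_population))
```

Because date and source travel with the value, several differing statements about the same property can coexist without contradiction.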
Current developments
● increased validation and curation process (DBpedia+, RDFUnit)
● easier creation of local DBpedia SPARQL endpoints (Debian packaging, Docker images of triple store and dataset selection, automatic import)
● novel, more intuitive and feature-rich browsing interfaces
⇒ add corrections in place in LD viewer interfaces (?)
How you can get involved
–set up new mirrors and endpoints of DBpedia
–revise mappings and/or write new ones
–help improve the ontology
–get involved with the Irish/Gaelic chapter
bianca.pereira@insight-centre.org
caoilfhionn.lane@insight-centre.org
–edit Wikipedia
Further Reading: Website
landing page:
http://dbpedia.org/About
overview of datasets (also info on localized
datasets):
http://wiki.dbpedia.org/Datasets
DBpedia data access overview:
http://wiki.dbpedia.org/OnlineAccess
Further Reading: Publications
2007
T: DBpedia: A Nucleus for a Web of Open Data
A: Auer, Bizer, Kobilarov, Lehmann, Cyganiak, Ives
http://www.cis.upenn.edu/~zives/research/dbpedia.pdf
2009
T: DBpedia - A Crystallization Point for the Web of Data
A: Bizer, Lehmann, Kobilarov, Auer, Becker, Cyganiak, Hellmann
http://jens-lehmann.org/files/2009/dbpedia_jws.pdf
2012
T: DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia
A: Lehmann, Isele, Jakob, Jentzsch, Kontokostas, Hellmann, Morsey, van Kleef, Auer, Bizer
http://www.semantic-web-journal.net/system/files/swj499.pdf
Further Reading: W3C Specs
RDF:
http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/
RDFS: http://www.w3.org/TR/rdf-schema/
OWL 2: http://www.w3.org/TR/owl2-overview/
SPARQL Query Language:
http://www.w3.org/TR/sparql11-query/
SPARQL Protocol:
http://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/
Further Reading: Browsing
DBpedia VAD: http://dbpedia.org/page/DBpedia
DBpedia Facets: http://dbpedia.org/fct/
new DBpedia frontend:
http://de.dbpedia.org/page/DBpedia (to get an impression of the
German DBpedia version)
https://github.com/lukovnikov/ldviewer (source code)
Context platform:
http://context.aksw.org/app/hub.php?corpus=6&action=facets
(online demo to browse LOD2 Blog)
http://context.aksw.org/app/ (project home)
Further Reading: SPARQL
DBpedia Snorql SPARQL interface (DBP-en):
http://dbpedia.org/snorql/
John Cleese Query in Snorql: http://bit.ly/1zog24A
EU Funding vs. Country GDP:
https://gist.github.com/neradis/0ca7a41c408280c0d69e
Flint SPARQL Editor:
http://openuplabs.tso.co.uk/demos/sparqleditor (online
demo)
https://github.com/TSO-Openup/FlintSparqlEditor (source
code, checkout and run)
Further Reading:
popular RDF/OWL frameworks
Sesame (Java): http://rdf4j.org/
Jena (Java): http://jena.apache.org/index.html
RDFLib (Python): http://code.google.com/p/rdflib/
Goodbye!
Thank you for your interest in DBpedia!

Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 

DBpedia Tutorial - Feb 2015, Dublin

  • 1. DBpedia Tutorial 09.02.2015 http://dbpedia.org. Creating Knowledge out of Interlinked Data. Markus Ackermann, Markus Freudenberg, WG Agile Knowledge and Semantic Web, Universität Leipzig. DBpedia: Extraction of Knowledge from Wikipedia
  • 2. Wikipedia. Wikipedia coverage of the London bombing on July 7, 2005: the first Wikipedia entry appeared in just 18 minutes; 2,500 users provided a 14-page article in only 12 hours; far more detailed than any other news source [Tapscott, D. Williams 2006]
  • 3. Wikipedia. Wikipedia articles: 4.7 million articles, with 780 article additions per day; are highly topical; contain only few errors, which can easily be revised; often cover very specific content. → Wikipedia is the knowledge compendium of humanity.
  • 4. Semantic Web: a Web 3.0 technology; a way of linking data between systems or entities; allows for rich, self-describing interrelations of data available across the globe; opens up the web of data to artificial intelligence processes; encourages companies, organisations and individuals to publish their data freely, in an open standard format; encourages businesses to use data already available on the web (data give/take)
  • 5. Linked Data. The means of populating the Semantic Web is Linked Data (introduced by Tim Berners-Lee). Four simple rules: (1) Use URIs as names for things. (2) Use HTTP URIs so that people can look up those names. (3) When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL). (4) Include links to other URIs, so that they can discover more things.
  • 6. 5 ★ Linked Open Data
  • 7. Benefits of using Linked Data. Consumer view: link data from any other place in the web; discover more related data while consuming data; reuse parts of the data; reuse existing tools and libraries; combine data safely with other data; query data over different repositories. Publisher view: make your data discoverable; increase the value of your data (by linking it); have fine-granular control over the data items and optimise their access; design data to fit your domain knowledge
  • 8. What's DBpedia? DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. It shares a common goal with Wikidata, but takes a different approach.
  • 9. What's DBpedia? The DBpedia project was started in 2006; it has been a key factor for the success of the Linked Open Data initiative; it serves as an interlinking hub for other data sets; it provides a testbed serving real data spanning various domains; it is available in more than 120 language editions
  • 10. Where is Wikipedia information useful? "Which films starred John Cleese without any other members of Monty Python?" "What do Dublin and Leipzig have in common?" "Which software products are developed by an organisation founded in California?" "Which populated places in Germany are below sea level?"
  • 11. Where is Wikipedia information useful? ● as terminology and concept repository and fact source for Entity Linking and Disambiguation: "The series follows the adventures of a space-faring crew on board the starship USS Enterprise (NCC-1701-D), the fifth Federation vessel to bear the name and registry and the seventh starship by that name. The Enterprise is commanded by Captain Jean-Luc Picard and is staffed by first officer Commander William Riker, operations manager Data, security chief Tasha Yar, ship's counselor Deanna Troi, chief medical officer Dr. Beverly Crusher, conn officer Lieutenant Geordi La Forge, and junior officer Lieutenant Worf." ⇒ no company, no aircraft carrier, no satellite ⇒ correlate the mentions and the concept starship ⇒ Star Trek rank, contemporary or past military or law enforcement
  • 12. Why search engines aren't always enough: "Which films starred John Cleese without any other members of Monty Python?"
  • 13.
  • 14.
  • 15. What is needed to do better? ● ontological representation of entities and facts: "An ontology is a specification of a conceptualization." (Gruber, 1993) ⇒ formal description of concepts and relationships
  • 16. What is needed to do better? ● ontological representation of entities and facts ● well-defined taxonomy of entity types ● assertions about entities and their relations: A British Comedy is a kind of Comedy. A Comedy is a kind of Film. A British Comedy is a kind of Film. Clockwise is a British Comedy. John Cleese stars in Clockwise. John Cleese stars in a Film. ● thoroughly specified, machine-actionable, but flexible formalism for representation
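The chain of assertions on this slide can be sketched in a few lines of Python. This is only a toy illustration, assuming a single-inheritance taxonomy; the dictionaries and function names (`SUBCLASS_OF`, `is_a`, `stars_in_a`) are made up for this sketch and are not part of DBpedia.

```python
# Toy taxonomy from the slide: each class points to its direct superclass.
# Assumption: single inheritance, which keeps the lookup a simple chain walk.
SUBCLASS_OF = {
    "BritishComedy": "Comedy",
    "Comedy": "Film",
}

# Assertions about individual entities (illustrative identifiers).
TYPE_OF = {"Clockwise": "BritishComedy"}
STARS_IN = {("John_Cleese", "Clockwise")}


def superclasses(cls):
    """Return cls followed by all of its ancestors in the taxonomy."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def is_a(entity, cls):
    """True if the entity's asserted type is cls or a subclass of cls."""
    asserted = TYPE_OF.get(entity)
    return asserted is not None and cls in superclasses(asserted)


def stars_in_a(actor, cls):
    """True if the actor stars in at least one entity of type cls."""
    return any(a == actor and is_a(work, cls) for a, work in STARS_IN)
```

With these definitions, `is_a("Clockwise", "Film")` and `stars_in_a("John_Cleese", "Film")` both follow from the asserted facts alone, which is the kind of inference a formal taxonomy makes possible.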
  • 17. A brief introduction to RDF: Resource Description Framework (W3C Standard) ● flexible language and data model for representation of information ● based on (S, P, O) triples denoting simple assertions: S (subject) ∈ I ∪ B, P (property) ∈ I, O (object) ∈ I ∪ B ∪ L, where I = URIs/IRIs, B = blank nodes, L = literals ● URIs/IRIs of named entities are unambiguous, but non-unique identifiers of a resource, and often dereferenceable (in the Semantic Web) ● the aggregate of triple assertions constitutes a directed graph with typed edges
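The position constraints on triples (subject is an IRI or blank node, property is an IRI, object may also be a literal) can be made concrete with a small Python sketch. The classes below are hypothetical stand-ins written for this illustration, not the API of any RDF library, and the resource name `Clockwise_(film)` is an assumed example identifier.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = "http://www.w3.org/2001/XMLSchema#string"


Term = Union[IRI, BlankNode, Literal]
Triple = Tuple[Term, Term, Term]


def make_triple(s: Term, p: Term, o: Term) -> Triple:
    """Build a triple, enforcing S in I∪B, P in I, O in I∪B∪L."""
    if not isinstance(s, (IRI, BlankNode)):
        raise TypeError("subject must be an IRI or a blank node")
    if not isinstance(p, IRI):
        raise TypeError("property must be an IRI")
    if not isinstance(o, (IRI, BlankNode, Literal)):
        raise TypeError("object must be an IRI, blank node or literal")
    return (s, p, o)


DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# A set of triples is a directed graph whose edges are typed by the property.
graph = {
    make_triple(IRI(DBR + "Clockwise_(film)"), IRI(DBO + "starring"), IRI(DBR + "John_Cleese")),
    make_triple(IRI(DBR + "John_Cleese"), IRI(RDFS_LABEL), Literal("John Cleese")),
}


def outgoing_edges(triples, node):
    """Return (property, object) pairs for all edges leaving the node."""
    return {(p, o) for s, p, o in triples if s == node}
```

Because a literal can only appear in object position, `make_triple(Literal("x"), ...)` is rejected, which mirrors the set definitions on the slide.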
  • 18. A brief introduction to RDF
  • 19. DBpedia: motivation and use cases. An RDF view of structured Wikipedia information enables: ● sophisticated queries ⇒ cross-referencing facts of entities ⇒ filtering of entities based on their types and fact assertions ● combining facts from Wikipedia with machine-actionable knowledge from other structured datasets (Geodata, Yellowpages, WordNet, ...)
  • 20. Another take on Question Answering: "Which films starred John Cleese without any other members of Monty Python?"
  • 21.
  • 22. DBpedia: contents and datasets ● Wikipedia article ⇔ DBpedia resource: http://en.wikipedia.org/wiki/Monty_Python ⇔ http://dbpedia.org/resource/Monty_Python ● mapping-based types and facts governed by the DBpedia Ontology
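The article-to-resource correspondence can be sketched as a simple URL rewrite. This is a minimal sketch that assumes the article title is carried over unchanged, as in the Monty_Python example; the function name is made up here, and real DBpedia identifiers may involve additional encoding rules that this sketch ignores.

```python
from urllib.parse import urlsplit

WIKI_PATH_PREFIX = "/wiki/"
DBPEDIA_RESOURCE_NS = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(article_url):
    """Map an English Wikipedia article URL to the matching DBpedia resource URI.

    Assumption: the title segment is reused verbatim.
    """
    parts = urlsplit(article_url)
    if parts.netloc != "en.wikipedia.org":
        raise ValueError("expected an en.wikipedia.org URL")
    if not parts.path.startswith(WIKI_PATH_PREFIX):
        raise ValueError("expected a /wiki/<title> path")
    title = parts.path[len(WIKI_PATH_PREFIX):]
    if not title:
        raise ValueError("missing article title")
    return DBPEDIA_RESOURCE_NS + title
```

For the slide's example, `wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Monty_Python")` yields `http://dbpedia.org/resource/Monty_Python`.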
  • 23. DBpedia: contents and datasets ● 4.58 million entities and 583 million triples (English DBpedia 2014): 131.2 million fact assertions (derived from infoboxes), 168.5 million triples representing Wikipedia structure, 57.1 million links to external datasets ● DBpedia resources are categorised in several ways: by Wikipedia categories (represented in SKOS), by YAGO classification, by links to WordNet synsets, by assignment of classes from the DBpedia ontology ● provenance metadata ⇒ From which part of which Wikipedia page was a triple derived?
  • 24. Mappings Wiki: a community effort to develop an ontology schema and to provide mappings from Wikipedia infobox properties to this ontology → creating an alignment between Wikipedia and DBpedia → eliminating name variations in properties and classes → big boost for precision
  • 25. DBpedia Ontology: a cross-domain ontology; manually created based on the most commonly used infoboxes; currently covers 685 classes which form a subsumption hierarchy and are described by 2,795 different properties; the subsumption hierarchy has a maximal depth of 5; maintained and extended by the community in the DBpedia Mappings Wiki
  • 26. DBpedia Ontology Extract
  • 27. Wikipedia articles: consist mostly of free text; also comprise various types of structured information, including infobox templates, categorisation information, images, geo-coordinates, links to external web pages, disambiguation pages, redirects between pages, other language links. Article outline: title, abstract, infoboxes, geo-coordinates, categories, images, links (other language versions, other Wikipedia pages, to the Web, redirects, disambiguations)
  • 28. Structure in Wikipedia: title, abstract, infoboxes, geo-coordinates, categories, images, links (other language versions, other Wikipedia pages, to the Web, redirects, disambiguations)
  • 29. Infobox encoding. Infobox source:
      {{Infobox Korean settlement
      | title = Busan Metropolitan City
      | img = Busan.jpg
      | imgcaption = A view of the [[Geumjeong]] district in Busan
      | hangul = 부산광역시
      ...
      | area_km2 = 763.46
      | pop = 3635389
      | popyear = 2006
      | mayor = Hur Nam-sik
      | divs = 15 wards (Gu), 1 county (Gun)
      | region = [[Yeongnam]]
      | dialect = [[Gyeongsang]]
      }}
    Extracted triples:
      dbp:Busan dbp:title "Busan Metropolitan City"
      dbp:Busan dbp:hangul "부산광역시"@Hang
      dbp:Busan dbp:area_km2 "763.46"^^xsd:float
      dbp:Busan dbp:pop "3635389"^^xsd:int
      dbp:Busan dbp:region dbp:Yeongnam
      dbp:Busan dbp:dialect dbp:Gyeongsang
      ...
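How infobox key/value pairs turn into raw triples can be sketched as follows. This is a deliberately naive line-based parser written for illustration only; it is not the DBpedia extraction framework, and it assumes one `| key = value` pair per line, treats a value as a resource only when it is a single wiki link, and reuses the `dbp:` prefix from the slide for both properties and resources.

```python
import re

# A value that consists solely of one wiki link, e.g. [[Yeongnam]].
WIKI_LINK = re.compile(r"^\[\[([^\]|]+)\]\]$")


def parse_value(raw):
    """Classify an infobox value as a resource link, a number, or a plain string."""
    raw = raw.strip()
    link = WIKI_LINK.match(raw)
    if link:
        return ("resource", "dbp:" + link.group(1).strip().replace(" ", "_"))
    if re.fullmatch(r"-?\d+", raw):
        return ("xsd:int", int(raw))
    if re.fullmatch(r"-?\d+\.\d+", raw):
        return ("xsd:float", float(raw))
    return ("string", raw)


def infobox_to_triples(subject, wikitext):
    """Turn each '| key = value' line into a (subject, property, value) triple."""
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue  # skip the template header, the closing braces, etc.
        key, _, value = line[1:].partition("=")
        key = key.strip()
        if key and value.strip():
            triples.append((subject, "dbp:" + key, parse_value(value)))
    return triples


SAMPLE = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
}}"""

triples = infobox_to_triples("dbp:Busan", SAMPLE)
```

Values that mix text and links, such as the image caption in the slide, fall through to the plain-string case here; a production extractor needs considerably more parsing logic.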
  • 30. Heterogeneity in infoboxes
  • 31. Björk (Musician): Occupation = Musician, Actor; Born = 21.12.1965, Reykjavík. Brown (Prime Minister): office = Prime Minister of the UK; birth_date = 20.4.1951; birth_place = Govan. Romero (Actor): occupation = Actor, Editor; birthdate = 4.2.1940; birthplace = New York
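The name variations above (`birth_date` vs. `birthdate`, `Occupation` vs. `occupation`) are what mappings are meant to eliminate. The sketch below illustrates the idea with a plain lookup table; the template names are taken from the parentheses on the slide, the table itself is hypothetical, and the real Mappings Wiki uses its own template-based mapping syntax rather than a Python dictionary.

```python
# Hypothetical mapping table: (infobox template, property name) -> ontology property.
MAPPINGS = {
    ("Musician", "Occupation"): "dbo:occupation",
    ("Actor", "occupation"): "dbo:occupation",
    ("Prime Minister", "birth_date"): "dbo:birthDate",
    ("Actor", "birthdate"): "dbo:birthDate",
    ("Prime Minister", "birth_place"): "dbo:birthPlace",
    ("Actor", "birthplace"): "dbo:birthPlace",
}


def map_infobox(template, properties):
    """Rename infobox properties to ontology properties; drop unmapped ones."""
    mapped = {}
    for name, value in properties.items():
        target = MAPPINGS.get((template, name))
        if target is not None:
            mapped[target] = value
    return mapped


brown = map_infobox(
    "Prime Minister",
    {"office": "Prime Minister of the UK", "birth_date": "20.4.1951", "birth_place": "Govan"},
)
romero = map_infobox(
    "Actor",
    {"occupation": "Actor, Editor", "birthdate": "4.2.1940", "birthplace": "New York"},
)
```

After mapping, both records expose the birth date under the same key, so a single query over `dbo:birthDate` covers both infobox styles. A combined field such as Björk's `Born` (date plus place) would need value-level parsing that this sketch does not attempt.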
  • 32. DBpedia Extraction Framework. DIEF (DBpedia Information Extraction Framework): extracts structured information from Wikipedia and turns it into a rich knowledge base; Mapping-Based Infobox Extraction, Raw Infobox Extraction, Feature Extraction, Statistical Extraction; hosted on GitHub; written in Scala & Java
  • 33.
  • 34. DBpedia Live: Wikipedia articles are continuously revised at a very high rate; English Wikipedia, in June 2013, had approximately 3.3 million edits per month (corresponding to about 77 edits per minute); DBpedia Live was developed to keep DBpedia in synchronization with Wikipedia; it works on a continuous stream of updates from Wikipedia and processes that stream on the fly
  • 35. Need for validation ● over 3 million violations
  • 36. Accessing DBpedia: Browsing ● official DBpedia mirror http://dbpedia.org ⇒ runs on Virtuoso ⇒ point & click browsing via DBpedia VAD ⇒ faceted search with Virtuoso Facets
  • 37. Accessing DBpedia: SPARQL ● official SPARQL endpoint http://dbpedia.org/sparql ⇒ subject to a fair use policy (limited query runtime) ⇒ iSPARQL frontend (interactive query building) ⇒ Snorql frontend ⇒ query with any SPARQL compliant tool or API
  • 38. Querying RDF with SPARQL ● SPARQL Protocol and RDF Query Language ⇒ graph patterns as sets of triples (with variables) ⇒ successful matches of graph patterns generate bindings in (sub-)query solutions
  • 39. Querying RDF with SPARQL ● SPARQL Protocol and RDF Query Language ⇒ graph patterns as sets of triples (with variables) ⇒ successful matches of graph patterns generate bindings in (sub-)query solutions ● different result types for queries: SELECT ⇒ bindings, ASK ⇒ true/false, CONSTRUCT ⇒ new graph ● combinators and modifiers for basic graph patterns ⇒ UNION, FILTER, MINUS, FILTER (NOT) EXISTS ● result set modifiers: LIMIT, OFFSET, DISTINCT, ORDER BY ● numerous functions and operators for resource and literal values ● many additions in the 1.1 revision: grouping & aggregates, regular property path expressions, sub-queries
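How matching a basic graph pattern produces variable bindings can be sketched with a tiny in-memory matcher. This is a conceptual illustration in plain Python, not a SPARQL engine: terms are bare strings, variables are strings starting with `?`, and the graph is a handful of illustrative triples.

```python
def is_var(term):
    """Variables are written like SPARQL variables: a leading question mark."""
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend binding so that pattern matches triple; return None on conflict."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if extended.setdefault(p_term, t_term) != t_term:
                return None  # variable already bound to a different value
        elif p_term != t_term:
            return None  # constant term does not match
    return extended


def match_bgp(patterns, graph):
    """Return all bindings that satisfy every triple pattern (SELECT-like)."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in graph
            if (extended := match_triple(pattern, triple, binding)) is not None
        ]
    return solutions


# Illustrative toy graph, not an excerpt of DBpedia.
GRAPH = [
    ("Clockwise", "rdf:type", "Film"),
    ("Clockwise", "starring", "John_Cleese"),
    ("Life_of_Brian", "rdf:type", "Film"),
    ("Life_of_Brian", "starring", "John_Cleese"),
    ("Life_of_Brian", "starring", "Michael_Palin"),
]

# SELECT-like: films starring John Cleese.
films = match_bgp([("?film", "rdf:type", "Film"), ("?film", "starring", "John_Cleese")], GRAPH)

# ASK-like: is there any solution at all?
any_film = bool(films)

# FILTER NOT EXISTS-like: drop films that also star Michael Palin.
without_palin = [
    s for s in films
    if not match_bgp([(s["?film"], "starring", "Michael_Palin")], GRAPH)
]
```

Each triple pattern narrows the set of candidate bindings, and a variable shared across patterns (`?film`) acts as the join condition. The last step mirrors how a negated sub-pattern removes solutions.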
  • 40. SPARQL Query Example
  • 41. SPARQL Tooling ● FlintSparqlEditor: JavaScript SPARQL editor with syntax highlighting, code assistance, and auto-completion for properties and classes (for small datasets) ● Protégé: full-fledged ontology editor; good to get an overview of ontologies backing datasets; two SPARQL plug-ins (one supporting entailment) ● curl or your favourite simple REST API: allows for simple testing of queries from any text editor with SPARQL syntax support (e.g. Emacs, Vim, Sublime Text): $ curl -H 'Accept: application/json' --data-urlencode "query=$(cat query.sparql)" http://dbpedia.org/sparql
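The curl call can be mirrored with the Python standard library. The sketch below is one possible equivalent, not an official client: the example query is a simple assumption (films with `dbo:starring` pointing at `dbr:John_Cleese`), the function names are made up, and the `Accept` header uses the standard SPARQL JSON results media type instead of the slide's `application/json`.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"

# Assumed example query; adjust to whatever is stored in query.sparql.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?film WHERE {
  ?film dbo:starring dbr:John_Cleese .
} LIMIT 10
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a form-encoded POST request, like curl --data-urlencode."""
    body = urllib.parse.urlencode({"query": query}).encode("ascii")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Accept": "application/sparql-results+json"},
    )


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return the list of result bindings (needs network)."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=timeout) as response:
        results = json.load(response)
    return results["results"]["bindings"]
```

Calling `run_query(QUERY)` sends the request to the public endpoint, which is subject to the fair use policy mentioned earlier, so long-running queries may be cut off.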
  • 42. DBpedia for Entity Linking and Disambiguation ● DBpedia Spotlight: web service to detect, disambiguate and link mentions of DBpedia resources in input text; uses two NLP datasets derived from DBpedia ⇒ topic signatures: tf/idf weighted term vectors ⇒ lexicalisations: alternative names for entities and concepts ● several other entity detection and linking services targeting DBpedia entities: AlchemyAPI, Ontos Semantic API, OpenCalais, Zemanta
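The idea of disambiguating a mention with tf/idf weighted term vectors can be sketched as follows. This is a simplified illustration of the general technique, not DBpedia Spotlight's actual implementation; the candidate names and their term lists are invented for the "Enterprise" example from the earlier slide.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Compute a tf-idf weighted term vector per candidate.

    documents maps a candidate name to the list of terms describing it.
    """
    n_docs = len(documents)
    doc_freq = Counter(term for terms in documents.values() for term in set(terms))
    return {
        name: {t: count * math.log(n_docs / doc_freq[t]) for t, count in Counter(terms).items()}
        for name, terms in documents.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(context_terms, signatures):
    """Pick the candidate whose topic signature is closest to the context."""
    context = Counter(context_terms)
    return max(signatures, key=lambda name: cosine(context, signatures[name]))


# Invented topic signatures for three senses of "Enterprise".
CANDIDATES = {
    "starship": ["starship", "federation", "captain", "picard", "crew"],
    "aircraft_carrier": ["aircraft", "carrier", "navy", "crew", "captain"],
    "company": ["business", "revenue", "market"],
}

signatures = tf_idf_vectors(CANDIDATES)
best = disambiguate(["captain", "picard", "starship", "crew"], signatures)
```

Terms shared by several candidates ("captain", "crew") receive low weights, so the decision rests on the distinctive terms in the surrounding text.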
  • 43. DBpedia for Entity Linking and Disambiguation
  • 44.
  • 45. Linking DBpedia ● community-curated links to various major and minor external datasets:
      target dataset    predicate          out-link count
      Freebase          owl:sameAs          3 600 000
      YAGO2             rdf:type           18 100 000
      UMBEL             rdf:type              896 400
      WordNet           dbp:wordnet_type      467 100
      OpenCyc           owl:sameAs             27 100
      LinkedGeoData     owl:sameAs            103 600
      GeoNames          owl:sameAs             86 500
    ● Linked Data Web analysis with Sinditech measured 3 960 212 in-links to DBpedia (lower bound) ● statistics from (Lehmann et al. 2012)
  • 46. Linking DBpedia: use cases for Linked DBpedia Data ● correlate the accumulated funding per year from the EU to member countries (from FTS) with the gross domestic product of these countries (DBpedia) ● correlate an above-average share of metropolitan area used for parks or other natural recreational areas with towns and cities led by environmentalists (LinkedGeoData & DBpedia) ● is there a town with no more than 15000 inhabitants in the area around Leipzig containing a church with Catholic denomination, childcare, a primary school and a grammar school, not currently led by a politician from the conservative party?
  • 47. DBpedia internationalised ● non-English versions of DBpedia offer coverage of more entities and more detailed or up-to-date information for entities associated with the particular countries ● an international mapping community helps in the provision of localized DBpedia datasets for 125 languages ⇒ own IRI recipe http://<langcode>.dbpedia.org/resource/<thing> ● 15 DBpedia chapters: autonomous management of mappings, organisation of the local community, hosting of datasets and services ● also canonicalized datasets: facts derived from localized Wikipedias, but only statements for resources also present in English DBpedia ⇒ usage of the default http://dbpedia.org/resource/ namespace
  • 48. DBpedia internationalised
  • 49. Related Work: Freebase. Similarities: extracts structured data from Wikipedia; makes it available in RDF; provides dumps of the extracted data; provides APIs and endpoints to access the data
  • 50. Related Work: Freebase. Differences. Freebase: uses several sources → higher coverage; can be directly edited by users; mainly run by Google (discontinued). DBpedia: RDF representation of Wikipedia; hub on the Web of Data; can only be edited indirectly, by modifying the content of Wikipedia; ongoing community effort
  • 51. Related Work: Wikidata. Initiated by Wikimedia Germany e.V. in 2012; a free knowledge base about the world that can be read and edited by humans and machines alike; can offer a variety of statements from different sources and dates; does not offer the truth about things: (-) Berlin has a population of 3.5 million; (+) Wikidata contains the statement about Berlin's population being 3.5 million as of 2011 according to the German statistical office; the aim is to provide a single point of truth for facts in Wikipedia across different language versions
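The difference between a bare fact and a sourced statement can be sketched as a small data structure. This is an illustrative model only; the class and field names are assumptions made for this sketch and do not reflect Wikidata's actual data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """A claim together with its context, rather than a bare fact."""
    subject: str
    prop: str
    value: object
    as_of: Optional[int] = None    # year the value refers to
    source: Optional[str] = None   # who asserted it


def describe(statement):
    """Render a statement so that date and source stay attached to the value."""
    text = f"{statement.subject} {statement.prop} = {statement.value}"
    if statement.as_of is not None:
        text += f" (as of {statement.as_of})"
    if statement.source is not None:
        text += f" according to {statement.source}"
    return text


# The example from the slide, expressed as a sourced statement.
berlin_population = Statement(
    subject="Berlin",
    prop="population",
    value=3_500_000,
    as_of=2011,
    source="the German statistical office",
)

summary = describe(berlin_population)
```

Keeping `as_of` and `source` on every statement lets several values for the same property coexist without any of them being treated as the single truth.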
  • 52. Current developments ● increased validation and curation process (DBpedia+, RDFUnit) ● easing the creation of local DBpedia SPARQL endpoints (Debian packaging, docker images of triple store and dataset selection, automatic import) ● novel, more intuitive and feature-rich browsing interfaces ⇒ add corrections in place in LD viewer interfaces (?)
  • 53. How you can get involved: set up new mirrors and endpoints of DBpedia; revise mappings and/or write new ones; help improve the ontology; get involved with the Irish/Gaelic chapter (bianca.pereira@insight-centre.org, caoilfhionn.lane@insight-centre.org); edit Wikipedia
  • 54. Further Reading: Website. Landing page: http://dbpedia.org/About. Overview of datasets (also info on localized datasets): http://wiki.dbpedia.org/Datasets. DBpedia data access overview: http://wiki.dbpedia.org/OnlineAccess
  • 55. Further Reading: Publications. 2007: "DBpedia: A Nucleus for a Web of Open Data", Auer, Bizer, Kobilarov, Lehmann, Cyganiak, Ives, http://www.cis.upenn.edu/~zives/research/dbpedia.pdf. 2009: "DBpedia - A Crystallization Point for the Web of Data", Bizer, Lehmann, Kobilarov, Auer, Becker, Cyganiak, Hellmann, http://jens-lehmann.org/files/2009/dbpedia_jws.pdf. 2012: "DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia", Lehmann, Isele, Jakob, Jentzsch, Kontokostas, Hellmann, Morsey, van Kleef, Auer, Bizer, http://www.semantic-web-journal.net/system/files/swj499.pdf
  • 56. Further Reading: W3C Specs. RDF: http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/ RDFS: http://www.w3.org/TR/rdf-schema/ OWL 2: http://www.w3.org/TR/owl2-overview/ SPARQL Query Language: http://www.w3.org/TR/sparql11-query/ SPARQL Protocol: http://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/
  • 57. Further Reading: Browsing. DBpedia VAD: http://dbpedia.org/page/DBpedia. DBpedia Facets: http://dbpedia.org/fct/. New DBpedia frontend: http://de.dbpedia.org/page/DBpedia (get an impression of the German DBpedia version), https://github.com/lukovnikov/ldviewer (source code). Context platform: http://context.aksw.org/app/hub.php?corpus=6&action=facets (online demo to browse the LOD2 Blog), http://context.aksw.org/app/ (project home)
  • 58. Further Reading: SPARQL. DBpedia Snorql SPARQL interface (DBP-en): http://dbpedia.org/snorql/. John Cleese query in Snorql: http://bit.ly/1zog24A. EU Funding vs. Country GDP: https://gist.github.com/neradis/0ca7a41c408280c0d69e. Flint SPARQL Editor: http://openuplabs.tso.co.uk/demos/sparqleditor (online demo), https://github.com/TSO-Openup/FlintSparqlEditor (source code, checkout and run)
  • 59. Further Reading: popular RDF/OWL frameworks. Sesame (Java): http://rdf4j.org/ Jena (Java): http://jena.apache.org/index.html RDFLib (Python): http://code.google.com/p/rdflib/
  • 60. Goodbye! Thank you for your interest in DBpedia!