Linked Open Data
SIKS course on Data Science

May 20, 2016 Vught.

Laura Hollink
Why do we create and use Linked Open Data?
Example questions from
the humanities and
social sciences
How did the debate about
the financial crisis in
Greece develop?
Searching the proceedings of the EU Parliament
"Greece" in the plenary meetings of the European Parliament
[Figure: number of mentions per year, 1999-2013]
Searching through newspaper archives
Mentions of “Griekenland” in the Dutch newspaper the Telegraaf.
Search volumes on a search engine
Query = “Greece”
http://www.google.com/trends
We need access to data. Analysing
them gives us some useful insight.
But to answer the question properly
we would need to combine sources
and do more complex queries.
Why do we create and use Linked Open Data?
Example question 2 

Which political debate in the
post-war period has attracted
most media attention?
“De Indonesische Quaestie"
To answer this question we need to
go through all newspaper articles
about all political debates.
-> we need access to combined
data sources, we need
structured queries.
Why do we create and use Linked Open Data?
Example question 3
What are the differences
between different media?

Example question 4
Has the coverage changed
over time?
Research goals and research questions
Our goal is to build an infrastructure to answer these kinds of questions.

1. How do we automatically link heterogeneous datasets?

2. How do we interpret links between datasets of different quality and certainty?

3. What can we conclude from usage statistics on these datasets?

4. Can we design interfaces that allow scholars to study the datasets

• including the links between them?

• while assessing the reliability of the findings?
Data Science - Big Data - Linked Open Data
Table of Contents
1. What is Linked Open Data (LOD)
2. Creating LOD
1. How to discover links
2. How to represent links on the Web
3. How to evaluate links
3. Access to LOD (from both the server and the client
perspective)
What is Linked Open Data?
A method of publishing structured data on the Web
in such a way that it can be linked and queried
by computers as well as humans.
The Web of Documents
• Documents identified by URIs (html, pdf, images, movies, etc.)
• with structured information for humans (tables, headers) and
• with hyperlinks between them
• The data is not machine readable, meant for humans
• structure is implicit (what do the columns of a table mean?)
• links are not typed (what is the relation between two documents?)
The Web of Data
• Everything identified by URIs (not just documents, but also classes, instances, relations/links)
• The data is machine readable:
  • in formal languages (RDF, RDFS, OWL, SKOS)
  • which enable machines to do reasoning, i.e. infer new statements from inserted statements.
Compared to a database table…
As a graph:
Amsterdam --is a--> City
Amsterdam --has population--> “1364422”
Amsterdam --has airport--> Schiphol
As a table:
Thing     | Type | Population | Airport
Amsterdam | City | 1364422    | Schiphol
…         | …    | …          | …
Differences:
• Statements can be distributed over the web
• Non-unique naming assumption
• Open World assumption
• Everyone can say anything about anything
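The relation between the table and the graph can be sketched in a few lines of Python. This is only an illustration: the `rows` list and the `column_to_predicate` mapping are assumptions made for the example, not part of any real dataset or library.

```python
# Hypothetical in-memory table; the column names mirror the slide.
rows = [
    {"Thing": "Amsterdam", "Type": "City", "Population": "1364422", "Airport": "Schiphol"},
]

# Assumed mapping from column name to predicate label (illustrative only).
column_to_predicate = {
    "Type": "is a",
    "Population": "has population",
    "Airport": "has airport",
}

def row_to_triples(row):
    """Turn one table row into (subject, predicate, object) statements."""
    subject = row["Thing"]
    return [
        (subject, predicate, row[column])
        for column, predicate in column_to_predicate.items()
        if row.get(column) is not None  # a missing cell simply yields no statement
    ]

triples = [t for row in rows for t in row_to_triples(row)]
for triple in triples:
    print(triple)
```

Each cell becomes one independent statement, which is why statements can be distributed over the web and why a missing value is simply an absent triple rather than a NULL.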
Examples of URIs on the Web of Data
• documents:
• http://vu.nl/index.html

• http://example.org/cities#Leuven

• real world objects (a book in the library, a person)
• isbn://5031-4444-333

• http://eyaloren.org/foaf.rdf#me

• concepts:
• http://cyc.org/concept/Mammal 

• http://cyc.org/concept/Dog 

• www.w3.org/2006/03/wn/wn20/instances/synset-anniversary-noun-1

• relations:
• http://purl.org/linkedpolitics/vocabulary/speaker
RDF (the basics)
• A W3C recommendation for describing resources on the Web of Data, called the “Resource Description Framework”

• See https://www.w3.org/RDF/ 

• RDF data model: triples!
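RDF example in Turtle syntax:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix wd: <http://www.wikidata.org/entity/> .
<bob#me>
    a foaf:Person ;
    foaf:knows <alice#me> ;
    schema:birthDate "1990-07-04"^^xsd:date ;
    foaf:topic_interest wd:Q12418 .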
Vocabulary definition and reasoning with RDFS
[Figure: classes A, B and C at the ontology / vocabulary / schema level; instance r at the data level]
IF
B rdfs:subClassOf A
C rdfs:subClassOf B
THEN
C rdfs:subClassOf A

IF
B rdfs:subClassOf A
r rdf:type B
THEN
r rdf:type A

Example:
<bob#me> rdf:type foaf:Person .
foaf:Person rdfs:subClassOf foaf:Agent .
Inferred:
<bob#me> a foaf:Agent .
Standard meaning: rdfs:subClassOf has standardised semantics, so every RDFS reasoner draws the same inference.
Vocabulary definition and reasoning with RDFS
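IF
p rdfs:range R
A p B
THEN
B rdf:type R

Example:
<bob#me> foaf:knows <alice#me> .
foaf:knows rdfs:range foaf:Person .
Inferred:
<alice#me> rdf:type foaf:Person .
Standard meaning: rdfs:range has standardised semantics, so every RDFS reasoner draws the same inference.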
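The IF/THEN rules above can be tried out with a small forward-chaining loop. This is a minimal sketch in plain Python, not a real RDFS reasoner: triples are tuples of strings, and only the three rules shown on the slides (subclass transitivity, type inheritance, range) are implemented. The function name `rdfs_closure` is made up for the example.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"
RANGE = "rdfs:range"

def rdfs_closure(triples):
    """Apply three RDFS rules repeatedly until no new statement can be derived."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            for (s2, p2, o2) in inferred:
                # C subClassOf B, B subClassOf A  =>  C subClassOf A
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # r type B, B subClassOf A  =>  r type A
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # p range R, A p B  =>  B type R
                if p == RANGE and p2 == s:
                    new.add((o2, TYPE, o))
        if new <= inferred:  # nothing new was derived: fixed point reached
            return inferred
        inferred |= new

data = {
    ("<bob#me>", TYPE, "foaf:Person"),
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
    ("<bob#me>", "foaf:knows", "<alice#me>"),
    ("foaf:knows", RANGE, "foaf:Person"),
}

for triple in sorted(rdfs_closure(data) - data):
    print("inferred:", triple)
```

Note that the rules chain: once the range rule has typed Alice as a foaf:Person, the subclass rule also makes her a foaf:Agent.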
SPARQL (the basics)
• A W3C recommendation for querying RDF graphs called “SPARQL Protocol
And RDF Query Language”

• See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/sparql11-query/
Data: :JamesDean :playedIn :Giant .
Query: :JamesDean :playedIn ?what .
Answer: :Giant
Query: ?who :playedIn :Giant.
Answer: :JamesDean
Query: :JamesDean ?what :Giant.
Answer: :playedIn
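The idea behind these three queries, a triple pattern in which any position may be a variable, can be sketched in Python. This is a toy matcher for a single pattern and not an implementation of SPARQL; the function name `match` is chosen for the example.

```python
def match(pattern, triples):
    """Return one variable binding (a dict) per triple that fits the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable: bind it, or check consistency if it is already bound.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                # A constant that does not match this triple.
                break
        else:
            results.append(binding)
    return results

data = [(":JamesDean", ":playedIn", ":Giant")]

print(match((":JamesDean", ":playedIn", "?what"), data))
print(match(("?who", ":playedIn", ":Giant"), data))
print(match((":JamesDean", "?what", ":Giant"), data))
```

A full SPARQL engine joins the bindings of several such patterns; the sketch covers only the single-pattern case shown on the slides.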
Linked Open Data
A method of publishing on the Web of Data: openly
available, in RDF, with links to other datasets.
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Creating Linked Open Data
in the Talk of Europe project:
Discovering links, knowledge representation
The European Parliament as Linked Open Data
Laura Hollink	 	 Centrum Wiskunde & Informatica, Amsterdam
Astrid van Aggelen 	 VU University Amsterdam
Martijn Kleppe	 	 Erasmus University Rotterdam
Henri Beunders Erasmus University Rotterdam
Jill Briggeman Erasmus University Rotterdam
Max Kemman	 	 University of Luxembourg
Talk of Europe goals
• To publish the entire plenary debates of the European
Parliament as Linked Open Data

• To improve access to the data

• To enable large scale analysis across time spans.

‣To residents of the European Union access to the proceedings
of the European parliament is a formal right.
A. van Aggelen, L. Hollink, M.
Kemman, M. Kleppe & H. Beunders.
The debates of the European
Parliament as Linked Open Data.
Semantic Web Journal. In press, 2016.
1. Data in RDF
14M RDF statements about the 30K
speeches in 23 languages by 3K
speakers in 1K session days that
were held in the EU parliament
between 1999 and 2014
2. Links to external datasets
Example 1: speeches that contain a certain keyword
Query: all speeches that contain the phrase “open data”
…. So let us go for open data, let us
go for utilisation of all the instruments
available to that end! …..
…. but there too governments are
encouraging the use of open data to
increase transparency, accountability
and citizen participation ….
…. We already have many open data
projects in the Member States and
local authorities…..
Example 2: speeches that contain a certain
keyword by date
"Slovenia" in the plenary meetings of the European Parliament
[Figure: number of mentions per year, 1999-2013]
Example 2: speeches that contain a certain keyword
by date
Mentions of 'human rights'
[Figure: frequency of mentions by date, 1999-2013]
Example 3: speeches that contain a certain keyword
by country
[Figure: mentions of 'human rights' by country, for AT, BE, BG, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IT, LT, LU, LV, MT, NL, PL, PT, RO, SE, SI and SK]
Example 4: the number of speeches per EU
country
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?c (COUNT(?c) AS ?count)
WHERE {
    ?x rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/Speech> .
    ?x <http://purl.org/linkedpolitics/vocabulary#speaker> ?p .
    ?p <http://purl.org/linkedpolitics/vocabulary#countryOfRepresentation> ?c .
}
GROUP BY ?c
LIMIT 50
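To make clear what this query computes, here is a rough Python equivalent over a handful of made-up triples. The identifiers (`speech1`, `mep1`, shortened property names) are invented for the example; the real data lives in the Talk of Europe triple store.

```python
from collections import Counter

# Hypothetical miniature dataset with shortened names.
triples = [
    ("speech1", "rdf:type", "Speech"), ("speech1", "speaker", "mep1"),
    ("speech2", "rdf:type", "Speech"), ("speech2", "speaker", "mep1"),
    ("speech3", "rdf:type", "Speech"), ("speech3", "speaker", "mep2"),
    ("mep1", "countryOfRepresentation", "NL"),
    ("mep2", "countryOfRepresentation", "DE"),
]

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def speeches_per_country():
    """Join speech -> speaker -> country, then count per country (the GROUP BY)."""
    counts = Counter()
    speeches = [s for s, p, o in triples if p == "rdf:type" and o == "Speech"]
    for speech in speeches:
        for speaker in objects(speech, "speaker"):
            for country in objects(speaker, "countryOfRepresentation"):
                counts[country] += 1
    return counts

print(speeches_per_country().most_common(50))  # most_common(50) plays the role of LIMIT 50
```

The three nested loops correspond to the three triple patterns in the WHERE clause; the shared variables ?x and ?p are what join them.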
Example 5: background info about the MEPs
• MEPs that were not born in Europe.
Members of Parliament
Integrate data from
the EU parliament
with external datasets
Linking Members of Parliament to Wikipedia /
DBpedia
• String matching is the most important feature in the linking process.

• “nearly all [alignment systems] use a string similarity metric” [12]

• Stop-word removal and stemming are not helpful, and neither is using WordNet synonyms. [12]
[12] Cheatham, M., & Hitzler, P. String
similarity metrics for ontology alignment.
ISWC 2013.
http://www.dbpedia.org/resource/Judith_Sargentini
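A string-similarity linker along these lines might look as follows. This is a sketch under assumptions, not the Talk of Europe linking code: it uses `difflib.SequenceMatcher` from the Python standard library as the similarity metric, the token sorting in `normalise` and the 0.9 threshold are illustrative choices, and the candidate labels are made up.

```python
from difflib import SequenceMatcher

def normalise(name):
    """Lower-case and sort the tokens, so 'SARGENTINI Judith' equals 'Judith Sargentini'."""
    return " ".join(sorted(name.lower().split()))

def similarity(a, b):
    """String similarity between 0.0 and 1.0 on the normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

def best_match(name, candidate_labels, threshold=0.9):
    """Return the most similar label, or None if nothing is similar enough."""
    if not candidate_labels:
        return None
    best = max(candidate_labels, key=lambda label: similarity(name, label))
    return best if similarity(name, best) >= threshold else None

# Hypothetical candidate labels, e.g. retrieved from DBpedia.
labels = ["Judith Sargentini", "Martin Schulz", "Catherine Stihler"]
print(best_match("SARGENTINI Judith", labels))
```

Returning None below the threshold keeps uncertain matches out of the link set, which favours precision over recall.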
How to relate a speech to a speaker and party?
lp:eu/plenary/2009-10-21/Speech_140 lpv:speaker lp:EUmember_1023 .
lp:EUmember_1023 lpv:hasParty lp:EUParty/SomeParty .
Why is this not a good solution?
1. A person might be a member of more than one party (at different times)
2. Since there is no link between a speech and a party, queries for all speeches
spoken by the members of a certain party become very complicated.
How to relate a speech to a speaker and party?
lp:eu/plenary/2009-10-21/Speech_140
    lpv:speaker lp:EUmember_1023 ;
    lpv:spokenAs lp:political-Function101 , lp:political-Function102 .
lp:EUmember_1023
    lpv:politicalFunction lp:political-Function101 , lp:political-Function102 .
lp:political-Function101
    rdf:type lpv:PoliticalFunction ;
    lpv:institution lp:EUParty/NI ;
    lpv:role lp:Role/member ;
    lpv:beginning "20071114"^^xsd:date ;
    lpv:end "20111126"^^xsd:date .
lp:political-Function102
    rdf:type lpv:PoliticalFunction ;
    lpv:institution lp:EUCommittee/Committee_on_Legal_Affairs ;
    lpv:role lp:Role/substitute ;
    lpv:beginning "20090716"^^xsd:date ;
    lpv:end "20111126"^^xsd:date .
Note: this is a common “design pattern”
referred to as n-ary relations or
relations as classes
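Why the n-ary pattern helps can be seen in a small Python sketch: once a political function is an object of its own with a beginning and an end, finding the functions in which a speech was spoken becomes a simple date filter. The `PoliticalFunction` class and the `spoken_as` function are illustrative names, and the assignment of dates to the two functions is one reading of the diagram, so treat it as an assumption.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class PoliticalFunction:
    """One node of the n-ary relation: who held which role, where, and when."""
    member: str
    institution: str
    role: str
    beginning: date
    end: date

# Values taken from the diagram; the date-to-function assignment is assumed.
functions = [
    PoliticalFunction("lp:EUmember_1023", "lp:EUParty/NI", "lp:Role/member",
                      date(2007, 11, 14), date(2011, 11, 26)),
    PoliticalFunction("lp:EUmember_1023", "lp:EUCommittee/Committee_on_Legal_Affairs",
                      "lp:Role/substitute", date(2009, 7, 16), date(2011, 11, 26)),
]

def spoken_as(speaker, speech_date):
    """Functions the speaker held on the day of the speech."""
    return [f for f in functions
            if f.member == speaker and f.beginning <= speech_date <= f.end]

for f in spoken_as("lp:EUmember_1023", date(2009, 10, 21)):
    print(f.institution, f.role)
```

Materialising the result of this filter as lpv:spokenAs links is what gives every speech a direct path to a party, so queries per party stay simple.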
Intermezzo: one-question Quiz
Reasoning on the Web of Data
Question: What can we conclude from this graph?

A. Stihler is a member of exactly 3 parties

B. Stihler is a member of at least 3 parties

C. Stihler is a member of at most 3 parties

D. None of the above

E. All of the above

F. Other, namely ….
http://purl.org/linkedpolitics/EUmember_4545 foaf:name "Catherine Stihler"
http://purl.org/linkedpolitics/EUmember_4545 :memberOf http://purl.org/linkedpolitics/EUParty/PES
http://purl.org/linkedpolitics/EUmember_4545 :memberOf http://dbpedia.org/resource/Party_of_European_Socialists
http://purl.org/linkedpolitics/EUmember_4545 :memberOf http://dbpedia.org/resource/Progressive_Alliance_of_Socialists_and_Democrats
Creating Linked Open Data
in the PoliMedia project:
Discovering links, knowledge representation, evaluation
Linking government data
to news data
Which political debate in
the post-war period has
attracted most media
attention?

What are the differences
between different media?

Has the coverage changed
over time?
Transcriptions of all 9,294
meetings of the Dutch
parliament between
1945-1995, consisting of
1,208,903 speeches.
Roughly 1.8 Million news
bulletins between
1937-1984

(We only use 1945-1995)
Archives of hundreds of newspapers, containing tens of millions of articles published between 1618 and 1995.

(We only use 1945-1995)
Transcriptions of all
meetings of the
European Parliament
between 1999 and
2014.
Links in PoliMedia
is about
• 3 Million links
Discovering links between politics and news
[Figure: linking pipeline. Input: debates, with the name of the speaker and the date of the debate. Steps: detect topics in speeches; detect named entities in speeches; create queries; search newspaper archive; rank candidate articles. Intermediate results: topics, named entities, queries, candidate articles. Output: links between speeches and articles.]
Step 2: generate links
Intuition 1: The name of the speaker should
appear in the article and the article should
be published within a week of the debate
Intuition 2: the more the article and the
speech overlap in terms of topics and
named entities, the more they are related.
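The two intuitions can be sketched as a filter followed by a ranking. This is not the PoliMedia linking engine: the dictionary layout, the Jaccard overlap as the relatedness score, and the reading of "within a week" as seven days before or after the debate are all assumptions made for the example, as is the sample data.

```python
from datetime import date, timedelta

def is_candidate(article, speech, window_days=7):
    """Intuition 1: speaker name occurs in the text, published within a week of the debate."""
    delta = abs(article["published"] - speech["date"])
    return (speech["speaker"].lower() in article["text"].lower()
            and delta <= timedelta(days=window_days))

def overlap(article, speech):
    """Intuition 2: Jaccard overlap of topics and named entities (assumed score)."""
    a, s = set(article["terms"]), set(speech["terms"])
    return len(a & s) / len(a | s) if a | s else 0.0

def rank(articles, speech):
    """Keep the candidates and sort them from most to least related."""
    candidates = [a for a in articles if is_candidate(a, speech)]
    return sorted(candidates, key=lambda a: overlap(a, speech), reverse=True)

# Hypothetical speech and articles.
speech = {"speaker": "Jansen", "date": date(1950, 3, 1),
          "terms": {"Indonesia", "sovereignty", "budget"}}
articles = [
    {"id": "a2", "text": "Jansen on the budget", "published": date(1950, 3, 3),
     "terms": {"budget", "taxes"}},
    {"id": "a1", "text": "Minister Jansen spoke about Indonesia", "published": date(1950, 3, 2),
     "terms": {"Indonesia", "sovereignty"}},
    {"id": "a3", "text": "Jansen looks back", "published": date(1950, 4, 1),
     "terms": {"Indonesia"}},
    {"id": "a4", "text": "Weather report", "published": date(1950, 3, 1),
     "terms": {"Indonesia"}},
]

print([a["id"] for a in rank(articles, speech)])
```

The filter is cheap and can run as a query against the archive; the overlap score only has to be computed for the candidates that survive it.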
Representation of links
• Note: this is another example of the “design pattern” referred to as n-ary relations or relations as classes!

• It allows us to save provenance information about the statements we create.
[Figure: instead of a single :isAbout statement between :speech123 and :newsArticle456, a node :link001 connects :speech123 and :newsArticle456. The link node has a link type (:quotes), a :creationDate (01-02-2013), a :madeBy value (:PoliMedia_Linking_Engine), and is connected to :concept1 and :concept2.]
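As a rough illustration of the pattern, the function below writes out one link as a node with its own statements. Only :madeBy and :creationDate appear as property names on the slide; :linkSource, :linkTarget and :linkType are hypothetical names invented for this sketch.

```python
def link_triples(link_id, speech, article, link_type, made_by, creation_date):
    """Describe one link as a node of its own, so provenance can be attached to it."""
    return [
        (link_id, ":linkSource", speech),    # hypothetical property name
        (link_id, ":linkTarget", article),   # hypothetical property name
        (link_id, ":linkType", link_type),   # hypothetical property name
        (link_id, ":madeBy", made_by),
        (link_id, ":creationDate", creation_date),
    ]

triples = link_triples(":link001", ":speech123", ":newsArticle456",
                       ":quotes", ":PoliMedia_Linking_Engine", "01-02-2013")
for triple in triples:
    print(triple)
```

A single :isAbout triple has no place to hang the creator or the date; the extra node costs a few more statements but makes every link traceable.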
Evaluation of links
1. Manually rating (a sample of) links

• relatively cheap and easy to interpret

• only precision, no recall
2. Comparison to a reference linkset

• precision and recall

• used in OAEI on the SEALS platform

• more expensive if a reference alignment has to be
created (but: crowd sourcing!)
3. End-to-end evaluation (a.k.a. evaluating an application
that uses the mappings)

• arguably the best method!

• need to have access to an application + users
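For the second method, precision and recall against a reference linkset reduce to set operations. A minimal sketch, with made-up link identifiers:

```python
def precision_recall(found_links, reference_links):
    """Precision: share of found links that are correct.
    Recall: share of reference links that were found."""
    found, reference = set(found_links), set(reference_links)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical links, written as (speech, article) pairs.
found = {("s1", "a1"), ("s1", "a2"), ("s2", "a3"), ("s3", "a9")}
reference = {("s1", "a1"), ("s1", "a2"), ("s2", "a3"), ("s2", "a4"), ("s3", "a5")}

print(precision_recall(found, reference))
```

Manually rating a sample of found links only estimates the first number, because the raters never see the links that were missed.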
Evaluation of links: beyond precision / recall
Generalized precision and Generalized recall

• Instead of a binary classification into correct/incorrect mappings, take into account how wrong a link is:

[Formula]

• where r(a) is the semantic distance between correspondence a and correspondence a’ in the reference alignment, and A is the number of correspondences.
Laura Hollink, Mark van Assem, Shenghui Wang, Antoine Isaac, Guus Schreiber. Two Variations on Ontology Alignment Evaluation: Methodological Issues. ESWC 2008.
[Figure: classes A, B and C at the ontology / vocabulary / schema level; instance r at the data level]
Evaluation of links in PoliMedia
How good are the links?
• We ask 2 raters to manually score pairs of newspaper articles and speeches.
  • a pilot study showed that we needed more than a 2-point scale.
  • inter-rater agreement: 0.5 -> acceptable, but not high.
• Precision: 80%
How many links did we miss?
• We ask the raters to manually search the KB archives for related articles.
• Recall: 62%
Setting 1 | Setting 2 | Setting 3
0.48      | 0.62      | 0.8
DEMO - PoliMedia search application
Online database:
“SPARQL endpoint”
• A service to query a knowledge
base using the SPARQL query
language.

“All speeches with more
than 60 associated news
items.”
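A client typically talks to a SPARQL endpoint over HTTP, passing the query text in a `query` parameter as described in the SPARQL protocol. The sketch below only builds such a request URL and does not send it. The endpoint address and the property `coveredIn` are placeholders invented for the example; the actual PoliMedia endpoint and vocabulary are not given on the slide.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL

# Hypothetical vocabulary: <http://example.org/vocab#coveredIn> links a speech to a news item.
QUERY = """
SELECT ?speech (COUNT(?article) AS ?n)
WHERE { ?speech <http://example.org/vocab#coveredIn> ?article }
GROUP BY ?speech
HAVING (COUNT(?article) > 60)
"""

def sparql_get_url(endpoint, query):
    """Build the GET request URL, with the query text URL-encoded."""
    return endpoint + "?" + urlencode({"query": query})

url = sparql_get_url(ENDPOINT, QUERY)
print(url)
# Decoding the URL again gives back the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Sending the request (for instance with `urllib.request.urlopen`) and asking for a JSON or CSV result format is then a matter of adding the appropriate Accept header.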
Access to Linked Open Data: how to serve and
how to consume Linked Open Data
Access to LOD: 1. download a data dump
From server logs we know the query
-some context of the requested URIs
-variable names (?)
Access to LOD 2: follow-your-nose
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
...Ich glaube, das war ein
außergewöhnlicher Moment
für uns alle hier in diesem
Parlament[...]"@en
lpv:spokenText
lpv:speaker
owl:sameAs
http:://dbpedia.org/
resource/Martin_Schulz
dc:hasPart
lp:Martin_Schulz
dbp:children
"2"
lpv:speaker
dbc:Officiers_of_the_Légion_d'honneur
From server logs we know the requested URI:

GET /Martin_Schulz HTTP/1.0
Accept: application/rdf+xml
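
The request above can be reproduced from the client side. The following is a minimal sketch, assuming a plain HTTP client and content negotiation via the Accept header. The function name `build_rdf_request` and the `example.org` URI are hypothetical placeholders, not part of the Talk of Europe setup. The request is only prepared here; the lines that would send it are left commented out because they need network access.

```python
import urllib.request


def build_rdf_request(uri: str, media_type: str = "application/rdf+xml") -> urllib.request.Request:
    """Prepare (but do not send) a GET request that asks for an RDF representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type}, method="GET")


# Hypothetical URI standing in for the resource behind lp:Martin_Schulz.
request = build_rdf_request("http://example.org/Martin_Schulz")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# To actually dereference the URI (requires network access):
# with urllib.request.urlopen(request) as response:
#     rdf_xml = response.read().decode("utf-8")
```

A follow-your-nose client would repeat this step for every URI it finds in the returned RDF, which is why the server log only ever shows one requested URI at a time.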
Access to LOD: 3. SPARQL
Count the agenda items in which at least one MEP from France spoke.
SELECT (COUNT(DISTINCT ?ai) AS ?count)
WHERE {
  ?ai rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/AgendaItem> .
  ?ai dcterms:hasPart ?speech .
  ?speech lpv:speaker ?speaker .
  ?speaker lpv:countryOfRepresentation ?country .
  ?country rdfs:label ?label .
  FILTER(?label = "France"@en)
}
From server logs we know the query
-some context of the requested URIs
-variable names (?)
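
As a rough illustration of what a logged query reveals, the sketch below pulls the variable names and the full IRIs out of a query string with regular expressions. It is a simplification and not a SPARQL parser: a `?` inside a string literal or an IRI would be misread as a variable. The helper names `variable_names` and `mentioned_iris` are made up for this example.

```python
import re

# The agenda-item query as it might appear in a server log.
LOGGED_QUERY = """
SELECT (COUNT(DISTINCT ?ai) AS ?count)
WHERE {
  ?ai rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/AgendaItem> .
  ?ai dcterms:hasPart ?speech .
  ?speech lpv:speaker ?speaker .
  ?speaker lpv:countryOfRepresentation ?country .
  ?country rdfs:label ?label .
  FILTER(?label = "France"@en)
}
"""


def variable_names(query: str) -> list:
    """Return distinct variable names in order of first appearance."""
    seen = []
    for name in re.findall(r"[?$](\w+)", query):
        if name not in seen:
            seen.append(name)
    return seen


def mentioned_iris(query: str) -> list:
    """Return the IRIs written in full (between angle brackets)."""
    return re.findall(r"<([^<>\s]+)>", query)


print(variable_names(LOGGED_QUERY))
print(mentioned_iris(LOGGED_QUERY))
```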
Access to LOD: 4. Linked Data Fragments
xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000] "GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" 200 1309 "http://fragments.dbpedia.org/2014/en"
…
From server logs we know the triple patterns that were
requested
-some context of the requested URIs
-variable names (?)
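
The triple pattern can be recovered from such a log line with standard URL parsing. This is a sketch that assumes the combined log format shown above and the `subject`, `predicate`, and `object` query parameters visible in the example. The function name `triple_pattern` is hypothetical, and an empty parameter is treated as an unbound position.

```python
import re
from urllib.parse import parse_qs, urlsplit

LOG_LINE = (
    'xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000] '
    '"GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" '
    '200 1309 "http://fragments.dbpedia.org/2014/en"'
)

# Matches the quoted request part, e.g. "GET /path?query HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+"')


def triple_pattern(log_line: str):
    """Return (subject, predicate, object); None marks an unbound position."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return None
    query = urlsplit(match.group("target")).query
    params = parse_qs(query, keep_blank_values=True)
    return tuple(
        params.get(key, [""])[0] or None
        for key in ("subject", "predicate", "object")
    )


print(triple_pattern(LOG_LINE))
```

Unlike a SPARQL log, each line here carries a single pattern, so reconstructing the query a client was evaluating means grouping consecutive requests.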
What do we know about usage of Linked Open
Data?
1. Yearly datasets of server logs released for research purposes, 2011-2016

Luczak-Roesch, Markus, Aljaloud, Saud, Berendt, Bettina and Hollink, Laura (2016)
USEWOD 2016 Research Dataset. doi:10.5258/SOTON/385344

2. Yearly workshops for researchers on Usage Data and the Web of Data, 2011-2016

Laura Hollink, Markus Luczak-Roesch, Bettina Berendt, et al.

http://usewod.org/
USEWOD2011
2016
Linked Open Data query log analysis?
Licensing + Anonymization:
replace all IPs with a
country code and an
identifier
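
One way to picture this anonymization step is the sketch below. It is an assumption about how such a replacement could work, not the procedure actually used for the released datasets. The country lookup is a stub dictionary (a real pipeline would need a geolocation source), the `COUNTRY-number` pseudonym format is invented here, and the addresses come from the reserved documentation ranges.

```python
from typing import Callable, Dict


def make_anonymizer(country_of: Callable[[str], str]) -> Callable[[str], str]:
    """Build a function that maps each IP to a stable 'COUNTRY-n' pseudonym."""
    pseudonyms: Dict[str, str] = {}

    def anonymize(ip: str) -> str:
        # The same IP always gets the same pseudonym, so sessions stay traceable.
        if ip not in pseudonyms:
            pseudonyms[ip] = f"{country_of(ip)}-{len(pseudonyms) + 1}"
        return pseudonyms[ip]

    return anonymize


# Stub lookup for illustration only; "ZZ" stands for an unknown country.
STUB_COUNTRIES = {"192.0.2.1": "NL", "198.51.100.7": "DE"}
anonymize = make_anonymizer(lambda ip: STUB_COUNTRIES.get(ip, "ZZ"))

for ip in ["192.0.2.1", "198.51.100.7", "192.0.2.1"]:
    print(ip, "->", anonymize(ip))
```

Keeping the pseudonym stable per IP preserves the ability to study repeated requests from one client while hiding the address itself.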
What has been found so far?
• Efficient index generation [1]
• Caching [2]
• Auto-completion [3]
• Hardware scaling at peak times [4]
• Modularisation of data [4]

Issues:
• What is the difference between queries by machines and humans? [5]
• What is the meaning of repeated queries by tools? Bots?
• A lot of the usage is invisible due to data dump downloads [6]

[1] Arias, M., Fernández, J. D., Martínez-Prieto, M. A., & de la Fuente, P. (2011). An empirical study of real-world SPARQL queries. USEWOD2011
[2] Lorey, J., & Naumann, F. Caching and prefetching strategies for SPARQL queries. USEWOD2013
[3] Kramer, K., Dividino, R. Q., & Gröner, G. SPACE: SPARQL Index for Efficient Autocompletion. ISWC2013 (Posters & Demos)
[4] Luczak-Rösch, M., & Bischoff, M. (2011). Statistical analysis of web of data usage. EvoDyn2011
[5] Rietveld, L., & Hoekstra, R. Man vs. Machine: Differences in SPARQL Queries. USEWOD2014
[6] Huelss, J., & Paulheim, H. What SPARQL Query Logs Tell and do not Tell about Semantic Relatedness in LOD. NoISE @ ESWC 2015
Reflection: to what extent can we now answer
these questions?
How did the debate about the
financial crisis in Greece
develop?

Which political event has
attracted most media
attention?

What are the differences
between different media?

Has the coverage changed
over time?
Yes, but:

• what is the influence of the selection of newspapers
available at the National Library?

• what was the quality of the digitisation process (OCR)?

• How good is our linking approach (based on
automatically detected entities and topics)?

➡ How to handle these uncertainties is one of our research
questions! We call this Tool Criticism
Resources:
PoliMedia demo: http://polimedia.nl/
PoliMedia project video: https://youtu.be/u24oRCj7xrQ
Talk of Europe project: http://talkofeurope.eu/
Talk of Europe data: purl.org/linkedpolitics
Talk of Europe project video: https://youtu.be/GxA53gkCe0o
USEWOD workshop: http://usewod.org/
My website: http://homepages.cwi.nl/~hollink/
I’d be happy to answer your questions!

Linked Open Data

  • 1. Linked Open Data SIKS course on Data Science May 20, 2016 Vught. Laura Hollink
  • 2. Why do we create and use Linked Open Data? Example questions from the humanities and social sciences How did the debate about the financial crisis in Greece develop?
  • 3. Searching the proceedings of the EU Parliament "Greece" in the plenary meetings of the European Parliament Year Nr.ofmentions 050100150200 1999 2000 2001 2001 2002 2003 2004 2005 2006 2006 2007 2008 2009 2010 2010 2011 2012 2013
  • 4. Searching through newspaper archives Mentions of “Griekenland” in the Dutch newspaper the Telegraaf.
  • 5. Search volumes on a search engine Query = “Greece” http://www.google.com/trends
  • 6. Search volumes on a search engine Query = “Greece” http://www.google.com/trends We need access to data. Analysing them gives us some useful insight. But to answer the question properly we would need to combine sources and do more complex queries.
  • 7. Why do we create and use Linked Open Data? Example question 2 Which political debate in the post-war period has attracted most media attention?
  • 9. “De Indonesische Quaestie" To answer this question we need to go through all newspaper articles about all political debates. -> we need access to combined data sources, we need structured queries.
  • 10. Why do we create and use Linked Open Data?
  • 11. Why do we create and use Linked Open Data? Example question 3 What are the differences between different media? Example question 4 Has the coverage changed over time?
  • 12. Research goals and research questions Our goal is to build an infrastructure to answer these kinds of questions. 1. How do we automatically link heterogeneous datasets? 2. How do we interpret links between datasets of different quality and certainty? 3. What can we conclude from usage statistics on these datasets? 4. Can we design interfaces that allow scholars to study the datasets • including the links between them? • while assessing the reliability of the findings?
  • 13. Research goals and research questions Our goal is to build an infrastructure to answer these kinds of questions. 1. How do we automatically link heterogeneous datasets? 2. How do we interpret links between datasets of different quality and certainty? 3. What can we conclude from usage statistics on these datasets? 4. Can we design interfaces that allow scholars to study the datasets • including the links between them? • while assessing the reliability of the findings? Data Science - Big Data - Linked Open Data
  • 14. Table of Contents 1. What is Linked Open Data (LOD) 2. Creating LOD 1. How to discover links 2. How to represent links on the Web 3. How to evaluate links 3. Access to LOD (from both the server and the client perspective)
  • 15. What is Linked Open Data?
  • 16. What is Linked Open Data?
  • 17. What is Linked Open Data? A method of publishing structured data on the Web in such a way that it can be linked and queried by computers as well as humans.
  • 18. The Web of Documents
  • 19. The Web of Documents • Documents  identified  by  URIs  (html,  pdf,  images,  movies,  etc.)   • with  structured  information  for  humans  (tables,  headers)  and   • with  hyperlinks  between  them   • The  data  is  not  machine  readable,  meant  for  humans   • structure  is  implicit  (what  do  the  columns  of  a  table  mean?)   • links  are  not  typed  (what  is  the  relation  between  two  documents?)  
  • 20. The Web of Data
  • 21. The Web of Data • Everything  identified  by  URIs  (not  just  documents,  but  also  classes,   instances,  relations/links)   • The  data  is  machine  readable:     • in  formal  languages  (RDF,  RDFS,  OWL,  SKOS)     • which  enable  machines  to  do  reasoning,  i.e.  infer  new  statements   from  inserted  statements.  
  • 22. Compared to a database table… Amsterdam has population “1364422” City Schiphol is a has airport
  • 23. Thing Type Population Airport Amsterdam City 1364422 Schiphol …. … …. … Compared to a database table… Amsterdam has population “1364422” City Schiphol is a has airport
  • 24. Differences: • Statements can be distributed over the web • Non-unique naming assumption • Open World assumption • Everyone can say anything about anything Thing Type Population Airport Amsterdam City 1364422 Schiphol …. … …. … Compared to a database table… Amsterdam has population “1364422” City Schiphol is a has airport
  • 25. Examples of URIs on the Web of Data • documents: • http://vu.nl/index.html • http://example.org/cities#Leuven • real world objects (a book in the library, a person) • isbn://5031-4444-333 • http://eyaloren.org/foaf.rdf#me • concepts: • http://cyc.org/concept/Mammal • http://cyc.org/concept/Dog • www.w3.org/2006/03/wn/wn20/instances/synset-anniversary-noun-1 • relations: • http://purl.org/linkedpolitics/vocabulary/speaker
  • 26. RDF (the basics) • A W3C recommendation to describe resources on the Web of Data called “Resource description Framework” • See https://www.w3.org/RDF/ • RDF data model: triples!
  • 27. RDF (the basics) • A W3C recommendation to describe resources on the Web of Data called “Resource description Framework” • See https://www.w3.org/RDF/ • RDF data model: triples!
  • 28. RDF (the basics) • A W3C recommendation to describe resources on the Web of Data called “Resource description Framework” • See https://www.w3.org/RDF/ • RDF data model: triples! RDF example in Turtle syntax: <bob#me> a foaf:Person ; foaf:knows <alice#me> ; schema:birthDate "1990-07-04"^^xsd:date ; foaf:topic_interest wd:Q12418 .
  • 29. Vocabulary definition and reasoning with RDFS B C r A data level ontology / vocabulary / schema level
  • 30. Vocabulary definition and reasoning with RDFS A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A B C r A data level ontology / vocabulary / schema level
  • 31. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF C rdfs:subClassOf B r rdf:type C THEN r rdf:type B B C r A data level ontology / vocabulary / schema level
  • 32. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF B rdfs:subClassOf A r rdf:type B THEN r rdf:type A <bob#me> rdf:type foaf:Person . foaf:Person rdfs:subClassOf foaf:Agent .
  • 33. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF B rdfs:subClassOf A r rdf:type B THEN r rdf:type A <bob#me> rdf:type foaf:Person . foaf:Person rdfs:subClassOf foaf:Agent . <bob#me> a foaf:Agent .
  • 34. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF B rdfs:subClassOf A r rdf:type B THEN r rdf:type A <bob#me> rdf:type foaf:Person . foaf:Person rdfs:subClassOf foaf:Agent . <bob#me> a foaf:Agent . Standard meaning
  • 35. Vocabulary definition and reasoning with RDFS IF p rdfs:range R A p B THEN B rdf:type R <bob#me> foaf:knows <alice#me> . foaf:knows rdfs:range foaf:Person .
  • 36. Vocabulary definition and reasoning with RDFS IF p rdfs:range R A p B THEN B rdf:type R <bob#me> foaf:knows <alice#me> . foaf:knows rdfs:range foaf:Person . <alice#me> rdf:type foaf:Person .
  • 37. Vocabulary definition and reasoning with RDFS IF p rdfs:range R A p B THEN B rdf:type R <bob#me> foaf:knows <alice#me> . foaf:knows rdfs:range foaf:Person . <alice#me> rdf:type foaf:Person . Standard meaning
  • 48. SPARQL (the basics) • A W3C recommendation for querying RDF graphs, called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/sparql11-query/ • Data: :JamesDean :playedIn :Giant . • Query: :JamesDean :playedIn ?what . Answer: :Giant • Query: ?who :playedIn :Giant . Answer: :JamesDean • Query: :JamesDean ?what :Giant . Answer: :playedIn
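The queries on this slide are triple patterns: a term starting with "?" is a variable that may match anything, every other term must match exactly. In full SPARQL such a pattern sits inside a query form, for example SELECT ?what WHERE { :JamesDean :playedIn ?what . }. A minimal sketch of the matching step in Python follows; the function name `match` is an assumption, and a real SPARQL engine does far more (joins, filters, prefixes).

```python
def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for wanted, actual in zip(pattern, triple):
            if wanted.startswith("?"):
                # A variable binds to the term; a repeated variable must agree.
                if binding.setdefault(wanted, actual) != actual:
                    break
            elif wanted != actual:
                break
        else:
            results.append(binding)
    return results


data = [(":JamesDean", ":playedIn", ":Giant")]

what = match((":JamesDean", ":playedIn", "?what"), data)
who = match(("?who", ":playedIn", ":Giant"), data)
relation = match((":JamesDean", "?what", ":Giant"), data)
```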
  • 49. Linked Open Data A method of publishing on the Web of Data: openly available, in RDF, with links to other datasets. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • 51. Creating Linked Open Data in the Talk of Europe project: Discovering links, knowledge representation
  • 53. The European Parliament as Linked Open Data Laura Hollink Centrum Wiskunde & Informatica, Amsterdam Astrid van Aggelen VU University Amsterdam Martijn Kleppe Erasmus University Rotterdam Henri Beunders Erasmus University Rotterdam Jill Briggeman Erasmus University Rotterdam Max Kemman University of Luxembourg
  • 54. Talk of Europe goals • To publish the entire plenary debates of the European Parliament as Linked Open Data • To improve access to the data • To enable large-scale analysis across time spans ‣ For residents of the European Union, access to the proceedings of the European Parliament is a formal right. A. van Aggelen, L. Hollink, M. Kemman, M. Kleppe & H. Beunders. The debates of the European Parliament as Linked Open Data. Semantic Web Journal. In press, 2016.
  • 57. 1. Data in RDF: 14M RDF statements about the 30K speeches, in 23 languages, by 3K speakers, on 1K session days held in the European Parliament between 1999 and 2014.
  • 60. 2. Links to external datasets
  • 61. Example 1: speeches that contain a certain keyword Query: all speeches that contain the phrase “open data” …. So let us go for open data, let us go for utilisation of all the instruments available to that end! ….. …. but there too governments are encouraging the use of open data to increase transparency, accountability and citizen participation …. …. We already have many open data projects in the Member States and local authorities…..
  • 63. Example 2: speeches that contain a certain keyword, by date. [Figure: "Slovenia" in the plenary meetings of the European Parliament; x-axis: year, 1999 to 2013; y-axis: number of mentions.]
  • 64. Example 2: speeches that contain a certain keyword, by date. [Figure: mentions of 'human rights'; x-axis: dates, 1999 to 2013; y-axis: frequency.]
  • 65. Example 3: speeches that contain a certain keyword, by country. [Figure: mentions of 'human rights' by country, for AT, BE, BG, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IT, LT, LU, LV, MT, NL, PL, PT, RO, SE, SI and SK.]
  • 66. Example 4: the number of speeches per EU country SELECT ?c (COUNT(?c) as ?count) WHERE { ?x rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/Speech>. ?x <http://purl.org/linkedpolitics/vocabulary#speaker> ?p. ?p <http://purl.org/linkedpolitics/vocabulary#countryOfRepresentation> ?c } GROUP BY ?c LIMIT 50
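To make the join in this query explicit: each speech is followed to its speaker, each speaker to a country of representation, and the rows are then grouped and counted per country. Below is a rough Python sketch of that computation on in-memory data. The function name and the toy identifiers (`Speech_1`, `MEP_a`, and so on) are hypothetical and not taken from the Talk of Europe dataset.

```python
from collections import Counter


def speeches_per_country(speech_speaker, speaker_countries):
    """Count speeches per country of representation of their speaker."""
    counts = Counter()
    for _speech, speaker in speech_speaker:
        # A speaker without a known country yields no row, as in the query.
        for country in speaker_countries.get(speaker, []):
            counts[country] += 1
    return counts


# Hypothetical toy data.
speech_speaker = [
    ("Speech_1", "MEP_a"),
    ("Speech_2", "MEP_a"),
    ("Speech_3", "MEP_b"),
    ("Speech_4", "MEP_c"),
]
speaker_countries = {"MEP_a": ["NL"], "MEP_b": ["FR"]}

counts = speeches_per_country(speech_speaker, speaker_countries)
```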
  • 72. Example 5: background info about the MEPs (Members of Parliament) • MEPs that were not born in Europe. • Integrate data from the EU parliament with external datasets.
  • 77. Linking Members of Parliament to Wikipedia / DBpedia • String matching is the most important feature in the linking process. • “nearly all [alignment systems] use a string similarity metric” [12] • Stop-word removal and stemming are not helpful, nor is using WordNet synonyms. [12] • Example target: http://www.dbpedia.org/resource/Judith_Sargentini [12] Cheatham, M., & Hitzler, P. String similarity metrics for ontology alignment. ISWC 2013.
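As an illustration of string-based linking, the sketch below compares an MEP name with candidate labels and keeps the best candidate if it is similar enough. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; the slides do not say which similarity metric the project used, and the candidate labels and the 0.9 threshold are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] after trimming whitespace and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_match(name, candidate_labels, threshold=0.9):
    """Return the most similar label, or None if nothing reaches the threshold."""
    scored = [(similarity(name, label), label) for label in candidate_labels]
    if not scored:
        return None
    score, label = max(scored)
    return label if score >= threshold else None


# Hypothetical candidate labels, as they might come from DBpedia.
labels = ["Judith Sargentini", "Judith Merkies"]
link = best_match(" judith SARGENTINI", labels)
```

Returning None below the threshold keeps weak matches out of the link set, at the cost of missing some true links.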
  • 81. How to relate a speech to a speaker and party? A first attempt: lp:eu/plenary/2009-10-21/Speech_140 lpv:speaker lp:EUmember_1023 . lp:EUmember_1023 lpv:hasParty lp:EUParty/SomeParty . Why is this not a good solution? 1. A person might be a member of more than one party (at different times). 2. Since there is no link between a speech and a party, queries for all speeches spoken by the members of a certain party become very complicated.
  • 85. How to relate a speech to a speaker and party? Each political function becomes a resource of its own: lp:eu/plenary/2009-10-21/Speech_140 lpv:speaker lp:EUmember_1023 ; lpv:spokenAs lp:political-Function101 , lp:political-Function102 . lp:EUmember_1023 lpv:politicalFunction lp:political-Function101 , lp:political-Function102 . lp:political-Function101 rdf:type lpv:PoliticalFunction ; lpv:institution lp:EUParty/NI ; lpv:role lp:Role/member ; lpv:beginning "20071114"^^xsd:date ; lpv:end "20111126"^^xsd:date . lp:political-Function102 rdf:type lpv:PoliticalFunction ; lpv:institution lp:EUCommittee/Committee_on_Legal_Affairs ; lpv:role lp:Role/substitute ; lpv:beginning "20090716"^^xsd:date ; lpv:end "20111126"^^xsd:date . Note: this is a common “design pattern” referred to as n-ary relations or relations as classes.
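Because every political function carries its own begin and end date, the functions a speaker held on the day of a speech can be found with a simple date comparison. The sketch below shows this for the two functions in the diagram. The dictionary layout and function names are assumptions, and the pairing of roles and institutions is our reading of the diagram.

```python
from datetime import date

# Values read from the slide's diagram; the data structure is hypothetical.
FUNCTIONS = {
    "lp:political-Function101": {
        "institution": "lp:EUParty/NI",
        "role": "lp:Role/member",
        "beginning": date(2007, 11, 14),
        "end": date(2011, 11, 26),
    },
    "lp:political-Function102": {
        "institution": "lp:EUCommittee/Committee_on_Legal_Affairs",
        "role": "lp:Role/substitute",
        "beginning": date(2009, 7, 16),
        "end": date(2011, 11, 26),
    },
}


def functions_on(day, functions=FUNCTIONS):
    """Names of the functions that were active on the given day."""
    return sorted(
        name for name, f in functions.items()
        if f["beginning"] <= day <= f["end"]
    )


def parties_on(day, functions=FUNCTIONS):
    """Institutions of active functions that look like parties (by URI prefix)."""
    return [
        functions[name]["institution"]
        for name in functions_on(day, functions)
        if functions[name]["institution"].startswith("lp:EUParty/")
    ]


speech_day = date(2009, 10, 21)  # date of Speech_140
active = functions_on(speech_day)
```

In the dataset itself this lookup is already materialised as lpv:spokenAs links from the speech to the functions, which is what keeps party-based queries simple.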
  • 86. Intermezzo: one-question quiz. Reasoning on the Web of Data. Graph: <http://purl.org/linkedpolitics/EUmember_4545> foaf:name "Catherine Stihler" ; :memberOf <http://purl.org/linkedpolitics/EUParty/PES> , <http://dbpedia.org/resource/Party_of_European_Socialists> , <http://dbpedia.org/resource/Progressive_Alliance_of_Socialists_and_Democrats> . Question: What can we conclude from this graph? A. Stihler is a member of exactly 3 parties B. Stihler is a member of at least 3 parties C. Stihler is a member of at most 3 parties D. None of the above E. All of the above F. Other, namely ….
  • 87. Creating Linked Open Data in the PoliMedia project: Discovering links, knowledge representation, evaluation
  • 91. Which political debate in the post-war period has attracted most media attention? What are the differences between different media? Has the coverage changed over time?
  • 92. Data sources: • Transcriptions of all 9,294 meetings of the Dutch parliament between 1945 and 1995, consisting of 1,208,903 speeches. • Roughly 1.8 million news bulletins between 1937 and 1984 (we only use 1945-1995). • Archives of hundreds of newspapers, with a very large number of issues, or tens of millions of articles, between 1618 and 1995 (we only use 1945-1995). • Transcriptions of all meetings of the European Parliament between 1999 and 2014.
  • 94. Links in PoliMedia (“is about”) • About 3 million links
  • 95. Discovering links between politics and news. Pipeline: debates (speeches, with the date of the debate and the name of the speaker) -> detect topics in speeches and detect named entities in speeches -> create queries from the topics, named entities, name of speaker and date of debate -> search the newspaper archive -> candidate articles -> rank candidate articles -> links between speeches and articles.
  • 97. Step 2: generate links. [Figure: the linking pipeline of slide 95.] Intuition 1: the name of the speaker should appear in the article, and the article should be published within a week of the debate. Intuition 2: the more the article and the speech overlap in terms of topics and named entities, the more they are related.
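The two intuitions translate into a filter followed by a ranking. The sketch below is one possible reading of them, not the PoliMedia implementation: the class and function names are hypothetical, "within a week" is interpreted as at most seven days before or after the debate, and overlap is a plain count of shared topics and named entities.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Speech:
    speaker: str
    day: date
    topics: set = field(default_factory=set)
    entities: set = field(default_factory=set)


@dataclass
class Article:
    ident: str
    text: str
    day: date
    topics: set = field(default_factory=set)
    entities: set = field(default_factory=set)


def is_candidate(speech, article, max_days=7):
    """Intuition 1: speaker name occurs in the text, published within a week."""
    close = abs((article.day - speech.day).days) <= max_days
    return close and speech.speaker.lower() in article.text.lower()


def overlap(speech, article):
    """Intuition 2: number of shared topics plus shared named entities."""
    shared_topics = speech.topics & article.topics
    shared_entities = speech.entities & article.entities
    return len(shared_topics) + len(shared_entities)


def rank(speech, articles):
    """Candidate articles, most overlapping first."""
    candidates = [a for a in articles if is_candidate(speech, a)]
    return sorted(candidates, key=lambda a: overlap(speech, a), reverse=True)


# Hypothetical toy data.
speech = Speech("Jansen", date(1950, 3, 1), {"housing"}, {"Rotterdam"})
articles = [
    Article("a1", "Minister Jansen spoke on housing.", date(1950, 3, 3), {"housing"}, {"Rotterdam"}),
    Article("a2", "Jansen visited the harbour.", date(1950, 3, 4), set(), {"Rotterdam"}),
    Article("a3", "A report on housing.", date(1950, 3, 2), {"housing"}, set()),
    Article("a4", "Jansen looks back.", date(1950, 4, 15), {"housing"}, {"Rotterdam"}),
]
ranked = [a.ident for a in rank(speech, articles)]
```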
  • 100. Representation of links • Simple form: a single :isAbout statement between :speech123 and :newsArticle456. • As a link resource: :link001 connects :speech123 and :newsArticle456 and carries a link type (:quotes), references to :concept1 and :concept2, a :creationDate (01-02-2013) and :madeBy :PoliMedia_Linking_Engine. • Note: this is another example of the “design pattern” referred to as n-ary relations or relations as classes! • It allows us to save provenance information about the statements we create.
  • 104. Evaluation of links. 1. Manually rating (a sample of) links • relatively cheap and easy to interpret • only precision, no recall. 2. Comparison to a reference linkset • precision and recall • used in OAEI on the SEALS platform • more expensive if a reference alignment has to be created (but: crowdsourcing!). 3. End-to-end evaluation (a.k.a. evaluating an application that uses the mappings) • arguably the best method! • requires access to an application and its users.
  • 106. Evaluation of links: beyond precision / recall. Generalized precision and generalized recall • Instead of a binary classification into correct / incorrect mappings, take into account how wrong a link is, • where r(a) is the semantic distance between correspondence a and correspondence a’ in the reference alignment, and A is the number of correspondences. [Figure: nodes A, B and C and a relation r, shown at the data level and at the ontology / vocabulary / schema level.] Laura Hollink, Mark van Assem, Shenghui Wang, Antoine Isaac, Guus Schreiber. Two Variations on Ontology Alignment Evaluation: Methodological Issues. ESWC 2008.
  • 109. Evaluation of links in PoliMedia. How good are the links? • We ask 2 raters to manually score pairs of newspaper articles and speeches. • A pilot study showed that we needed more than a two-point scale. • Inter-rater agreement: 0.5 -> acceptable, but not high. • Precision: 80%. How many links did we miss? • We ask the raters to manually search the KB archives for related articles. • Recall: 62%. [Chart: Setting 1: 0.48; Setting 2: 0.62; Setting 3: 0.8.]
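For reference, precision and recall as used on this slide can be computed as follows. This is a minimal sketch with hypothetical toy data: it assumes the graded ratings have already been reduced to correct / incorrect, and the function names are assumptions.

```python
def precision(ratings):
    """Share of the rated links that the raters judged correct."""
    return sum(ratings) / len(ratings)


def recall(linked_by_system, found_by_raters):
    """Share of the articles found by the raters that the system also linked."""
    relevant = set(found_by_raters)
    return len(relevant & set(linked_by_system)) / len(relevant)


# Hypothetical toy data: 8 of 10 sampled links rated correct.
ratings = [True] * 8 + [False] * 2
# Hypothetical toy data: raters found 5 related articles, the system linked 3 of them.
system_links = {"a", "b", "c", "x"}
rater_finds = {"a", "b", "c", "d", "e"}

p = precision(ratings)
r = recall(system_links, rater_finds)
```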
  • 110. DEMO - PoliMedia search application
  • 116. Online database: “SPARQL endpoint” • A service to query a knowledge base using the SPARQL query language. “All speeches with more than 60 associated news items.”
  • 117. Access to Linked Open Data: how to serve and how to consume Linked Open Data
  • 120. Access to LOD: 1. download a data dump. From server logs we know the query - some context of the requested URIs - variable names (?)
  • 128. Access to LOD: 2. follow-your-nose. Start at lp:eu/plenary/2013-11-20/AgendaItem_6 and follow its links step by step: lp:eu/plenary/2013-11-20/AgendaItem_6 dc:title "Award of the Sakharov Prize (formal sitting)."@en ; dc:hasPart lp:eu/plenary/2013-11-20/Speech_103 , lp:eu/plenary/2013-11-20/Speech_104 . lp:eu/plenary/2013-11-20/Speech_103 lpv:spokenText "...the fittest need to struggle for the survival of the weak.[...]"@en ; lpv:speaker lp:Speaker_Malala_Yousafzai ; lpv:hasSubsequent lp:eu/plenary/2013-11-20/Speech_104 . lp:eu/plenary/2013-11-20/Speech_104 lpv:spokenText "...Ich glaube, das war ein außergewöhnlicher Moment für uns alle hier in diesem Parlament[...]"@en (in English: "I believe this was an extraordinary moment for all of us here in this Parliament") ; lpv:speaker lp:Martin_Schulz . lp:Martin_Schulz owl:sameAs http://dbpedia.org/resource/Martin_Schulz . From DBpedia: dbp:children "2" and a link to dbc:Officiers_of_the_Légion_d'honneur. From server logs we know the requested URI: GET /Martin_Schulz HTTP/1.0 Accept: application/rdf+xml
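The request recorded in the server log on this slide is an ordinary HTTP GET with an Accept header that asks for RDF instead of HTML. A minimal sketch of how a client could build and send such a request with the Python standard library is given below. The function names are assumptions, and whether a given server actually returns RDF/XML for this header is not guaranteed.

```python
from urllib.request import Request, urlopen


def rdf_request(uri, accept="application/rdf+xml"):
    """Build a GET request that asks the server for an RDF representation."""
    return Request(uri, headers={"Accept": accept})


def dereference(uri):
    """Fetch the RDF description of a resource (performs a network call)."""
    with urlopen(rdf_request(uri), timeout=10) as response:
        return response.read()


# Building the request does not touch the network.
request = rdf_request("http://dbpedia.org/resource/Martin_Schulz")
```

Following your nose then means parsing the returned triples, picking a URI of interest from them, and calling `dereference` again on that URI.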
  • 129. Access to LOD: 3. SPARQL. Count the agenda items in which at least one MEP from France spoke out. SELECT (COUNT(DISTINCT ?ai) AS ?count) WHERE { ?ai rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/AgendaItem> . ?ai dcterms:hasPart ?speech . ?speech lpv:speaker ?speaker . ?speaker lpv:countryOfRepresentation ?country . ?country rdfs:label ?label . FILTER(?label = "France"@en) }
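A SPARQL endpoint accepts such a query over HTTP; in the SPARQL 1.1 Protocol, a query sent via GET travels in the `query` URL parameter. The sketch below only builds the request URL. The endpoint address is a placeholder, not the address of the Talk of Europe endpoint, and the short query stands in for the longer one on the slide.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint, chosen for illustration only.
ENDPOINT = "http://example.org/sparql"

# A deliberately small query; any SPARQL query string can be passed instead.
QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"


def sparql_get_url(endpoint, query):
    """URL for a SPARQL query sent via HTTP GET (percent-encodes the query)."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
```

This is also why the full query text ends up in the server log of the endpoint.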
  • 131. From server logs we know the query -some context of the requested URIs -variable names (?)
  • 136. Access to LOD: 4. Linked Data Fragments. xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000] "GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" 200 1309 "http://fragments.dbpedia.org/2014/en" … From server logs we know the triple patterns that were requested - some context of the requested URIs - variable names (?)
  • 137. What do we know about usage of Linked Open Data?
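One concrete example of what usage logs reveal: a log line of a Linked Data Fragments server, such as the one shown for access method 4, can be turned back into the triple pattern that the client asked for. The sketch below does this with the standard library. The function name and the placeholder variables ?s, ?p and ?o for empty parameters are assumptions.

```python
import re
from urllib.parse import parse_qs, urlsplit

# The log line from the Linked Data Fragments slide.
LOG_LINE = (
    'xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000] '
    '"GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" '
    '200 1309 "http://fragments.dbpedia.org/2014/en"'
)

REQUEST = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def triple_pattern(log_line):
    """Return (subject, predicate, object); empty parameters become variables."""
    found = REQUEST.search(log_line)
    if found is None:
        return None
    params = parse_qs(urlsplit(found.group(1)).query, keep_blank_values=True)

    def term(name, variable):
        value = params.get(name, [""])[0]
        return value or variable

    return (term("subject", "?s"), term("predicate", "?p"), term("object", "?o"))


pattern = triple_pattern(LOG_LINE)
```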
  • 140. Linked Open Data query log analysis? USEWOD, 2011-2016: 1. Yearly datasets of server logs released for research purposes, 2011-2016. Luczak-Roesch, Markus, Aljaloud, Saud, Berendt, Bettina and Hollink, Laura (2016) USEWOD 2016 Research Dataset. doi:10.5258/SOTON/385344 2. Yearly workshops for researchers on Usage Data and the Web of Data, 2011-2016. Laura Hollink, Markus Luczak-Roesch, Bettina Berendt, et al. http://usewod.org/ Licensing + anonymization: replace all IPs with a country code and an identifier.
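The anonymization step, replacing each IP address with a country code and an identifier, could look roughly like the sketch below. It is an illustration under assumptions, not the USEWOD procedure: the class name, the `NL-1` output format, the `ZZ` code for unknown countries, and the in-memory IP-to-country table are all hypothetical (a real pipeline would use a geolocation database).

```python
from itertools import count


class Anonymiser:
    """Replace the leading IP of a log line with '<country>-<identifier>'."""

    def __init__(self, country_of):
        self.country_of = country_of  # hypothetical IP -> country code lookup
        self.pseudonyms = {}
        self.next_id = count(1)

    def pseudonym(self, ip):
        # The same IP always maps to the same pseudonym, so sessions stay intact.
        if ip not in self.pseudonyms:
            country = self.country_of.get(ip, "ZZ")
            self.pseudonyms[ip] = f"{country}-{next(self.next_id)}"
        return self.pseudonyms[ip]

    def anonymise(self, log_line):
        ip, rest = log_line.split(" ", 1)
        return f"{self.pseudonym(ip)} {rest}"


# Addresses from the documentation ranges; the country table is made up.
anonymiser = Anonymiser({"192.0.2.1": "NL", "198.51.100.7": "FR"})
first = anonymiser.anonymise('192.0.2.1 - - "GET /sparql?query=..." 200')
second = anonymiser.anonymise('198.51.100.7 - - "GET /resource/x" 200')
third = anonymiser.anonymise('192.0.2.1 - - "GET /resource/y" 200')
```

Keeping a stable identifier per IP preserves the ability to study sessions and repeated queries, while the address itself no longer appears in the released logs.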
  • 141. What has been found so far? • Efficient index generation [1] • Caching [2] • Auto-completion [3] • Hardware scaling at peak times [4] • Modularisation of data [4] Issues: • What is the difference between queries by machines and humans? [5] • What is the meaning of repeated queries by tools? Bots? • A lot of the usage is invisible due to data dump download [6] References: [1] Arias, M., Fernández, J. D., Martínez-Prieto, M. A., & de la Fuente, P. An empirical study of real-world SPARQL queries. USEWOD2011. [2] Lorey, J., & Naumann, F. Caching and prefetching strategies for SPARQL queries. USEWOD2013. [3] Kramer, K., Dividino, R. Q., & Gröner, G. SPACE: SPARQL Index for Efficient Autocompletion. ISWC2013 (Posters & Demos). [4] Luczak-Rösch, M., & Bischoff, M. Statistical analysis of web of data usage. EvoDyn2011. [5] Rietveld, L., & Hoekstra, R. Man vs. Machine: Differences in SPARQL Queries. USEWOD2014. [6] Huelss, J., & Paulheim, H. What SPARQL Query Logs Tell and do not Tell about Semantic Relatedness in LOD. NoISE @ ESWC 2015.
  • 143. Reflection: to what extent can we now answer these questions? How did the debate about the financial crisis in Greece develop? Which political event has attracted most media attention? What are the differences between different media? Has the coverage changed over time? Yes, but: • What is the influence of the selection of newspapers available at the National Library? • What was the quality of the digitisation process (OCR)? • How good is our linking approach (based on automatically detected entities and topics)? ➡ How to handle these uncertainties is one of our research questions! We call this Tool Criticism.
  • 144. Resources: PoliMedia demo: http://polimedia.nl/ PoliMedia project video: https://youtu.be/u24oRCj7xrQ Talk of Europe project: http://talkofeurope.eu/ Talk of Europe data: purl.org/linkedpolitics Talk of Europe project video: https://youtu.be/GxA53gkCe0o USEWOD workshop: http://usewod.org/ My website: http://homepages.cwi.nl/~hollink/ I’d be happy to answer your questions!