SlideShare ist ein Scribd-Unternehmen logo
1 von 163
Downloaden Sie, um offline zu lesen
Linked Data for the
Humanities: methods and
techniques
Enrico Daga
The Open University
Aldo Gangemi
Università di Bologna
Tutorial @ DH2019, Utrecht, 8th July
Albert Meroño-Peñuela
 Vrije Universiteit Amsterdam
Special Guest
14.00 Session I
• Linked Data in a nutshell
• Producing Linked Data
15.30 (Coffee break)
16.00 Session II
• Consuming Linked Data
• Hybrid Methods
Welcome
Linked Data in a nutshell
Intro
A bit of history
Invented the web in 1989
(yeah!)
Invented the semantic
web in 1994 (duh?)
“To a computer, then, the web is a flat,
boring world devoid of meaning”
Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
“This is a pity, as in fact documents on the
web describe real objects and imaginary
concepts, and give particular relationships
between them”
Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
“Adding semantics to the web involves two things:
allowing documents which have information in
machine-readable forms, and allowing links to be
created with relationship values.”
Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
“The Semantic Web is not a separate Web but an
extension of the current one, in which information
is given well-defined meaning, better enabling
computers and people to work in cooperation.”
Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
Linked Data is a way of publishing structured information
that allows datasets to be connected and enriched by the
means of links among their entities.
• LD uses the World Wide Web as publishing platform
• Based on W3C standards - open to everyone
• Enables your data to refer to other data
• … and other data to refer to yours!
Linked Data in a nutshell
h"ps://en.wikipedia.org/wiki/Linked_data
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
Linked Open Data in 2007
2008
https://lod-cloud.net/
2009
https://lod-cloud.net/
Linked Data: The story so far (2009)
2010
https://lod-cloud.net/
2011
https://lod-cloud.net/
http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
(crawlable) 2014
2017
https://lod-cloud.net/
03/2019
https://lod-cloud.net/
https://lod-cloud.net/
Building blocks
The very basics
• A	hierarchy	of	languages	
• Each	layer	exploits	and	uses	
capabilities	of	the	layers	below
The W3C “Layer Cake”
• A principle: hypertext
• A protocol: HTTP
• An identification scheme: URNs/URIs
• A language: HTML
The traditional Web
• A principle: hypertext
• A protocol: HTTP
• An identification scheme: URNs/URIs
• A language: HTML RDF
The semantic Web
• Uniform Resource Identifiers (URIs)
• To identify things
• HyperText Transfer Protocol (HTTP)
• To access data about them
• Resource Description Framework (RDF)
• a meta-model for data representation.
• it does not specify a particular schema
• offers a structure for representing schemas and data
• SPARQL Protocol and Query Language (SPARQL)
• To query LD databases directly on the Web
Linked Data Technology Stack
• A Uniform Resource Identifier (URI) is a compact sequence of
characters that identifies an abstract or physical resource.
[RFC3986]
• Syntax
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
• Example
foo://example.com:8042/over/there?name=ferret#nose
_/ _________________/_________/ __________/ __/
| | | | |
scheme authority path query fragment
HTTP URIs
• URIs (Unique Resource Identifiers) are used to identify things (also
called entities) in the real world
• For instance: people, places, events, companies, products, movies, etc.
A Web of Things
HTTP
Simplest thing ever
• On top of
• The Internet Protocol (IPv4)
• Domain Name System (DNS): e.g. dbpedia.org
• A Client / Server protocol: Request -> Response
• Message structure: Headers + Body (content)
Resource Description Framework
Rela%onships	between things are expressed by the means of
a multi-directed, fully labeled graph		
where
nodes	could be resources or XMLSchema-typed values;
rela%onships	are also identified by URIs
The RDF model
(the “content” of the HTTP body…)
RDF is based on an atomic element: the triple.
Triple: (subject predicate object)
- subject: a URI or a blank node
- predicate: MUST be a URI
- object: a URI, a blank node, or a literal
The RDF Triple
•
Example RDF Graph
0341Leipzig
hasAreaCode
Burkhard	Jung
hasMayor
Saxony
locatedIn
51.3333
latitude
12.3833
longitude
Germany
Social	Democratic	Party
1958-03-07 isMemberOf
locatedIn
born
isMayorOf
• Representation of data values
• Serialization as strings
• Interpretation based on the datatype
• Literals without Datatype are treated as strings
• and can be annotated with a language (Alpha-2): @en
Literals
Leipzig
Burkhard	Jung
51.3333latitude
12.3833
longitude
1958-03-07
born
isMayorOf hasMayor
• N-Triples (application/n-triples)
• Turtle (text/turtle)
• RDF/XML (application/rdf+xml)
• N-Quads (application/n-quads)
• TriG (application/trig)
• […]
RDF serializations
N-Triples
https://www.w3.org/TR/n-triples/
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart>
<http://dbpedia.org/ontology/deathPlace>
<http://dbpedia.org/resource/Vienna> .
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart>
<http://dbpedia.org/ontology/birthDate>
"1756-1-27"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart>
<http://dbpedia.org/ontology/deathDate>
"1791-12-5"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart>
<http://dbpedia.org/ontology/birthPlace>
<http://dbpedia.org/resource/Salzburg> .
• Namespaces in XML: https://www.w3.org/TR/xml-names/
• Namespaces end either with # or /
• In serialisations, are mapped to prefixes, for brevity
• http://prefix.cc to get help with namespaces and common
prefixes
• http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
• http://dbpedia.org/resource/
• dbr:Wolfgang_Amadeus_Mozart
Namespaces
Turtle
https://www.w3.org/TR/turtle/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix yago: <http://dbpedia.org/class/yago/> .
@prefix wikidata: <http://www.wikidata.org/entity/> .
dbr:Wolfgang_Amadeus_Mozart
rdfs:label "Wolfgang Amadeus Mozart” ;
rdf:type owl:Thing , yago:WikicatGermanClassicalComposers ,
yago:WikicatGermanComposers , dbo:Person .
dbr:Wolfgang_Amadeus_Mozart owl:sameAs wikidata:Q254 ;
dbo:deathPlace dbr:Vienna .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
dbr:Wolfgang_Amadeus_Mozart dbo:deathDate "1791-12-5"^^xsd:date ;
dbo:birthPlace dbr:Salzburg ;
dbo:birthDate "1756-1-27"^^xsd:date .
• SPARQL: https://www.w3.org/TR/sparql11-overview/
• OWL: https://www.w3.org/OWL/
• RIF: https://www.w3.org/TR/rif-overview/
• SHACL: https://www.w3.org/TR/shacl/
• SPIN: https://spinrdf.org/
• RDFa: https://www.w3.org/TR/rdfa-syntax/
• JSON-LD: https://json-ld.org/
Many languages
• Triple Stores: database management systems that allow to
query RDF
• RDF1.1 named graphs allow to integrate multiple RDF
documents preserving the context of each triple: g s p o
• Syntax: N-Quads
Named Graphs
1. Use URIs to identify the “things” in your data
2. Use h2p://	URIs so people (and machines) can look them
up on the web
3. When a URI is looked	up, request/return	a descrip%on	of
the thing in RDF
4. Include links	to	related	things	(e.g.	owl:sameAs)
Linked Data principles
Something very basic
http://www.w3.org/DesignIssues/LinkedData.html
Linked Data example
23
Follow your nose
Hands-On
• Understand URI resolution
• Grasp Content-Negotiation
• Experience graph traversal
• https://ld4humanities.github.io/ > Hands-On resources
Objectives
HTTP
Simplest thing ever
• On top of
• The Internet Protocol (IPv4)
• Domain Name System (DNS): e.g. dbpedia.org
• A Client / Server protocol: Request -> Response
• Message structure: Headers + Body
https://www.slideshare.net/randyconnolly/chapter01-presentation-16514220
Headers
• Vary between Request and Response
(two newlines)
Body
• Any data
HTTP
Message structure
GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: */*
HTTP
http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
HTTP/1.1 303 See Other
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Connection: keep-alive
Server: Virtuoso/07.20.3230
Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
Access-Control-Allow-Origin: *
…
REQUESTRESPONSE
cURL is a command line tool and library for transferring data
with URLs
wURL is a simple web app that allows non-Unix users to use
cURL from a Web browser
http://purl.org/ld4dh/wurl
https://curl.haxx.se/
… let’s try …
GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: */*
HTTP
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
HTTP/1.1 303 See Other
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Connection: keep-alive
Server: Virtuoso/07.20.3230
Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
Access-Control-Allow-Origin: *
…
REQUESTRESPONSE
GET /page/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: */*
HTTP
curl -v http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
HTTP/1.1 200 OK
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Connection: keep-alive
Server: Virtuoso/07.20.3230
Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
[…]
<html> […]
GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: text/turtle
HTTP
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
-H “Accept: text/turtle”
HTTP/1.1 303 See Other
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Connection: keep-alive
Server: Virtuoso/07.20.3230
Location: http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl
GET /data/Wolfgang_Amadeus_Mozart.ttl HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: text/turtle
HTTP
curl -v http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl
HTTP/1.1 200 OK
Content-Type: text/turtle; charset=UTF-8
Content-Length: 50708
[…]
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .
dbr:Amadeus_Mozart dbo:wikiPageRedirects dbr:Wolfgang_Ama
dbr:The_Story_of_Mozart dbo:wikiPageRedirects dbr:Wolfga
dbr:Mozartian dbo:wikiPageRedirects dbr:Wolfgang_Amadeus_
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
curl -v “http://dbpedia.org/page/Wolfgang_Amadeus_Mozart"
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
-H “Accept: text/turtle”
curl -v “http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl”
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
-H “Accept: text/turtle” -L
• In what formats is Mozart available?
• text/html, application/rdf+xml, text/n-triples, text/turtle
• Find Mozart's image (it’s a jpg)
• When Mozart was born?
• Where Mozart died?
• How many inhabitants has the city today?
How many Mozart?
http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
• Find the location of the experience
• When did it happened?
• Who is the listener?
• What musical opera was performed?
• What is the author of the listened music?
• What is the performer?
• What is the genre
• Find information about this genre
• Can you find other operas of the same genre?
a Listening Experience
http://data.open.ac.uk/led/lexp/1446304716352
This type of task is possible using a SPARQL endpoint:
http://dbpedia.org/sparql
A scent of SPARQL
“Find other operas of the same genre”
SELECT * WHERE {
?entity
<http://purl.org/dc/terms/subject>
<http://dbpedia.org/resource/Category:Grand_operas> .
}
1. https://www.theguardian.com/this-page-does-not-exists
2. urn:issn:23346587
3. issn:23346587
4. mailto:enrico.motta@open.ac.uk
5. dbr:Music
Quiz
Which of the following are not valid RDF IRIs?
Producing Linked Data
1. Knowledge representation
1. Identify the source
2. Understand the content (domain)
3. Modelling: reuse or build an ontology
2. Produce RDF
1. Populate the ontology
2. Encode or (re)engineer in RDF - “triplification”
3. Put it on the Web and provide services to access and query the data
1. Support URI dereferencing (Content negotiation)
2. Expose a SPARQL Endpoint
3. Describe your dataset with Linked Data (ehm …start over)
So you want to do Linked Data?
• World’s academic communities has been dealing for
years with knowledge	representa%on	
• Ar%ficial	intelligence, natural language processing, model
management, and many other research fields largely
contributed
• Some ancestors	traced the way
How to represent knowledge?
EXAMPLE
• Instances are associated with one or several
classes:
Boddingtons rdf:type Ale .
Grafentrunkrdf:type Bock .
Hoegaarden rdf:type White .
Jever rdf:type Pilsner .
Ontologies
different levels of detail & complexity
Complexity
Types
Labels
Descriptions
Comments
Class
Hierarchies
Relations
Documented
meaning
Basic Logic
Rules
Inferences
Transitivity
Domain
Range
Rules
Description Logic
Reasoning
Class unions
Sets semantics
Intersections
Disjointness
[…]
light-weight heavy-weight
Copyright	IKS	Consortium	
• A vocabulary for describing properties and classes of RDF
resources
• rdfs:Resource
• rdf:type
• rdfs:Class
• rdf:Property
• rdfs:subClassOf
• rdfs:subPropertyOf
• rdfs:domain
• rdfs:range
RDF Schema
http://www.w3.org/TR/rdf-schema/
• OWL allows to specify other axioms
• Property cardinality	restric%ons		
• Classes disjunc%on	
• Property transi%vity		
• Cardinality	constraints
• But beware: more	expressivity	means more reasoning	
complexity
The Web Ontology Language (OWL)
formal language for automated reasoning
The Web Ontology Language (OWL)
formal language for automated reasoning
:Novel rdf:type owl:Class.
:Short_Story rdf:type owl:Class.
:Poetry rdf:type owl:Class.
:Literature rdf:type owl:Class;
owl:unionOf (:Novel :Short_Story :Poetry).
<myWork> rdf:type :Novel .
<myWork> rdf:type :Literature .
IF
THEN
http://ontologydesignpatterns.org
• Schema layer of RDF
• Defines terms (classes and properties)
• Typically RDFS or OWL family
• Reusability is important for supporting interoperability
• Common vocabularies: Dublin Core, SKOS, FOAF, SIOC,
vCard, DOAP, Core Organization Ontology, VoID
Vocabularies
light-weight semantics
http://www.slideshare.net/prototypo/introduction-to-linked-data-rdf-vocabularies
!69
Vocabulary: Friend-of-a-Friend (FOAF)
defines classes and properties for representing

information about people and their

relationships
Soeren rdf:type foaf:Person .
Soeren currentProject http://OntoWiki.net .
Soeren foaf:homepage http://aksw.org/Soeren .
Soeren foaf:knows http://sembase.at/Tassilo .
Soeren foaf:sha1 09ac456515dee .
!70
Vocabulary: Semantically

Interlinked Online Communities.
Represent content from Blogs, Wikis, Forums, Mailinglists, Chats
etc.
!71
Vocabulary: Simple Knowledge Organization
System (SKOS)
support the use of thesauri, classification schemes, subject

heading systems and taxonomies
• DBpedia	Ontology	Schema:	
• manually	created	for	DBpedia	(infoboxes)	
• 1140	classes	+	1149	object	properties	+	1741	datatype	properties;	>7K	axioms	(1537	on	C,	2676	on	
OP,	3264	on	DTP:	1.3,	2.3,	1.8	ratios);		
• (200M	triples	in	DBpedia)	
• YAGO:	
• large	hierarchy	linking	Wikipedia	leaf	categories	to	WordNet	
• 250,000	classes	
• UMBEL	(Upper	Mapping	and	Binding	Exchange	Layer):	
• 20000	classes	derived	from	OpenCyc	
• DOLCE-Zero	(Foundational	Ontology,	aligned	to	DBpedia):	
• 76	classes	+	105	object	properties	+	5	datatype	properties;	596	axioms	(196	on	C,	389	on	OP,	11	on	
DTP:	2.4,	3.7,	2.2	ratios)	
• presence	of	“restrictions”,	top-level	disjointness,	and	patterns	
• Wikipedia	Categories:	
• Not	a	class	hierarchy	(e.g.	cycles),	represented	using	SKOS	
• 415,000+	categories
2011/05/12
General	Purpose	Ontologies
(different levels of detail & complexity)
Domain Ontologies
(different levels of detail & complexity)
https://lov.linkeddata.es/dataset/lov/
1. From a Relational Database
2. From Web content (Scraping)
3. From XML or other structured data formats
4. From a data table (e.g. a CSV file)
5. From natural language (Sic!)
How to produce RDF?
• W3C R2RML - language to specify
mappings between SQL databases and
RDF: http://www.w3.org/TR/r2rml/
• D2RQ - allows to access relational
databases as virtual graphs: http://d2rq.org/
• DB2Triples - runs a specified R2RML file
and generates RDF: https://github.com/
antidot/db2triples
1. From a relational database
http://data.cnr.it/data/cnr/individuo/CNR
• RDFa and microformats are used to embed semantic
information (expressed using the RDF model) into regular
HTML pages
• RDFa does it using existing (rel) and additional
(about, property, typeof) attributes
• Microformats only use usual HTML attributes (class)
• To extract, e.g., Apache any23: https://any23.apache.org
2. From Web pages
DBpedia is the de-facto Hub of LOD.
• descrip%ons	of	ca.	3.4	million	things	(1.5 million classified in a consistent ontology,
including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films,
15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases
• labels and abstracts for these 3.2 million things in up to 92 different languages;
1,460,000 links to images and 5,543,000 links to external web pages;

4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories,
and 75,000 YAGO categories
• altogether over	1	billion	pieces	of	informa%on	(i.e. RDF triples): 257M from English
edition, 766M from other language editions
• DBpedia	Live	(http://live.dbpedia.org/sparql/) &

Mappings	Wiki	(http://mappings.dbpedia.org)

integrate the community into a refinement cycle
Extracting structured information from Wikipedia and make this
information available on the Web as LOD:
• link other data sets on the Web to Wikipedia data (encyclopaedic
knowledge)
• ask sophisticated queries against Wikipedia (e.g. universities in
Paris, mayors of towns in a certain region),
• Represents a community consensus
Transforming Wikipedia into a Knowledge Base
Structure in Wikipedia
• Title	
• Abstract	
• Infoboxes	
• Geo-coordinates	
• Categories	
• Images	
• Links	
– other	language	versions	
– other	Wikipedia	pages	
– To	the	Web	
– Redirects	
– Disambiguations
Infobox	templates
{{Infobox Korean settlement
| title = Busan Metropolitan City
| img = Busan.jpg
| imgcaption = A view of the [[Geumjeong]] district in Busan
| hangul = 부산 광역시
...
| area_km2 = 763.46
| pop = 3635389
| popyear = 2006
| mayor = Hur Nam-sik
| divs = 15 wards (Gu), 1 county (Gun)
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}
http://dbpedia.org/resource/Busan
dbp:Busan dbpp:title ″Busan Metropolitan City″
dbp:Busan dbpp:hangul ″부산 광역시″@Hang
dbp:Busan dbpp:area_km2 ″763.46“^xsd:float
dbp:Busan dbpp:pop ″3635389“^xsd:int
dbp:Busan dbpp:region dbp:Yeongnam
dbp:Busan dbpp:dialect dbp:Gyeongsang
...
Wikitext-Syntax
RDF	representation
2011/05/12
83
DBpedia	SPARQL	Endpoint
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbc: <http://dbpedia.org/resource/Category:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?birth ?description ?person WHERE {
?person dbo:birthPlace dbr:Berlin .
?person dct:subject dbc:German_musicians .
?person dbo:birthDate ?birth .
?person foaf:name ?name .
?person rdfs:comment ?description .
FILTER (LANG(?description) = 'en') .
} ORDER BY ?name
2011/05/12
• hosted on a OpenLink Virtuoso server
• can answer SPARQL queries like
• Give me all Sitcoms that are set in NYC?
• All tennis players from Moscow?
• All films by Quentin Tarentino?
• All German musicians that were born in Berlin in the 19th
century?
• All soccer players with tricot number 11, playing for a club having
a stadium with over 40,000 seats and is born in a country with
over 10 million inhabitants?
DBpedia	SPARQL	Endpoint
http://dbpedia.org/sparql
• Two steps:
• Remodelling task
• Reengineering task
• Web APIs
• JSON: annotate with JSON-LD https://json-ld.org/
• XML
• XML != RDF
• XML serialisation of DOM (tree), RDF is a graph instead, no root.
• eXtensible Stylesheet Language Transformations (XSLT) to
generate a RDF format, e.g. N-Triples
3. From Web APIs, XML or other formats
• data.open.ac.uk is the home of The Open University LOD
• 2010, OU first university in the UK to publish LOD.
• Collects and interlinks open data from institutional
repositories of the University, and makes it available as LD
data.open.ac.uk
Open Educational Resources
• Metadata about educational resources produced
or co-produced by The Open University
• OU/BBC Coproductions | OU podcasts |
OpenLearn | Videofinder
Scientific Production
• Metadata about scientific production of The
Open University
• Open Research Online (http://
oro.open.ac.uk/)
Social Media
• Content hosted by social media web sites.
• Metadata are extracted from public APIs and
aggregated into RDF.
• Audioboo | YouTube
Datasets
http://data.open.ac.uk
Organisational
• Data collected form internal repositories and first
made public as linked data.
• The OU's Key Information Set from Unistats |
OU People Profiles | KMi People Profiles | Open
University data XCRI-CAP 1.2 | Qualifications |
Courses | OU Planet Stories
Data from Research Projects
• Linked Data from research projects.
• Arts and Humanities Research Council project
metadata | The Listening Experience Database |
The UK Reading Experience Database | The
Reading Experience Database: DBpedia
alignments
• Two tasks: remodelling & reengineering
• Homemade recipe:
1. Find your identifier(s), establish namespaces
2. Map columns to predicates, establish cell value type
(URI or Literal)
3. Iterate over the rows
4. Generate a triple for each cell
4. From a data table
• A Google Form Spreadsheet
• Prepare column names (first row)
• Identify the Subject column (S)
• Generate a tuple for each column value (S, c, v) - G SQL
• Clean: remove tuples with empty values
• Format tuples into valid N3 triples
Example
(only reengineering)
https://docs.google.com/spreadsheets/d/
1j_LHZIOhkbD61r7fSxuf4017tgbOoL_Z6tLT0oDQz_0/edit?usp=sharing
1. Load the data into a Triple Store
• Virtuoso Open Source: virtuoso.openlinksw.com
• Apache Jena: http://jena.apache.org/
• Blazegraph: www.blazegraph.com
• https://en.wikipedia.org/wiki/Comparison_of_triplestores
2. Publish the SPARQL Endpoint
3. Setup content negotiation
• http://www.example.com/…
303 to SPARQL DESCRIBE <http://www.example.com/...>
How to publish on the Web?
(signposting only here)
Coffee break
See you at 4pm (sharp!)
Consuming Linked Data
SPARQL
• Understand triple patterns
• Try with some features of the language
• https://ld4humanities.github.io/ > Hands-On resources
Objectives
SPARQL
SPARQL Protocol And RDF Query Language
Triple and Graph Patterns
How do we describe the structure of the RDF graph
which we're interested in?
# An RDF triple in Turtle syntax
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
dbr:Wolfgang_Amadeus_Mozart foaf:name ?name .
# A SPARQL triple pattern, with a single variable
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
dbr:Wolfgang_Amadeus_Mozart foaf:name ?name .
# All parts of a triple pattern can be variables
?subject foaf:name ?name.
# Matching labels of resources
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
?subject rdfs:label ?label.
# Combine triples patterns to create a graph pattern
PREFIX dby: <http://dbpedia.org/class/yago/>
?subject rdfs:label ?label .
?subject rdf:type dby:WikicatOperaComposers .
# SPARQL is based on Turtle, which allows abbreviations
# e.g. predicate-object lists:
?subject rdfs:label ?label;
rdf:type dby:WikicatOperaComposers .
# Graph patterns allow us to traverse a graph
?person rdfs:label “Wolfgang Amadeus Mozart”@de .
?person dbo:deathPlace ?place .
?place dbo:populationTotal ?population .
#Graph patterns allow us to traverse a graph
?person rdfs:label “Wolfgang Amadeus Mozart”@de .
?person dbo:deathPlace ?place .
?place dbo:populationTotal ?population .
Structure of a Query
What does a basic SPARQL query look like?
# Query. 1
# Associate URIs with prefixes
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Example of a SELECT query, retrieving 2 variables
# Variables selected MUST be bound in graph pattern
SELECT ?person ?label
WHERE {
#This is our graph pattern
?person rdfs:label “Wolfgang Amadeus Mozart”@de ;
dbo:deathPlace ?place .
?place dbo:populationTotal ?population
}
• https://ld4humanities.github.io/ > Hands-On resources
• We will use this UI: http://yasgui.org/
• Credits:
Let’s try it out
http://about.yasgui.org/
http://laurensrietveld.nl/
# Query. 2
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Example of a SELECT query, retrieving all variables
SELECT *
WHERE {
?person rdfs:label “Wolfgang Amadeus Mozart”@de ;
dbo:deathPlace ?place .
?place dbo:populationTotal ?population .
}
OPTIONAL bindings
How do we allow for missing or unknown
information?
# Query. 3
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?name ?image
WHERE {
#This pattern must be bound
?person rdfs:label "Wolfgang Amadeus Mozart"@de ;
dbo:birthPlace ?place .
#Anything in this block doesn't have to be bound
OPTIONAL {
?place dbo:populationTotal ?population .
}
}
UNION queries
How do we allow for alternatives or variations in the
graph?
# Query. 4
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?person ?place
WHERE {
{
?person dbo:deathPlace ?place .
}
UNION
{
?person dbo:birthPlace ?place .
}
}
Sorting & Restrictions
How do we apply a sort order to the results?
How can we add restrictions?
How can we restrict the number of results returned?
# Query. 5
# Select the URI and population of all places
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
}
# Ex. 6
# Select the URI and population of all places
# with highest first
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
}
# Use an ORDER BY clause to apply a sort.
# Can be ASC or DESC
ORDER BY DESC(?population)
# Ex. 7
# Select the URI and population of a city
# with highest first
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
FILTER EXISTS {
?place dbp:countryCode []
}
}
# Use an ORDER BY clause to apply a sort.
# Can be ASC or DESC
ORDER BY DESC(?population)
# Ex. 8
# Select the URI and population of the 11-20th most
populated countries
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
FILTER EXISTS {
?place dbp:countryCode []
}
}
# Use an ORDER BY clause to apply a sort.
ORDER BY DESC(?population)
# Limit to first ten results
LIMIT 10
# Apply an offset to get next “page”
OFFSET 10
Filtering
How do we restrict results based on aspects of the
data rather than the graph, e.g. string matching?
# In the following triple the literal has assigned a
# datatype to indicate it is a date
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
dbr:Wolfgang_Amadeus_Mozart
dbo:birthDate "1756-1-27"^^xsd:date
# Query. 9
# Select name of persons born between 1st Jan 1756 and
1st Jan 1757
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name
WHERE {
?person dbo:birthDate ?date;
foaf:name ?name.
FILTER (?date > "1756-01-01"^^xsd:date &&
?date < "1757-01-01"^^xsd:date)
}
# Query. 10
# Select the URI and population of places with an area
below 20km^2, with most populated first
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbpp: <http://dbpedia.org/ontology/PopulatedPlace/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population ;
dbpp:areaTotal ?area .
# Note that we have to cast the data to the right type
# As it is not declared in the data
FILTER( xsd:double(?area) < 20 )
}
ORDER BY DESC(?population)
# Query. 11
# Select persons named Wolfgang
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?subject ?name
WHERE {
?subject foaf:name ?name ;
dbo:deathPlace dbr:Vienna .
FILTER( regex(?name, "Wolfgang", "i" ) )
}
• Logical: !, &&, ||
• Math: +, -, *, /
• Comparison: =, !=, >, <, ...
• Variable tests: isURI, isBlank, isLiteral, bound
• Accessors: str, lang, datatype
• Other: sameTerm, langMatches, regex
Built-In Filters
DISTINCT
How do we remove duplicate results?
# Query. 12
# Select list of places that gave birth to german
classical composers
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?place
WHERE {
[] dbo:birthPlace ?place ;
dct:subject dbc:German_classical_composers
}
SPARQL Query Forms
Does SPARQL do more than just SELECT data?
ASK
Test whether the graph contains some data of
interest
# Query. 13
# Is Mozart’s date of birth 1756-1-27?
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ASK WHERE {
dbr:Mozart space:launched "1756-1-27"^^xsd:date .
}
# ASK returns a boolean value
DESCRIBE
Generate an RDF description of a resource(s)
# Query. 14
# Describe persons born in 1757
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbp: <http://dbpedia.org/property/>
DESCRIBE ?person {
?person dbp:birthDate ?date .
FILTER ( ?date < "1958-01-01"^^xsd:date &&
?date >= "1757-01-01"^^xsd:date )
}
CONSTRUCT
Create a custom RDF graph based on query criteria
Can be used to transform RDF data
@prefix exp: <http://www.example.com/property#> .
<mailto:albert.meronyo@gmail.com>
exp:timestamp "05/06/2019 14:38:16" ;
exp:xml_level "Basic" ;
exp:rdf_level "Expert" ;
exp:sparql_level "Expert" ;
exp:web_level "Expert" ;
exp:ontology_level "Expert" ;
exp:ld_publishing_level "Expert" ;
exp:ld_consumption "Expert" ;
exp:ld_ui_level "Expert" ;
exp:interested_in "Advanced Ontology Design, Linked
Data Production, Linked Data Consumption, Success
stories, Advanced Linked Data Techniques, Advanced
SPARQL" ;
exp:known_projects "JazzCats, LinkezBrainz, MIDI
Linked Data cloud, LOD Laundromat" ;
exp:plans_of_use "Yes; music, history" .
# Example
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT {
?person rdf:type foaf:Person
}
WHERE {
?person ex:timestamp []
}
# Example.
# Remodelling! Change the identifier
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT {
?person rdf:type foaf:Person ;
foaf:mbox ?mbox
?person ?predicate ?any
}
WHERE {
BIND(
CONCAT(
“http://www.ld4humanities/2019/participant/“,
MD5(str(?mbox))) AS ?person) .
?mbox ?predicate ?any
}
NAMED GRAPHS
SPARQL can query multiple RDF graphs together!
# Query. 15
# Search in multiple Graphs
SELECT
distinct ?type
FROM <http://data.open.ac.uk/context/youtube>
FROM <http://data.open.ac.uk/context/podcast>
FROM <http://data.open.ac.uk/context/openlearn>
FROM <http://data.open.ac.uk/context/course>
FROM <http://data.open.ac.uk/context/qualification>
WHERE{
[] a ?type
}
# Query. 16
# Search in multiple Graphs
SELECT
distinct ?g ?type
FROM NAMED <http://data.open.ac.uk/context/youtube>
FROM NAMED <http://data.open.ac.uk/context/podcast>
FROM NAMED <http://data.open.ac.uk/context/openlearn>
FROM NAMED <http://data.open.ac.uk/context/course>
FROM NAMED <http://data.open.ac.uk/context/
qualification>
WHERE{
GRAPH ?g { [] a ?type }
}
Videos from the Open University on YouTube.
YouTube videos are linked to courses and qualifications, which in
turn are linked to other entities (OpenLearn units, Podcasts,
Audios, and other Courses or Qualifications)
Find OU content related to a YouTube video from the YouTube
video:
https://www.youtube.com/watch?v=SYry6PYsL8o
http://data.open.ac.uk/youtube/SYry6PYsL8o
http://data.open.ac.uk
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix podcast: <http://data.open.ac.uk/podcast/ontology/>
prefix yt: <http://data.open.ac.uk/youtube/ontology/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rkb: <http://courseware.rkbexplorer.com/ontologies/courseware#>
prefix saou: <http://data.open.ac.uk/saou/ontology#>
prefix dbp: <http://dbpedia.org/property/>
prefix media: <http://purl.org/media#>
prefix olearn: <http://data.open.ac.uk/openlearn/ontology/>
prefix mlo: <http://purl.org/net/mlo/>
prefix bazaar: <http://digitalbazaar.com/media/>
prefix schema: <http://schema.org/>
SELECT
distinct
(?related as ?identifier)
?type
?label
(str(?location) as ?link)
FROM <http://data.open.ac.uk/context/youtube>
FROM <http://data.open.ac.uk/context/podcast>
FROM <http://data.open.ac.uk/context/openlearn>
FROM <http://data.open.ac.uk/context/course>
FROM <http://data.open.ac.uk/context/qualification>
WHERE
{
?x schema:productID "SYry6PYsL8o" . # change the youtube id to any OU youtube video
?x yt:relatesToCourse ?course .
{
# related video podcasts
?related podcast:relatesToCourse ?course .
?related a podcast:VideoPodcast .
?related rdfs:label ?label .
optional { ?related bazaar:download ?location }
BIND( "VideoPodcast" as ?type ) .
} union {
# related audio podcasts
?related podcast:relatesToCourse ?course .
?related a podcast:AudioPodcast .
?related rdfs:label ?label .
optional { ?related bazaar:download ?location }
BIND( "AudioPodcast" as ?type ) .
} union {
# related openlearn units
?related a olearn:OpenLearnUnit .
?related olearn:relatesToCourse ?course .
BIND( "OpenLearnUnit" as ?type ) .
?related <http://dbpedia.org/property/url> ?location .
?related rdfs:label ?label .
} union {
# related qualifications (compulsory course)
?related a mlo:qualification .
?related saou:hasPathway/saou:hasStage/saou:includesCompulsoryCourse ?course .
BIND( "Qualification" as ?type ) .
?related rdfs:label ?label .
?related mlo:url ?location
}
} limit 200
Content recommendation
• SPARQL1.1 W3 Recommendation
- https://www.w3.org/TR/sparql11-query/
• YASGUI SPARQL editor
– http://yasgui.org/
Useful Links
Success Stories
Uses data.open.ac.uk to get
content recommendations (eg:
courses).
data.open.ac.uk drives the
click through which turns
OpenLearn visitors into OU
students!
Publish once, display
everywhere (from YouTube,
Audioboo, iTunesU, Podcast)
OpenLearn
h"p://www.open.edu/openlearn/
An open and freely
searchable database that
brings together a mass of
data about people’s
experiences of listening to
music of all kinds, in any
historical period and any
culture.
Reuse from LOD
Uses data.open.ac.uk as
publishing platform.
RDF, “natively”
The Listening Experience Database Project
h"p://led.kmi.open.ac.uk/
Feedback	welcome:	@enridaga	#kmiou
https://recogito.pelagios.org/
http://data.cnr.it/data/cnr/individuo/CNR
Semantic Scouting
http://webtemp.src.cnr.it/semanticscouting/
2009
Hybrid methods
• Most of the data is actually metadata, describes
resources, documents, people, and it is essentially
structured
• However, LD can be used to enhance content such as
text or music!
• Two case studies:
• @Albert - MIDI Linked Data Cloud
• FindLEr: find evidence of Listening Experiences
• Hands-On
LD with content
Hands-On
A basic recipe:
1. Text
2. Link to a LD Graph with Named Entity Recognition (NER)
- e.g. dbpedia
3. Explore the graph to find common nodes between
entities
4. Suggest subjects for the text
Case study: find relevant topics
Text
Senior academics and politicians have condemned UK universities for failing to tackle endemic
racism against students and staff after a Guardian investigation found widespread evidence of
discrimination in the sector.
University staff from minority backgrounds said the findings showed there was “absolute
resistance” to dealing with the problem. Responses to freedom of information (FoI) requests the
Guardian sent to 131 universities showed that students and staff made at least 996 formal
complaints of racism over the past five years.
Of these, 367 were upheld, resulting in at least 78 student suspensions or expulsions and 51 staff
suspensions, dismissals and resignations.
But even these official figures are believed to underestimate the scale of racism in higher
education, with two separate investigations by the Guardian and the Equality and Human Rights
Commission identifying hundreds more cases that were not formally investigated by universities.
Scores of black and minority ethnic students and lecturers have told the Guardian they were
dissuaded from making official complaints and either dropped their allegations or settled for an
informal resolution. They said white university staff were often reluctant toaddress racism, with
racial slurs treated as banter or an inevitable byproduct of freedom of speech, and institutional
racism poorly recognised.
https://www.theguardian.com/education/2019/jul/05/uk-universities-condemned-for-
failure-to-tackle-racism
https://www.dbpedia-spotlight.org/demo/
https://www.dbpedia-spotlight.org/demo/
<http://dbpedia.org/resource/United_Kingdom>
<http://dbpedia.org/resource/Endemism>
<http://dbpedia.org/resource/Racism>
<http://dbpedia.org/resource/Australian_Human_Rights_Commission>
<http://dbpedia.org/resource/Institutional_racism>
SPARQL query
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT (count(?node) as ?sc) ?obj WHERE {
?node dct:subject ?cat .
?cat skos:broader{0,5} ?obj .
VALUES (?node){
(<http://dbpedia.org/resource/United_Kingdom>)
(<http://dbpedia.org/resource/Endemism>)
(<http://dbpedia.org/resource/Racism>)
(<http://dbpedia.org/resource/Australian_Human_Rights_Commission>)
(<http://dbpedia.org/resource/Institutional_racism>)
}
}
group by ?obj
order by desc(?sc)
limit 10
Open Issues
• Interoperability between these repositories (how to align their ontologies and entity
names?) is usually partial
• Quality
• owl:sameAs is very rarely “same as”. See http://sameas.org
• Completeness
• Principled Low Commitment (e.g. 404, 406, …)
• How to distinguish entities and documents?
• Method on top of the “Follow your nose” approach still to be developed
• What about incoming links?
• Licences? Policies?
• Availability of open data (limited resources). Some proposals, e.g. Linked Data Fragments
• User interfaces for LD operations - not only visualisation - still missing
Open Issues
Link and Open Your Data
Scholars & Institutions in the humanities are very
good at building high quality databases (e.g. thesauri,
gazetteers) but most of them are still closed!
Some sources of inspiration …
• EUCLID Project: http://euclid-project.eu/
• Randy Connolly’s slides about Web Development: https://
www.slideshare.net/randyconnolly
• Linked Data Patterns book
• http://patterns.dataincubator.org/book/
Credits
Thank you!
@enridaga @aldogangemi

Weitere ähnliche Inhalte

Was ist angesagt?

2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
vafopoulos
 
2011 05-02 linked data intro
2011 05-02 linked data intro2011 05-02 linked data intro
2011 05-02 linked data intro
vafopoulos
 
Creating knowledge out of interlinked data
Creating knowledge out of interlinked dataCreating knowledge out of interlinked data
Creating knowledge out of interlinked data
Sören Auer
 

Was ist angesagt? (20)

Machine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge GraphsMachine Learning with and for Semantic Web Knowledge Graphs
Machine Learning with and for Semantic Web Knowledge Graphs
 
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI SystemsKnowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
Knowledge Matters! The Role of Knowledge Graphs in Modern AI Systems
 
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
Using Knowledge Graphs in Data Science - From Symbolic to Latent Representati...
 
From Wikis to Knowledge Graphs
From Wikis to Knowledge GraphsFrom Wikis to Knowledge Graphs
From Wikis to Knowledge Graphs
 
Knowledge Graphs on the Web
Knowledge Graphs on the WebKnowledge Graphs on the Web
Knowledge Graphs on the Web
 
What the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open DataWhat the Adoption of schema.org Tells about Linked Open Data
What the Adoption of schema.org Tells about Linked Open Data
 
Towards Knowledge Graph Profiling
Towards Knowledge Graph ProfilingTowards Knowledge Graph Profiling
Towards Knowledge Graph Profiling
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
 
2011 05-02 linked data intro
2011 05-02 linked data intro2011 05-02 linked data intro
2011 05-02 linked data intro
 
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and OntologyData-driven Joint Debugging of the DBpedia Mappings and Ontology
Data-driven Joint Debugging of the DBpedia Mappings and Ontology
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
 
SWT Lecture Session 1 - Introduction
SWT Lecture Session 1 - IntroductionSWT Lecture Session 1 - Introduction
SWT Lecture Session 1 - Introduction
 
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
 
The web is rotting and what to do about it
The web is rotting and what to do about itThe web is rotting and what to do about it
The web is rotting and what to do about it
 
Fast Approximate A-box Consistency Checking using Machine Learning
Fast Approximate  A-box Consistency Checking using Machine LearningFast Approximate  A-box Consistency Checking using Machine Learning
Fast Approximate A-box Consistency Checking using Machine Learning
 
Creating knowledge out of interlinked data
Creating knowledge out of interlinked dataCreating knowledge out of interlinked data
Creating knowledge out of interlinked data
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic Web
 
Researcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebResearcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized Web
 
Semantic Web questions we couldn't ask 10 years ago
Semantic Web questions we couldn't ask 10 years agoSemantic Web questions we couldn't ask 10 years ago
Semantic Web questions we couldn't ask 10 years ago
 
Web Data Extraction: A Crash Course
Web Data Extraction: A Crash CourseWeb Data Extraction: A Crash Course
Web Data Extraction: A Crash Course
 

Ähnlich wie Ld4 dh tutorial

Open Data - Principles and Techniques
Open Data - Principles and TechniquesOpen Data - Principles and Techniques
Open Data - Principles and Techniques
Bernhard Haslhofer
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
Anja Jentzsch
 
TPDL2013 tutorial linked data for digital libraries 2013-10-22
TPDL2013 tutorial linked data for digital libraries 2013-10-22TPDL2013 tutorial linked data for digital libraries 2013-10-22
TPDL2013 tutorial linked data for digital libraries 2013-10-22
jodischneider
 

Ähnlich wie Ld4 dh tutorial (20)

Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked data
 
Open data and linked data
Open data and linked dataOpen data and linked data
Open data and linked data
 
Open Data - Principles and Techniques
Open Data - Principles and TechniquesOpen Data - Principles and Techniques
Open Data - Principles and Techniques
 
Madrid Building blocks of Linked Data
Madrid Building blocks of Linked DataMadrid Building blocks of Linked Data
Madrid Building blocks of Linked Data
 
One day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic WebOne day workshop Linked Data and Semantic Web
One day workshop Linked Data and Semantic Web
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RES
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked Data
 
Webofdata
WebofdataWebofdata
Webofdata
 
Publishing and Using Linked Open Data - Day 1
Publishing and Using Linked Open Data - Day 1 Publishing and Using Linked Open Data - Day 1
Publishing and Using Linked Open Data - Day 1
 
Linked Data
Linked DataLinked Data
Linked Data
 
Keynote csws2013
Keynote csws2013Keynote csws2013
Keynote csws2013
 
Linked Data to Improve the OER Experience
Linked Data to Improve the OER ExperienceLinked Data to Improve the OER Experience
Linked Data to Improve the OER Experience
 
Intro to Web Science (Fall 2013)
Intro to Web Science (Fall 2013)Intro to Web Science (Fall 2013)
Intro to Web Science (Fall 2013)
 
Introduction to lod
Introduction to lodIntroduction to lod
Introduction to lod
 
Web of Data as a Solution for Interoperability. Case Studies
Web of Data as a Solution for Interoperability. Case StudiesWeb of Data as a Solution for Interoperability. Case Studies
Web of Data as a Solution for Interoperability. Case Studies
 
TPDL2013 tutorial linked data for digital libraries 2013-10-22
TPDL2013 tutorial linked data for digital libraries 2013-10-22TPDL2013 tutorial linked data for digital libraries 2013-10-22
TPDL2013 tutorial linked data for digital libraries 2013-10-22
 
Linked Open Data
Linked Open DataLinked Open Data
Linked Open Data
 
Semantic Web: introduction & overview
Semantic Web: introduction & overviewSemantic Web: introduction & overview
Semantic Web: introduction & overview
 
Linking Open Data
Linking Open DataLinking Open Data
Linking Open Data
 

Mehr von Enrico Daga

Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Enrico Daga
 

Mehr von Enrico Daga (19)

Citizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data JourneyCitizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data Journey
 
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
 
Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.
 
Knowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything ProjectKnowledge graph construction with a façade - The SPARQL Anything Project
Knowledge graph construction with a façade - The SPARQL Anything Project
 
Capturing the semantics of documentary evidence for humanities research
Capturing the semantics of documentary evidence for humanities researchCapturing the semantics of documentary evidence for humanities research
Capturing the semantics of documentary evidence for humanities research
 
Trying SPARQL Anything with MEI
Trying SPARQL Anything with MEITrying SPARQL Anything with MEI
Trying SPARQL Anything with MEI
 
The SPARQL Anything project
The SPARQL Anything projectThe SPARQL Anything project
The SPARQL Anything project
 
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
 
Linked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchLinked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities research
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid Approach
 
Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...
 
OU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterOU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data Cluster
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tables
 
Propagating Data Policies - A User Study
Propagating Data Policies - A User StudyPropagating Data Policies - A User Study
Propagating Data Policies - A User Study
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so far
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data Flows
 
A bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionA bottom up approach for licences classification and selection
A bottom up approach for licences classification and selection
 
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsA BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data Cubes
 

Kürzlich hochgeladen

Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Chris Hunter
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
SanaAli374401
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
MateoGardella
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
MateoGardella
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 

Kürzlich hochgeladen (20)

Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 

Ld4 dh tutorial

  • 1. Linked Data for the Humanities: methods and techniques Enrico Daga The Open University Aldo Gangemi Università di Bologna Tutorial @ DH2019, Utrecht, 8th July Albert Meroño-Peñuela  Vrije Universiteit Amsterdam Special Guest
  • 2. 14.00 Session I • Linked Data in a nutshell • Producing Linked Data 15.30 (Coffee break) 16.00 Session II • Consuming Linked Data • Hybrid Methods Welcome
  • 3. Linked Data in a nutshell
  • 4. Intro A bit of history
  • 5. Invented the web in 1989 (yeah!) Invented the semantic web in 1994 (duh?)
  • 6. “To a computer, then, the web is a flat, boring world devoid of meaning” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 7. “This is a pity, as in fact documents on the web describe real objects and imaginary concepts, and give particular relationships between them” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 8. “Adding semantics to the web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values.” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 9. “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 10.
  • 11. Linked Data is a way of publishing structured information that allows datasets to be connected and enriched by the means of links among their entities. • LD uses the World Wide Web as publishing platform • Based on W3C standards - open to everyone • Enables your data to refer to other data • … and other data to refer to yours! Linked Data in a nutshell h"ps://en.wikipedia.org/wiki/Linked_data
  • 12. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/” Linked Open Data in 2007
  • 15. Linked Data: The story so far (2009)
  • 23. • A principle: hypertext • A protocol: HTTP • An identification scheme: URNs/URIs • A language: HTML The traditional Web
  • 24. • A principle: hypertext • A protocol: HTTP • An identification scheme: URNs/URIs • A language: HTML RDF The semantic Web
  • 25. • Uniform Resource Identifiers (URIs) • To identify things • HyperText Transfer Protocol (HTTP) • To access data about them • Resource Description Framework (RDF) • a meta-model for data representation. • it does not specify a particular schema • offers a structure for representing schemas and data • SPARQL Protocol and Query Language (SPARQL) • To query LD databases directly on the Web Linked Data Technology Stack
  • 26. • A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. [RFC3986] • Syntax URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] • Example foo://example.com:8042/over/there?name=ferret#nose _/ _________________/_________/ __________/ __/ | | | | | scheme authority path query fragment HTTP URIs
  • 27. • URIs (Unique Resource Identifiers) are used to identify things (also called entities) in the real world • For instance: people, places, events, companies, products, movies, etc. A Web of Things
  • 28. HTTP Simplest thing ever • On top of • The Internet Protocol (IPv4) • Domain Name System (DNS): e.g. dbpedia.org • A Client / Server protocol: Request -> Response • Message structure: Headers + Body (content)
  • 29. Resource Description Framework Rela%onships between things are expressed by the means of a multi-directed, fully labeled graph where nodes could be resources or XMLSchema-typed values; rela%onships are also identified by URIs The RDF model (the “content” of the HTTP body…)
  • 30. RDF is based on an atomic element: the triple. Triple: (subject predicate object) - subject: a URI or a blank node - predicate: MUST be a URI - object: a URI, a blank node, or a literal The RDF Triple
  • 32. • Representation of data values • Serialization as strings • Interpretation based on the datatype • Literals without Datatype are treated as strings • and can be annotated with a language (Alpha-2): @en Literals Leipzig Burkhard Jung 51.3333latitude 12.3833 longitude 1958-03-07 born isMayorOf hasMayor
  • 33. • N-Triples (application/n-triples) • Turtle (text/turtle) • RDF/XML (application/rdf+xml) • N-Quads (application/n-quads) • TriG (application/trig) • […] RDF serializations
  • 35. • Namespaces in XML: https://www.w3.org/TR/xml-names/ • Namespaces end either with # or / • In serialisations, are mapped to prefixes, for brevity • http://prefix.cc to get help with namespaces and common prefixes • http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart • http://dbpedia.org/resource/ • dbr:Wolfgang_Amadeus_Mozart Namespaces
  • 36. Turtle https://www.w3.org/TR/turtle/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix dbr: <http://dbpedia.org/resource/> . @prefix dbo: <http://dbpedia.org/ontology/> . @prefix yago: <http://dbpedia.org/class/yago/> . @prefix wikidata: <http://www.wikidata.org/entity/> . dbr:Wolfgang_Amadeus_Mozart rdfs:label "Wolfgang Amadeus Mozart” ; rdf:type owl:Thing , yago:WikicatGermanClassicalComposers , yago:WikicatGermanComposers , dbo:Person . dbr:Wolfgang_Amadeus_Mozart owl:sameAs wikidata:Q254 ; dbo:deathPlace dbr:Vienna . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . dbr:Wolfgang_Amadeus_Mozart dbo:deathDate "1791-12-5"^^xsd:date ; dbo:birthPlace dbr:Salzburg ; dbo:birthDate "1756-1-27"^^xsd:date .
  • 37. • SPARQL: https://www.w3.org/TR/sparql11-overview/ • OWL: https://www.w3.org/OWL/ • RIF: https://www.w3.org/TR/rif-overview/ • SHACL: https://www.w3.org/TR/shacl/ • SPIN: https://spinrdf.org/ • RDFa: https://www.w3.org/TR/rdfa-syntax/ • JSON-LD: https://json-ld.org/ Many languages
  • 38. • Triple Stores: database management systems that allow to query RDF • RDF1.1 named graphs allow to integrate multiple RDF documents preserving the context of each triple: g s p o • Syntax: N-Quads Named Graphs
  • 39. 1. Use URIs to identify the “things” in your data 2. Use h2p:// URIs so people (and machines) can look them up on the web 3. When a URI is looked up, request/return a descrip%on of the thing in RDF 4. Include links to related things (e.g. owl:sameAs) Linked Data principles Something very basic http://www.w3.org/DesignIssues/LinkedData.html
  • 42. • Understand URI resolution • Grasp Content-Negotiation • Experience graph traversal • https://ld4humanities.github.io/ > Hands-On resources Objectives
  • 43. HTTP Simplest thing ever • On top of • The Internet Protocol (IPv4) • Domain Name System (DNS): e.g. dbpedia.org • A Client / Server protocol: Request -> Response • Message structure: Headers + Body https://www.slideshare.net/randyconnolly/chapter01-presentation-16514220
  • 44. Headers • Vary between Request and Response (two newlines) Body • Any data HTTP Message structure
  • 45. GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: */* HTTP http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart HTTP/1.1 303 See Other Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart Access-Control-Allow-Origin: * … REQUESTRESPONSE
  • 46. cURL is a command line tool and library for transferring data with URLs wURL is a simple web app that allows non-Unix users to use cURL from a Web browser http://purl.org/ld4dh/wurl https://curl.haxx.se/ … let’s try …
  • 47. GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: */* HTTP curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart HTTP/1.1 303 See Other Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart Access-Control-Allow-Origin: * … REQUESTRESPONSE
  • 48. GET /page/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: */* HTTP curl -v http://dbpedia.org/page/Wolfgang_Amadeus_Mozart HTTP/1.1 200 OK Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart […] <html> […]
  • 49. GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: text/turtle HTTP curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart -H “Accept: text/turtle” HTTP/1.1 303 See Other Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl
  • 50. GET /data/Wolfgang_Amadeus_Mozart.ttl HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: text/turtle HTTP curl -v http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl HTTP/1.1 200 OK Content-Type: text/turtle; charset=UTF-8 Content-Length: 50708 […] @prefix dbo: <http://dbpedia.org/ontology/> . @prefix dbr: <http://dbpedia.org/resource/> . dbr:Amadeus_Mozart dbo:wikiPageRedirects dbr:Wolfgang_Ama dbr:The_Story_of_Mozart dbo:wikiPageRedirects dbr:Wolfga dbr:Mozartian dbo:wikiPageRedirects dbr:Wolfgang_Amadeus_
  • 51. curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart curl -v “http://dbpedia.org/page/Wolfgang_Amadeus_Mozart" curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart -H “Accept: text/turtle” curl -v “http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl” curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart -H “Accept: text/turtle” -L
  • 52. • In what formats is Mozart available? • text/html, application/rdf+xml, text/n-triples, text/turtle • Find Mozart's image (it’s a jpg) • When Mozart was born? • Where Mozart died? • How many inhabitants has the city today? How many Mozart? http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
  • 53. • Find the location of the experience • When did it happened? • Who is the listener? • What musical opera was performed? • What is the author of the listened music? • What is the performer? • What is the genre • Find information about this genre • Can you find other operas of the same genre? a Listening Experience http://data.open.ac.uk/led/lexp/1446304716352
  • 54. This type of task is possible using a SPARQL endpoint: http://dbpedia.org/sparql A scent of SPARQL “Find other operas of the same genre” SELECT * WHERE { ?entity <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Grand_operas> . }
  • 55. 1. https://www.theguardian.com/this-page-does-not-exists 2. urn:issn:23346587 3. issn:23346587 4. mailto:enrico.motta@open.ac.uk 5. dbr:Music Quiz Which of the following are not valid RDF IRIs?
  • 57. 1. Knowledge representation 1. Identify the source 2. Understand the content (domain) 3. Modelling: reuse or build an ontology 2. Produce RDF 1. Populate the ontology 2. Encode or (re)engineer in RDF - “triplification” 3. Put it on the Web and provide services to access and query the data 1. Support URI dereferencing (Content negotiation) 2. Expose a SPARQL Endpoint 3. Describe your dataset with Linked Data (ehm …start over) So you want to do Linked Data?
  • 58. • World’s academic communities has been dealing for years with knowledge representa%on • Ar%ficial intelligence, natural language processing, model management, and many other research fields largely contributed • Some ancestors traced the way How to represent knowledge?
  • 59.
  • 60.
  • 61.
  • 62. EXAMPLE • Instances are associated with one or several classes: Boddingtons rdf:type Ale . Grafentrunkrdf:type Bock . Hoegaarden rdf:type White . Jever rdf:type Pilsner .
  • 63. Ontologies different levels of detail & complexity Complexity Types Labels Descriptions Comments Class Hierarchies Relations Documented meaning Basic Logic Rules Inferences Transitivity Domain Range Rules Description Logic Reasoning Class unions Sets semantics Intersections Disjointness […] light-weight heavy-weight
  • 64. Copyright IKS Consortium • A vocabulary for describing properties and classes of RDF resources • rdfs:Resource • rdf:type • rdfs:Class • rdf:Property • rdfs:subClassOf • rdfs:subPropertyOf • rdfs:domain • rdfs:range RDF Schema http://www.w3.org/TR/rdf-schema/
  • 65. • OWL allows to specify other axioms • Property cardinality restric%ons • Classes disjunc%on • Property transi%vity • Cardinality constraints • But beware: more expressivity means more reasoning complexity The Web Ontology Language (OWL) formal language for automated reasoning
  • 66. The Web Ontology Language (OWL) formal language for automated reasoning :Novel rdf:type owl:Class. :Short_Story rdf:type owl:Class. :Poetry rdf:type owl:Class. :Literature rdf:type owl:Class; owl:unionOf (:Novel :Short_Story :Poetry). <myWork> rdf:type :Novel . <myWork> rdf:type :Literature . IF THEN
  • 68. • Schema layer of RDF • Defines terms (classes and properties) • Typically RDFS or OWL family • Reusability is important for supporting interoperability • Common vocabularies: Dublin Core, SKOS, FOAF, SIOC, vCard, DOAP, Core Organization Ontology, VoID Vocabularies light-weight semantics http://www.slideshare.net/prototypo/introduction-to-linked-data-rdf-vocabularies
  • 69. !69 Vocabulary: Friend-of-a-Friend (FOAF) defines classes and properties for representing
 information about people and their
 relationships Soeren rdf:type foaf:Person . Soeren currentProject http://OntoWiki.net . Soeren foaf:homepage http://aksw.org/Soeren . Soeren foaf:knows http://sembase.at/Tassilo . Soeren foaf:sha1 09ac456515dee .
  • 70. !70 Vocabulary: Semantically
 Interlinked Online Communities. Represent content from Blogs, Wikis, Forums, Mailinglists, Chats etc.
  • 71. !71 Vocabulary: Simple Knowledge Organization System (SKOS) support the use of thesauri, classification schemes, subject
 heading systems and taxonomies
  • 72.
  • 73. • DBpedia Ontology Schema: • manually created for DBpedia (infoboxes) • 1140 classes + 1149 object properties + 1741 datatype properties; >7K axioms (1537 on C, 2676 on OP, 3264 on DTP: 1.3, 2.3, 1.8 ratios); • (200M triples in DBpedia) • YAGO: • large hierarchy linking Wikipedia leaf categories to WordNet • 250,000 classes • UMBEL (Upper Mapping and Binding Exchange Layer): • 20000 classes derived from OpenCyc • DOLCE-Zero (Foundational Ontology, aligned to DBpedia): • 76 classes + 105 object properties + 5 datatype properties; 596 axioms (196 on C, 389 on OP, 11 on DTP: 2.4, 3.7, 2.2 ratios) • presence of “restrictions”, top-level disjointness, and patterns • Wikipedia Categories: • Not a class hierarchy (e.g. cycles), represented using SKOS • 415,000+ categories 2011/05/12 General Purpose Ontologies (different levels of detail & complexity)
  • 74. Domain Ontologies (different levels of detail & complexity) https://lov.linkeddata.es/dataset/lov/
  • 75. 1. From a Relational Database 2. From Web content (Scraping) 3. From XML or other structured data formats 4. From a data table (e.g. a CSV file) 5. From natural language (Sic!) How to produce RDF?
  • 76. • W3C R2RML - language to specify mappings between SQL databases and RDF: http://www.w3.org/TR/r2rml/ • D2RQ - allows to access relational databases as virtual graphs: http://d2rq.org/ • DB2Triples - runs a specified R2RML file and generates RDF: https://github.com/ antidot/db2triples 1. From a relational database
  • 78. • RDFa and microformats are used to embed semantic information (expressed using the RDF model) into regular HTML pages • RDFa does it using existing (rel) and additional (about, property, typeof) attributes • Microformats only use usual HTML attributes (class) • To extract, e.g., Apache any23: https://any23.apache.org 2. From Web pages
  • 79. DBpedia is the de-facto Hub of LOD. • descrip%ons of ca. 3.4 million things (1.5 million classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases • labels and abstracts for these 3.2 million things in up to 92 different languages; 1,460,000 links to images and 5,543,000 links to external web pages;
 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories • altogether over 1 billion pieces of informa%on (i.e. RDF triples): 257M from English edition, 766M from other language editions • DBpedia Live (http://live.dbpedia.org/sparql/) &
 Mappings Wiki (http://mappings.dbpedia.org)
 integrate the community into a refinement cycle
  • 80. Extracting structured information from Wikipedia and make this information available on the Web as LOD: • link other data sets on the Web to Wikipedia data (encyclopaedic knowledge) • ask sophisticated queries against Wikipedia (e.g. universities in Paris, mayors of towns in a certain region), • Represents a community consensus Transforming Wikipedia into a Knowledge Base
  • 81. Structure in Wikipedia • Title • Abstract • Infoboxes • Geo-coordinates • Categories • Images • Links – other language versions – other Wikipedia pages – To the Web – Redirects – Disambiguations
  • 82. Infobox templates {{Infobox Korean settlement | title = Busan Metropolitan City | img = Busan.jpg | imgcaption = A view of the [[Geumjeong]] district in Busan | hangul = 부산 광역시 ... | area_km2 = 763.46 | pop = 3635389 | popyear = 2006 | mayor = Hur Nam-sik | divs = 15 wards (Gu), 1 county (Gun) | region = [[Yeongnam]] | dialect = [[Gyeongsang]] }} http://dbpedia.org/resource/Busan dbp:Busan dbpp:title ″Busan Metropolitan City″ dbp:Busan dbpp:hangul ″부산 광역시″@Hang dbp:Busan dbpp:area_km2 ″763.46“^xsd:float dbp:Busan dbpp:pop ″3635389“^xsd:int dbp:Busan dbpp:region dbp:Yeongnam dbp:Busan dbpp:dialect dbp:Gyeongsang ... Wikitext-Syntax RDF representation
  • 83. 2011/05/12 83 DBpedia SPARQL Endpoint PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX dct: <http://purl.org/dc/terms/> PREFIX dbr: <http://dbpedia.org/resource/> PREFIX dbc: <http://dbpedia.org/resource/Category:> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name ?birth ?description ?person WHERE { ?person dbo:birthPlace dbr:Berlin . ?person dct:subject dbc:German_musicians . ?person dbo:birthDate ?birth . ?person foaf:name ?name . ?person rdfs:comment ?description . FILTER (LANG(?description) = 'en') . } ORDER BY ?name
  • 84. 2011/05/12 • hosted on a OpenLink Virtuoso server • can answer SPARQL queries like • Give me all Sitcoms that are set in NYC? • All tennis players from Moscow? • All films by Quentin Tarentino? • All German musicians that were born in Berlin in the 19th century? • All soccer players with tricot number 11, playing for a club having a stadium with over 40,000 seats and is born in a country with over 10 million inhabitants? DBpedia SPARQL Endpoint http://dbpedia.org/sparql
  • 85. • Two steps: • Remodelling task • Reengineering task • Web APIs • JSON: annotate with JSON-LD https://json-ld.org/ • XML • XML != RDF • XML serialisation of DOM (tree), RDF is a graph instead, no root. • eXtensible Stylesheet Language Transformations (XSLT) to generate a RDF format, e.g. N-Triples 3. From Web APIs, XML or other formats
  • 86. • data.open.ac.uk is the home of The Open University LOD • 2010, OU first university in the UK to publish LOD. • Collects and interlinks open data from institutional repositories of the University, and makes it available as LD data.open.ac.uk
  • 87. Open Educational Resources • Metadata about educational resources produced or co-produced by The Open University • OU/BBC Coproductions | OU podcasts | OpenLearn | Videofinder Scientific Production • Metadata about scientific production of The Open University • Open Research Online (http:// oro.open.ac.uk/) Social Media • Content hosted by social media web sites. • Metadata are extracted from public APIs and aggregated into RDF. • Audioboo | YouTube Datasets http://data.open.ac.uk Organisational • Data collected form internal repositories and first made public as linked data. • The OU's Key Information Set from Unistats | OU People Profiles | KMi People Profiles | Open University data XCRI-CAP 1.2 | Qualifications | Courses | OU Planet Stories Data from Research Projects • Linked Data from research projects. • Arts and Humanities Research Council project metadata | The Listening Experience Database | The UK Reading Experience Database | The Reading Experience Database: DBpedia alignments
  • 88. • Two tasks: remodelling & reengineering • Homemade recipe: 1. Find your identifier(s), establish namespaces 2. Map columns to predicates, establish cell value type (URI or Literal) 3. Iterate over the rows 4. Generate a triple for each cell 4. From a data table
  • 89. • A Google Form Spreadsheet • Prepare column names (first row) • Identify the Subject column (S) • Generate a tuple for each column value (S, c, v) - G SQL • Clean: remove tuples with empty values • Format tuples into valid N3 triples Example (only reengineering) https://docs.google.com/spreadsheets/d/ 1j_LHZIOhkbD61r7fSxuf4017tgbOoL_Z6tLT0oDQz_0/edit?usp=sharing
  • 90. 1. Load the data into a Triple Store • Virtuoso Open Source: virtuoso.openlinksw.com • Apache Jena: http://jena.apache.org/ • Blazegraph: www.blazegraph.com • https://en.wikipedia.org/wiki/Comparison_of_triplestores 2. Publish the SPARQL Endpoint 3. Setup content negotiation • http://www.example.com/… 303 to SPARQL DESCRIBE <http://www.example.com/...> How to publish on the Web? (signposting only here)
  • 91. Coffee break See you at 4pm (sharp!)
  • 94. • Understand triple patterns • Try with some features of the language • https://ld4humanities.github.io/ > Hands-On resources Objectives
  • 95. SPARQL SPARQL Protocol And RDF Query Language
  • 96. Triple and Graph Patterns How do we describe the structure of the RDF graph which we're interested in?
  • 97.
  • 98. # An RDF triple in Turtle syntax PREFIX dbr: <http://dbpedia.org/resource/> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> dbr:Wolfgang_Amadeus_Mozart foaf:name ?name .
  • 99. # A SPARQL triple pattern, with a single variable PREFIX dbr: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> dbr:Wolfgang_Amadeus_Mozart foaf:name ?name .
  • 100. # All parts of a triple pattern can be variables ?subject foaf:name ?name.
  • 101.
  • 102. # Matching labels of resources PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> ?subject rdfs:label ?label.
  • 103.
  • 104. # Combine triples patterns to create a graph pattern PREFIX dby: <http://dbpedia.org/class/yago/> ?subject rdfs:label ?label . ?subject rdf:type dby:WikicatOperaComposers . # SPARQL is based on Turtle, which allows abbreviations # e.g. predicate-object lists: ?subject rdfs:label ?label; rdf:type dby:WikicatOperaComposers .
  • 105.
  • 106. # Graph patterns allow us to traverse a graph ?person rdfs:label “Wolfgang Amadeus Mozart”@de . ?person dbo:deathPlace ?place . ?place dbo:populationTotal ?population .
  • 107. #Graph patterns allow us to traverse a graph ?person rdfs:label “Wolfgang Amadeus Mozart”@de . ?person dbo:deathPlace ?place . ?place dbo:populationTotal ?population .
  • 108.
  • 109. Structure of a Query What does a basic SPARQL query look like?
  • 110. # Query. 1 # Associate URIs with prefixes PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> # Example of a SELECT query, retrieving 2 variables # Variables selected MUST be bound in graph pattern SELECT ?person ?label WHERE { #This is our graph pattern ?person rdfs:label “Wolfgang Amadeus Mozart”@de ; dbo:deathPlace ?place . ?place dbo:populationTotal ?population }
  • 111. • https://ld4humanities.github.io/ > Hands-On resources • We will use this UI: http://yasgui.org/ • Credits: Let’s try it out http://about.yasgui.org/ http://laurensrietveld.nl/
  • 112. # Query. 2 PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> # Example of a SELECT query, retrieving all variables SELECT * WHERE { ?person rdfs:label “Wolfgang Amadeus Mozart”@de ; dbo:deathPlace ?place . ?place dbo:populationTotal ?population . }
  • 113. OPTIONAL bindings How do we allow for missing or unknown information?
  • 114. # Query. 3 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX dbo: <http://dbpedia.org/ontology/> SELECT ?name ?image WHERE { #This pattern must be bound ?person rdfs:label "Wolfgang Amadeus Mozart"@de ; dbo:birthPlace ?place . #Anything in this block doesn't have to be bound OPTIONAL { ?place dbo:populationTotal ?population . } }
  • 115. UNION queries How do we allow for alternatives or variations in the graph?
  • 116. # Query. 4 PREFIX dbo: <http://dbpedia.org/ontology/> SELECT ?person ?place WHERE { { ?person dbo:deathPlace ?place . } UNION { ?person dbo:birthPlace ?place . } }
  • 117. Sorting & Restrictions How do we apply a sort order to the results? How can we add restrictions? How can we restrict the number of results returned?
  • 118. # Query. 5 # Select the URI and population of all places PREFIX dbo: <http://dbpedia.org/ontology/> SELECT ?place ?population WHERE { ?place dbo:populationTotal ?population . }
  • 119. # Ex. 6 # Select the URI and population of all places # with highest first PREFIX dbo: <http://dbpedia.org/ontology/> SELECT ?place ?population WHERE { ?place dbo:populationTotal ?population . } # Use an ORDER BY clause to apply a sort. # Can be ASC or DESC ORDER BY DESC(?population)
  • 120. # Ex. 7 # Select the URI and population of a city # with highest first PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX dbp: <http://dbpedia.org/property/> SELECT ?place ?population WHERE { ?place dbo:populationTotal ?population . FILTER EXISTS { ?place dbp:countryCode [] } } # Use an ORDER BY clause to apply a sort. # Can be ASC or DESC ORDER BY DESC(?population)
  • 121. # Ex. 8 # Select the URI and population of the 11-20th most populated countries PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX dbp: <http://dbpedia.org/property/> SELECT ?place ?population WHERE { ?place dbo:populationTotal ?population . FILTER EXISTS { ?place dbp:countryCode [] } } # Use an ORDER BY clause to apply a sort. ORDER BY DESC(?population) # Limit to first ten results LIMIT 10 # Apply an offset to get next “page” OFFSET 10
  • 122. Filtering How do we restrict results based on aspects of the data rather than the graph, e.g. string matching?
  • 123. # In the following triple the literal has assigned a # datatype to indicate it is a date PREFIX dbr: <http://dbpedia.org/resource/> PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> dbr:Wolfgang_Amadeus_Mozart dbo:birthDate "1756-1-27"^^xsd:date
  • 124. # Query. 9 # Select name of persons born between 1st Jan 1756 and 1st Jan 1757 PREFIX dbr: <http://dbpedia.org/resource/> PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> SELECT ?name WHERE { ?person dbo:birthDate ?date; foaf:name ?name. FILTER (?date > "1756-01-01"^^xsd:date && ?date < "1757-01-01"^^xsd:date) }
  • 125. # Query. 10 # Select the URI and population of places with an area below 20km^2, with most populated first PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX dbp: <http://dbpedia.org/property/> PREFIX dbpp: <http://dbpedia.org/ontology/PopulatedPlace/> SELECT ?place ?population WHERE { ?place dbo:populationTotal ?population ; dbpp:areaTotal ?area . # Note that we have to cast the data to the right type # As it is not declared in the data FILTER( xsd:double(?area) < 20 ) } ORDER BY DESC(?population)
  • 126. # Query. 11 # Select persons named Wolfgang PREFIX dbo: <http://dbpedia.org/ontology/> PREFIX dbr: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?subject ?name WHERE { ?subject foaf:name ?name ; dbo:deathPlace dbr:Vienna . FILTER( regex(?name, "Wolfgang", "i" ) ) }
  • 127. • Logical: !, &&, || • Math: +, -, *, / • Comparison: =, !=, >, <, ... • Variable tests: isURI, isBlank, isLiteral, bound • Accessors: str, lang, datatype • Other: sameTerm, langMatches, regex Built-In Filters
  • 128. DISTINCT How do we remove duplicate results?
  • 129. # Query. 12 # Select list of places that gave birth to german classical composers PREFIX space: <http://purl.org/net/schemas/space/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> SELECT DISTINCT ?place WHERE { [] dbo:birthPlace ?place ; dct:subject dbc:German_classical_composers }
  • 130. SPARQL Query Forms Does SPARQL do more than just SELECT data?
  • 131. ASK Test whether the graph contains some data of interest
  • 132. # Query. 13 # Is Mozart’s date of birth 1756-1-27? PREFIX dbr: <http://dbpedia.org/resource/> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> ASK WHERE { dbr:Mozart space:launched "1756-1-27"^^xsd:date . } # ASK returns a boolean value
  • 133. DESCRIBE Generate an RDF description of a resource(s)
  • 134. # Query. 14 # Describe persons born in 1757 PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX dbp: <http://dbpedia.org/property/> DESCRIBE ?person { ?person dbp:birthDate ?date . FILTER ( ?date < "1958-01-01"^^xsd:date && ?date >= "1757-01-01"^^xsd:date ) }
  • 135. CONSTRUCT Create a custom RDF graph based on query criteria Can be used to transform RDF data
  • 136. @prefix exp: <http://www.example.com/property#> . <mailto:albert.meronyo@gmail.com> exp:timestamp "05/06/2019 14:38:16" ; exp:xml_level "Basic" ; exp:rdf_level "Expert" ; exp:sparql_level "Expert" ; exp:web_level "Expert" ; exp:ontology_level "Expert" ; exp:ld_publishing_level "Expert" ; exp:ld_consumption "Expert" ; exp:ld_ui_level "Expert" ; exp:interested_in "Advanced Ontology Design, Linked Data Production, Linked Data Consumption, Success stories, Advanced Linked Data Techniques, Advanced SPARQL" ; exp:known_projects "JazzCats, LinkezBrainz, MIDI Linked Data cloud, LOD Laundromat" ; exp:plans_of_use "Yes; music, history" .
  • 137. # Example PREFIX foaf: <http://xmlns.com/foaf/0.1/> CONSTRUCT { ?person rdf:type foaf:Person } WHERE { ?person ex:timestamp [] }
  • 138. # Example. # Remodelling! Change the identifier PREFIX foaf: <http://xmlns.com/foaf/0.1/> CONSTRUCT { ?person rdf:type foaf:Person ; foaf:mbox ?mbox ?person ?predicate ?any } WHERE { BIND( CONCAT( “http://www.ld4humanities/2019/participant/“, MD5(str(?mbox))) AS ?person) . ?mbox ?predicate ?any }
  • 139. NAMED GRAPHS SPARQL can query multiple RDF graphs together!
  • 140. # Query. 15 # Search in multiple Graphs SELECT distinct ?type FROM <http://data.open.ac.uk/context/youtube> FROM <http://data.open.ac.uk/context/podcast> FROM <http://data.open.ac.uk/context/openlearn> FROM <http://data.open.ac.uk/context/course> FROM <http://data.open.ac.uk/context/qualification> WHERE{ [] a ?type }
  • 141. # Query. 16 # Search in multiple Graphs SELECT distinct ?g ?type FROM NAMED <http://data.open.ac.uk/context/youtube> FROM NAMED <http://data.open.ac.uk/context/podcast> FROM NAMED <http://data.open.ac.uk/context/openlearn> FROM NAMED <http://data.open.ac.uk/context/course> FROM NAMED <http://data.open.ac.uk/context/ qualification> WHERE{ GRAPH ?g { [] a ?type } }
  • 142. Videos from the Open University on YouTube. YouTube videos are linked to courses and qualifications, which in turn are linked to other entities (OpenLearn units, Podcasts, Audios, and other Courses or Qualifications) Find OU content related to a YouTube video from the YouTube video: https://www.youtube.com/watch?v=SYry6PYsL8o http://data.open.ac.uk/youtube/SYry6PYsL8o http://data.open.ac.uk prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> prefix podcast: <http://data.open.ac.uk/podcast/ontology/> prefix yt: <http://data.open.ac.uk/youtube/ontology/> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> prefix rkb: <http://courseware.rkbexplorer.com/ontologies/courseware#> prefix saou: <http://data.open.ac.uk/saou/ontology#> prefix dbp: <http://dbpedia.org/property/> prefix media: <http://purl.org/media#> prefix olearn: <http://data.open.ac.uk/openlearn/ontology/> prefix mlo: <http://purl.org/net/mlo/> prefix bazaar: <http://digitalbazaar.com/media/> prefix schema: <http://schema.org/> SELECT distinct (?related as ?identifier) ?type ?label (str(?location) as ?link) FROM <http://data.open.ac.uk/context/youtube> FROM <http://data.open.ac.uk/context/podcast> FROM <http://data.open.ac.uk/context/openlearn> FROM <http://data.open.ac.uk/context/course> FROM <http://data.open.ac.uk/context/qualification> WHERE { ?x schema:productID "SYry6PYsL8o" . # change the youtube id to any OU youtube video ?x yt:relatesToCourse ?course . { # related video podcasts ?related podcast:relatesToCourse ?course . ?related a podcast:VideoPodcast . ?related rdfs:label ?label . optional { ?related bazaar:download ?location } BIND( "VideoPodcast" as ?type ) . } union { # related audio podcasts ?related podcast:relatesToCourse ?course . ?related a podcast:AudioPodcast . ?related rdfs:label ?label . optional { ?related bazaar:download ?location } BIND( "AudioPodcast" as ?type ) . } union { # related openlearn units ?related a olearn:OpenLearnUnit . ?related olearn:relatesToCourse ?course . BIND( "OpenLearnUnit" as ?type ) . ?related <http://dbpedia.org/property/url> ?location . ?related rdfs:label ?label . } union { # related qualifications (compulsory course) ?related a mlo:qualification . ?related saou:hasPathway/saou:hasStage/saou:includesCompulsoryCourse ?course . BIND( "Qualification" as ?type ) . ?related rdfs:label ?label . ?related mlo:url ?location } } limit 200 Content recommendation
  • 143. • SPARQL1.1 W3 Recommendation - https://www.w3.org/TR/sparql11-query/ • YASGUI SPARQL editor – http://yasgui.org/ Useful Links
  • 145. Uses data.open.ac.uk to get content recommendations (eg: courses). data.open.ac.uk drives the click through which turns OpenLearn visitors into OU students! Publish once, display everywhere (from YouTube, Audioboo, iTunesU, Podcast) OpenLearn h"p://www.open.edu/openlearn/
  • 146. An open and freely searchable database that brings together a mass of data about people’s experiences of listening to music of all kinds, in any historical period and any culture. Reuse from LOD Uses data.open.ac.uk as publishing platform. RDF, “natively” The Listening Experience Database Project h"p://led.kmi.open.ac.uk/ Feedback welcome: @enridaga #kmiou
  • 151. • Most of the data is actually metadata, describes resources, documents, people, and it is essentially structured • However, LD can be used to enhance content such as text or music! • Two case studies: • @Albert - MIDI Linked Data Cloud • FindLEr: find evidence of Listening Experiences • Hands-On LD with content
  • 153. A basic recipe: 1. Text 2. Link to a LD Graph with Named Entity Recognition (NER) - e.g. dbpedia 3. Explore the graph to find common nodes between entities 4. Suggest subjects for the text Case study: find relevant topics
  • 154. Text Senior academics and politicians have condemned UK universities for failing to tackle endemic racism against students and staff after a Guardian investigation found widespread evidence of discrimination in the sector. University staff from minority backgrounds said the findings showed there was “absolute resistance” to dealing with the problem. Responses to freedom of information (FoI) requests the Guardian sent to 131 universities showed that students and staff made at least 996 formal complaints of racism over the past five years. Of these, 367 were upheld, resulting in at least 78 student suspensions or expulsions and 51 staff suspensions, dismissals and resignations. But even these official figures are believed to underestimate the scale of racism in higher education, with two separate investigations by the Guardian and the Equality and Human Rights Commission identifying hundreds more cases that were not formally investigated by universities. Scores of black and minority ethnic students and lecturers have told the Guardian they were dissuaded from making official complaints and either dropped their allegations or settled for an informal resolution. They said white university staff were often reluctant toaddress racism, with racial slurs treated as banter or an inevitable byproduct of freedom of speech, and institutional racism poorly recognised. https://www.theguardian.com/education/2019/jul/05/uk-universities-condemned-for- failure-to-tackle-racism
  • 157. SPARQL query PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX dct: <http://purl.org/dc/terms/> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT (count(?node) as ?sc) ?obj WHERE { ?node dct:subject ?cat . ?cat skos:broader{0,5} ?obj . VALUES (?node){ (<http://dbpedia.org/resource/United_Kingdom>) (<http://dbpedia.org/resource/Endemism>) (<http://dbpedia.org/resource/Racism>) (<http://dbpedia.org/resource/Australian_Human_Rights_Commission>) (<http://dbpedia.org/resource/Institutional_racism>) } } group by ?obj order by desc(?sc) limit 10
  • 158.
  • 160. • Interoperability between these repositories (how to align their ontologies and entity names?) is usually partial • Quality • owl:sameAs is very rarely “same as”. See http://sameas.org • Completeness • Principled Low Commitment (e.g. 404, 406, …) • How to distinguish entities and documents? • Method on top of the “Follow your nose” approach still to be developed • What about incoming links? • Licences? Policies? • Availability of open data (limited resources). Some proposals, e.g. Linked Data Fragments • User interfaces for LD operations - not only visualisation - still missing Open Issues
  • 161. Link and Open Your Data Scholars & Institutions in the humanities are very good at building high quality databases (e.g. thesauri, gazetteers) but most of them are still closed!
  • 162. Some sources of inspiration … • EUCLID Project: http://euclid-project.eu/ • Randy Connolly’s slides about Web Development: https:// www.slideshare.net/randyconnolly • Linked Data Patterns book • http://patterns.dataincubator.org/book/ Credits