Exploration, Visualization and
Querying of
Linked Open Data sources
2nd Keystone Training School - Keyword Search in Big Linked Data
Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS), University of Santiago de
Compostela (USC), Spain.
Laura Po
Department of Engineering «Enzo Ferrari»
University of Modena and Reggio Emilia
Italy
MODENA
Outline
• Introduction to Linked Open Data
• Searching for LOD datasets
• Exploring a dataset
• Visualization tools
• Querying a SPARQL Endpoint
MORNING SESSION
AFTERNOON HANDS-ON SESSION
Searching for LOD datasets
• Portals that collect datasets
• Datahub – a portal that collects datasets
• DataPortals.org- a portal that maintains a list of open data portals in the world
• International or national open data portals
• EU Open Data Portal is the single point of access to a wide range of data held by EU public
administrations at all levels of government, agencies and other bodies, that allows access in all 24 EU
official languages
• European Union Open Data portal – the Open Data portal for the European Commission and other
institutions of the European Union.
• Popular Datasets
• Wikidata - a collaboratively-created linked dataset that acts as central storage for the structured data
of its Wikimedia sister projects
• DBpedia – a dataset containing extracted data from Wikipedia; it contains about 3.4 million concepts
described by 1 billion triples, including abstracts in 11 different languages
• GeoNames provides RDF descriptions of more than 7,500,000 geographical features worldwide.
• FOAF – a vocabulary (and the data published with it) describing persons, their properties and relationships
DataHub
• Datahub collects more
than 10,000 datasets
• It is a data management
platform from the Open
Knowledge Foundation,
based on the CKAN data
management system.
• CKAN is a tool for
managing and publishing
collections of data. It is
used by national and local
governments, research
institutions, and other
organisations which
collect a lot of data.
Search on Datahub
• To find datasets, type any combination of search words (e.g. “health”,
“transport”, etc) in the search box on any page. CKAN displays the
first page of results for your search. You can:
• View more pages of results
• Repeat the search, altering some terms
• Restrict the search to datasets with particular tags, data formats, etc using the
filters in the left-hand column
• If datasets are tagged by geographical area, it is also possible to run
CKAN with an extension which allows searching and filtering of
datasets by selecting an area on a map.
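The search steps above can also be scripted. The following is a minimal sketch, assuming a CKAN site that exposes the standard Action API `package_search` call; the base URL `https://ckan.example.org`, the function name and the chosen tag and format values are illustrative assumptions, not taken from the slides. It only builds the request URL and sends nothing.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL of a CKAN site; replace it with a real portal.
CKAN_BASE = "https://ckan.example.org"


def package_search_url(words, tags=(), res_format=None, page=1, rows=20):
    """Build a URL for CKAN's package_search action (no request is sent)."""
    # Filters go into 'fq' as field:value pairs, like the left-hand filters.
    filters = [f'tags:"{t}"' for t in tags]
    if res_format:
        filters.append(f'res_format:"{res_format}"')
    params = {
        "q": " ".join(words),        # free-text search words
        "rows": rows,                # results per page
        "start": (page - 1) * rows,  # offset of the first result on this page
    }
    if filters:
        params["fq"] = " ".join(filters)
    return f"{CKAN_BASE}/api/3/action/package_search?{urlencode(params)}"


# Second page of results for "health transport", restricted by tag and format.
url = package_search_url(["health", "transport"], tags=["lod"],
                         res_format="RDF", page=2)
print(url)
print(parse_qs(urlsplit(url).query)["q"])
```

Changing `page` corresponds to viewing more pages of results, and changing `tags` or `res_format` corresponds to the filters in the left-hand column.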
Exploring datasets
• When you have found a dataset you are interested in and have selected it, CKAN will
display the dataset page. This includes
• The name, description, and other information about the dataset
• Links to and brief descriptions of each of the resources
• The resource descriptions link to a dedicated page for each resource. This
resource page includes information about the resource, and enables it to be
downloaded.
• Many types of resource can also be previewed directly on the resource page. .CSV
and .XLS spreadsheets are previewed in a grid view, with map and graph views
also available if the data is suitable. The resource page will also preview resources
if they are common image types, PDF, or HTML.
• The dataset page also has two other tabs:
• Activity stream – see the history of recent changes to the dataset
• Related items – see any links to web pages related to this dataset, or add your own links.
Exercise 1 – Datahub
• Find the DBpedia dataset in Datahub
• Look at the possible ways the dataset can be accessed
• Find datasets about Santiago
• Can you find some interesting data sources?
• How much information is given about these data?
• How many formats and access points to the datasets are available?
International or national open data portals
• The European Data Portal harvests the metadata of Public Sector
Information available on public data portals across European
countries. Information regarding the provision of data and the
benefits of re-using data is also included.
• Public sector information is information held by the public sector. The Directive
on the re-use of public sector information provides a common legal framework
for a European market for government-held data.
Improving the accessibility and Value of OGD
• The strategic objective of the European Data Portal is to improve
accessibility and increase the value of Open Government Data:
• Accessibility: How to access this information? Where to find it? How
to make it available in the first place? In domains, across domains,
across countries? In what language?
• Value: For what purpose and what economic gain? Societal gain?
Democratic gain? In what format? What is the critical mass?
• The European Data Portal addresses the whole data value chain: from
data publishing to data re-use.
A checklist for using Open Data
Having access to data is a first step. Data is not an end in itself. Data can be used in different ways
and for different purposes. Data can also be available with different licences, formats and quality.
• Define your purpose: You might specify a topic or a service or an application of interest
• Identify data labels: Filter the data labels and metadata
• Check Openness: Take a look at the licence information. Make sure a licence is available which
allows you to make use of the data in the way that you intend (e.g. that commercial re-use is
allowed if you develop a commercial application).
After you have decided that a specific data set is exactly what you are looking for
• Select the useful file format - you are probably able to choose to download the datasets in
different file formats. Depending on your computer skills, you can choose the file type that is
most appropriate. Most datasets are available in an open file format.
• Check the data quality – check the last date the file was modified, check whether information
about the time period is provided.
• Form
• how has the data been processed?
• is it in raw or summary form?
• how will its form affect your
analysis/product/application?
• what syntactic (language) and semantic (meaning)
transformations will you need to make?
• is this compatible with other datasets you have?
• Quality
• how current is the data?
• how regularly is it updated?
• do you understand all the fields and their context?
• for how long will it be published? what is the
commitment by the publisher?
• what do you know about the accuracy of the data?
• how are missing data handled?
Exercise 2 – EDP
• Choose one category on the EDP and
find datasets describing one specific
topic (for example in the category of
transport the topic could be cycling
routes)
General analysis
• How many datasets are available?
• What are the main datasets?
Local analysis
• How much information about your
country is available?
• How many formats and access points
to the datasets are available?
National open data portals
• Notable examples of Open Data portals maintained by public administrations in Europe
are:
• France
• opendata.paris.fr
• www.data.gouv.fr
• Italy
• www.dati.piemonte.it
• www.dati.gov.it
• Netherlands
• www.data.overheid.nl
• UK
• data.gov.uk
Open data websites in Europe
International
publicdata.eu
data.un.org
data.worldbank.org
EU Member States
data.gov.be
opendata.government.bg
opendata.cz
portal.opendata.dk
govdata.de
opendata.ee
data.gov.ie
data.gov.gr
datos.gob.es
data.gouv.fr
data.gov.hr
dati.gov.it
data.gov.cy
opendata.gov.lt
data.public.lu
data.gov.mt
data.overheid.nl
data.gv.at
danepubliczne.gov.pl
dados.gov.pt
data.gov.ro
nio.gov.si/nio/
data.gov.sk
avoindata.fi
oppnadata.se
data.gov.uk
Exercise 3 – Find national Open Data portals
• Find the government open data portal from your member state
…some suggestions:
• Search in the list of EU member states open data websites
• Search in DataCatalogs
• Search in Google
• How many portals that collect open data are available in your member
state?
• How many datasets are collected in the portals?
• Have you already used some of these data?
Ranking of the national open data datasets
Global Open Data Index – is an annual report to measure the state of open
government data around the world. The goal is to provide a civil society audit of
how governments actually publish data - with input and review from citizens and
organisations around the world.
• Topical experts review datasets from different countries, establish a baseline and track changes and
trends in the open data world over time as the field evolves.
Open Data Barometer - aims to uncover the impact of open data initiatives around
the world. It analyses global trends, and provides comparative data on countries
and regions via an in-depth methodology combining contextual data, technical
assessments and secondary indicators to explore multiple dimensions of open data
readiness, implementation and impact.
• This is the second edition of the Open Data Barometer. The Open Data Barometer forms part of
the World Wide Web Foundation’s work on common assessment methods for open data.
Exercise 4 – Find the ranking
• By using the information on Open Data Barometer and Global Open
Data index, find the ranking for the Open Data Portals/Initiatives of
your member state
Some questions
• Do the National Maps have an open licence?
• Is the Government Budget publicly available?
• How do the Government policies of your country compare to the
average for Europe and Central Asia?
• How high/low is the impact of open data in your country?
Exploring the Web of Data
• Linked Data Browsers - generic Linked Data browsers which allow users to start browsing in one data source
and then navigate along links into related data sources
• OpenLink Data Explorer a Web browser extension, and a server-side component of the OpenLink Ajax
Toolkit.
• Marbles tabular Linked Data browser supporting Fresnel.
• Sigma, Live views on the Web of Data
• Quick & Dirty RDF Browser Simple RDF browser. Useful for checking RDF or RDFa says what you intended.
• Graphity Client Generic Linked Data browser and platform for building declarative SPARQL triplestore-
backed Web applications. Apache license.
• Linked Data mashups
• Revyu by Tom Heath. Uses Linked Data from DBpedia to augment reviews, for instance with information
about a director for a film.
• DBpedia Mobile by Christian Becker and Chris Bizer. Combines Linked Data from DBpedia, the flickr
wrapper, and Revyu.
• Music Mashup by Yves Raimond. Combines Linked Data from various music related data sources.
• Linked Data Search engines - crawl the Web of Data by following links between data sources and provide
expressive query capabilities over aggregated data
Linked Data Browsers
http://marbles.sourceforge.net
Marbles
Linked Data Mashup
http://revyu.com
Revyu.com
Linked Data Mashup
DBpedia Mobile
Pictures from revyu.com
http://wiki.dbpedia.org/DBPediaMobile
Linked Data Mashup
http://sig.ma
SIGMA
Linked Data Search Engines
http://data.nytimes.com/schools/schools.html
NYTimes
Some Application Scenarios
BBC Music
Some Application Scenarios
LinkedGeoData.org
LinkedGeoData adds a spatial dimension to the Web of Data /
Semantic Web.
LinkedGeoData uses the information collected by the
OpenStreetMap project and makes it available as an
RDF knowledge base according to the Linked Data principles.
It interlinks this data with other knowledge bases in the Linking
Open Data initiative.
Exercise 5 – OpenLink exploration - Facebook to Linked
Data Transformation Examples
• Install the OpenLink Data Explorer (ODE) extension for your browser (currently available for Firefox, Safari,
Chrome, Opera, and Internet Explorer)
• This extension will allow you to explore the raw data and entity relationships that underlie the Web resources it
processes.
• Select your Facebook Profile Page (or another person's Facebook Profile page)
• Right-Click (or Ctrl-Click on Mac) on the page and then click on "View Page Description" to obtain a
description of the resources available on the linked page
• A description of the resource metadata available on the page is displayed.
• More examples: if you want to explore further, look at the example page
Visualization of Linked Data
• Why is it important?
• Currently, the consumption of LOD is restricted to the Semantic Web
community
• Visual tools that provide a coherent and legible picture of the data
also allow non-technical audiences
• to obtain a good understanding of the data structure,
• to compose queries,
• to identify links between resources,
• and to intuitively discover new pieces of information
What is visualization
• The visualization of information
• Goals:
• Effective communication of information
• Clarity
• Integrity (all the information)
• Stimulate viewer engagement
• Focus on effectiveness
Why is visualization important?
• With large datasets we need an efficient way to understand a vast
amount of data
• The human visual system is the highest-bandwidth channel to the
human brain
Why visualize data instead of providing statistical
analysis?
http://en.wikipedia.org/wiki/Anscombe's_quartet
• Anscombe's quartet: four datasets having similar statistical
properties but appearing very different when plotted
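The claim about Anscombe's quartet can be checked numerically. The sketch below uses the published quartet values and computes, for each of the four datasets, the mean of x, the mean of y and the Pearson correlation; the helper `pearson` and the variable names are illustrative choices.

```python
from math import sqrt
from statistics import mean

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by datasets I-III
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# (mean of x, mean of y, correlation) for every dataset of the quartet.
SUMMARY = {name: (mean(xs), mean(ys), pearson(xs, ys))
           for name, (xs, ys) in QUARTET.items()}

for name, (mx, my, r) in SUMMARY.items():
    print(f"{name:>3}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The four printed rows are almost identical, while the four scatter plots are not: the summary statistics alone hide the shape of the data.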
Example of the Linked Data visualization process
Heatmap visualization of Beatles releases
Visualization, exploration and query tools
LOD live
LodLive project provides a demonstration
of the use of Linked Data standards (RDF,
SPARQL) to browse RDF resources. The
application aims to spread linked data
principles using a simple and friendly
interface with reusable techniques.
http://en.lodlive.it/
http://en.lodlive.it/?http://dbpedia.org/resource/Jules_Verne
Exercise 6 - LODLive
Use LodLive online (http://en.lodlive.it/) to explore DBpedia resources. Search for Serena
Williams:
- who is she?
- where does she live?
- where is she currently listed as a champion (find and explore the
"currentChampion" relation)?
- find the statistics and records associated with her, navigate to the Wikipedia
page, and find Serena's overall win rate in singles
Visualbox
Visualbox allows you to create visualizations based on Linked Open Data. The
goal of Visualbox is to facilitate the creation of visualization without the need
to learn Javascript libraries. You do need to know a bit of SPARQL and some
notions of HTML though.
Visualbox is a simplified version of LODSPeaKr, a framework to create Linked
Data-based applications.
http://orion.tw.rpi.edu/~agraves/mozfest/index.html
Visualbox – some examples
http://orion.tw.rpi.edu/~agraves/mozfest/action
http://orion.tw.rpi.edu/~agraves/mozfest/firesock_test
LODeX
It is a tool for producing a representative summary of a Linked Open Data (LOD)
source starting from scratch, thus supporting users in exploring and
understanding the contents of a dataset.
LODeX extracts statistical indexes, which it uses to build the representative summary,
by querying the SPARQL endpoint of a LOD source.
Two online versions:
• LODeX 2.0 (http://www.dbgroup.unimo.it/lodex2 ) includes the possibility to
compose visual queries by selecting objects from the representative summary
of a LOD source
• LODeX Cluster (http://www.dbgroup.unimo.it/lodex2/testCluster ) provides a
more concise schema for huge datasets
LODeX Architecture
Two main modules
• Extraction & Summarization
– Index Extraction (IE)
– Post Processing (PP)
• Visualization & Querying
– Schema Summary Visualization
– Query Orchestrator
[Figure: LODeX architecture, with the components LOD Cloud, Endpoint URLs, Indexes Extraction, SPARQL Queries, Statistical Indexes, Post-processing, Schema Summary, NoSQL, Schema Summary Visualization, Query Orchestrator, Basic Query and Results]
The Schema Summary
The Schema Summary is a pseudograph composed of:
C - Classes (nodes)
P - Properties (edges)
And additional elements and functions:
A - Attributes associated with each class
Each attribute represents the existence of a Datatype property on the
instances of the class
σ_l - labels
l - labeling function
count - count function
The Schema Summary is inferred from the distribution of the
instances of a dataset
A running example
[Figure: RDF graph of the running example.
Intensional knowledge: ex:Sector and foaf:Organization typed as owl:Class; the property ex:sector, typed as rdf:Property and owl:ObjectProperty, with rdfs:label "sector" and an rdfs:domain.
Extensional knowledge: sector1 (an ex:Sector with dc:title "Energy"); organization1 and organization2 (instances of foaf:Organization, linked to sector1 by ex:sector); person1 (a foaf:Person with foaf:firstName "Paolo" and foaf:lastName "Rossi", reached through ex:ceo); the literals "Village electrification in the Pacific" (ex:activity) and "+41331231" (dbpedia:fax).]
The information contained in the Intensional knowledge can be incomplete
or absent
Indexes needed to generate a Schema Summary
These indexes belong to the extensional group of the Statistical Indexes [2]:
SC (Subject Class) contains the pairs (p,c) where p is an object property and c
is its domain class.
SCl (Subject Class to literal) contains the pairs (p,c) where p is a datatype
property and c is its domain class.
OC (Object Class) contains the pairs (p,c) where p is an object property and c is
its range class.
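The three index definitions above can be illustrated with a small in-memory sketch. It is not the LODeX implementation (which queries a SPARQL endpoint); the triples are one possible reading of the running example, and the choices of which organization carries which literal, of counting distinct instances, and of storing entries as (class, property) keys are assumptions made for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Extensional triples, as one possible reading of the running example
# (assumption: the figure does not survive, so the assignment of the
# literals to organization1/organization2 is a guess).
TRIPLES = [
    ("sector1", RDF_TYPE, "ex:Sector"),
    ("organization1", RDF_TYPE, "foaf:Organization"),
    ("organization2", RDF_TYPE, "foaf:Organization"),
    ("person1", RDF_TYPE, "foaf:Person"),
    ("sector1", "dc:title", '"Energy"'),
    ("organization1", "ex:sector", "sector1"),
    ("organization2", "ex:sector", "sector1"),
    ("organization1", "ex:activity", '"Village electrification in the Pacific"'),
    ("organization2", "dbpedia:fax", '"+41331231"'),
    ("organization2", "ex:ceo", "person1"),
    ("person1", "foaf:firstName", '"Paolo"'),
    ("person1", "foaf:lastName", '"Rossi"'),
]


def is_literal(term):
    """Literals are written with surrounding double quotes in this sketch."""
    return term.startswith('"')


def count_members(index):
    """Turn {(class, property): set of instances} into {(class, property): n}."""
    return {key: len(members) for key, members in index.items()}


def extract_indexes(triples):
    """Return the SC, SCl and OC indexes as {(class, property): count}."""
    classes_of = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes_of[s].add(o)
    sc, scl, oc = defaultdict(set), defaultdict(set), defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for c in classes_of[s]:
            # Datatype properties go to SCl, object properties to SC.
            (scl if is_literal(o) else sc)[(c, p)].add(s)
        if not is_literal(o):
            for c in classes_of[o]:
                oc[(c, p)].add(o)  # class of the object = range class
    return count_members(sc), count_members(scl), count_members(oc)


SC, SCL, OC = extract_indexes(TRIPLES)
print("SC ", SC)
print("SCl", SCL)
print("OC ", OC)
```

Counting distinct instances rather than triples is what makes ex:sector appear twice on the subject side (two organizations) but only once on the object side (one sector).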
[Figure: extensional classes and extensional knowledge of the running example]
Schema Summary generation
We use an algorithm that combines these indexes to produce a Schema
Summary
Name   Values (class, property, count)
SC     (foaf:Organization, ex:ceo, 1), (foaf:Organization, ex:sector, 2)
SCl    (foaf:Person, foaf:firstName, 1), (foaf:Person, foaf:lastName, 1), (ex:Sector, dc:title, 1), (foaf:Organization, ex:activity, 1), (foaf:Organization, dbpedia:fax, 1)
OC     (ex:Sector, ex:sector, 1), (foaf:Person, ex:ceo, 1)
[Figure: resulting Schema Summary.
Classes: foaf:Organization (2 instances), ex:Sector (1), foaf:Person (1).
Properties: ex:sector (2) and ex:ceo (1).
Attributes: dc:title (1), foaf:firstName (1), foaf:lastName (1), ex:activity (1), dbpedia:fax (1).]
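How the indexes might be combined can be sketched as follows. This is a simplified illustration, not the algorithm used by LODeX: it pairs an SC entry with an OC entry whenever they share the same property, which is an assumption that breaks down when one property connects several class pairs. The index values are those of the running example, with the range class of ex:ceo taken to be foaf:Person.

```python
# Index values of the running example, keyed by (class, property).
SC = {("foaf:Organization", "ex:ceo"): 1, ("foaf:Organization", "ex:sector"): 2}
SCL = {
    ("foaf:Person", "foaf:firstName"): 1,
    ("foaf:Person", "foaf:lastName"): 1,
    ("ex:Sector", "dc:title"): 1,
    ("foaf:Organization", "ex:activity"): 1,
    ("foaf:Organization", "dbpedia:fax"): 1,
}
OC = {("ex:Sector", "ex:sector"): 1, ("foaf:Person", "ex:ceo"): 1}


def build_schema_summary(sc, scl, oc):
    """Combine the three indexes into classes (nodes), edges and attributes."""
    classes = {c for c, _ in list(sc) + list(scl) + list(oc)}
    # An edge joins the domain class (from SC) to the range class (from OC)
    # of the same property; the count is taken from the subject side.
    edges = sorted(
        (src, prop, dst, n)
        for (src, prop), n in sc.items()
        for (dst, other_prop) in oc
        if prop == other_prop
    )
    # Datatype properties become attributes of their domain class.
    attributes = {c: [] for c in classes}
    for (c, prop), n in scl.items():
        attributes[c].append((prop, n))
    return {
        "classes": sorted(classes),
        "edges": edges,
        "attributes": {c: sorted(a) for c, a in attributes.items()},
    }


SUMMARY = build_schema_summary(SC, SCL, OC)
for src, prop, dst, n in SUMMARY["edges"]:
    print(f"{src} --{prop} ({n})--> {dst}")
for cls in SUMMARY["classes"]:
    print(cls, SUMMARY["attributes"][cls])
```

The number of instances per class shown on the nodes of the summary is not derivable from these three indexes alone, so it is left out of the sketch.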
Visualization & Querying
Schema Summary Visualization
The front end of the web application is composed of three panels:
• List of datasets indexed in LODeX
• Schema Summary and query building panel
• Refinement panel
Query Orchestrator
It manages the interaction between the user and the GUI
It contains a SPARQL compiler able to compile the visual query into a
SPARQL one
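To give an idea of what compiling a visual query into SPARQL involves, here is a minimal sketch. It does not reproduce the LODeX compiler: the input structure (`nodes` selected in the Schema Summary plus `links` between them), the function name and the `http://example.org/...` IRIs are all hypothetical.

```python
def compile_visual_query(nodes, links):
    """Translate selected classes, attributes and links into a SELECT query.

    nodes: {variable: {"class": iri, "attributes": [iri, ...]}}
    links: [(subject_variable, property_iri, object_variable), ...]
    """
    select_vars, patterns = [], []
    for var, node in nodes.items():
        select_vars.append(f"?{var}")
        patterns.append(f"?{var} a <{node['class']}> .")
        # Each selected attribute gets its own output variable.
        for i, attr in enumerate(node.get("attributes", [])):
            attr_var = f"?{var}_attr{i}"
            select_vars.append(attr_var)
            patterns.append(f"?{var} <{attr}> {attr_var} .")
    for subj, prop, obj in links:
        patterns.append(f"?{subj} <{prop}> ?{obj} .")
    body = "\n  ".join(patterns)
    return f"SELECT {' '.join(select_vars)}\nWHERE {{\n  {body}\n}}"


# Hypothetical visual query: a Dataset, its title, and its creator.
NODES = {
    "dataset": {"class": "http://example.org/Dataset",
                "attributes": ["http://example.org/title"]},
    "creator": {"class": "http://example.org/Agent"},
}
LINKS = [("dataset", "http://example.org/creator", "creator")]

QUERY = compile_visual_query(NODES, LINKS)
print(QUERY)
```

Every node of the visual query becomes a typed variable, every attribute a datatype-property pattern, and every link an object-property pattern between two variables.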
Schema Summary – Building a Visual Query
Refinement Panel
Exercise 7 - LODeX
Using LODeX (http://www.dbgroup.unimore.it/lodex2/), find the dataset
about World War 1
• What is the name of the dataset?
• How many classes does it have? How many properties does it have?
Visualize and explore the LODeX schema summary of this dataset
• How many instances does the class Water have?
• What are the incoming properties of the class Municipality?
Define a visual query that selects a Dataset and its creator.
• What is the SPARQL query?
Exercise 8 – Linked Clean Energy Data
• Search the Linked Clean Energy Data and navigate its schema summary
• (http://www.dbgroup.unimore.it/lodex2/ok#!/schemaSummary/157)
• Create a visual query that selects a Document and its associated
Project Output.
• For the Project Output, show the title and reference number.
• Run the query and look at the results and the SPARQL query.
• Try to perform the same query at the SPARQL endpoint you can find in DataHub
for the Linked Clean Energy Data
• (http://sparql.reeep.org/)
Querying LOD datasets
• SPARQL query
• On a SPARQL endpoint
• On a dump dataset
• Visual tools
Introduction to SPARQL
• SPARQL Query
• Declarative query language for RDF data
• http://www.w3.org/TR/rdf-sparql-query/
• SPARQL Algebra
• Formal definition of how SPARQL queries are evaluated
• http://www.w3.org/2001/sw/DataAccess/rq23/rq24-algebra.html
• SPARQL Update
• Declarative manipulation language for RDF data
• http://www.w3.org/TR/sparql11-update/
• SPARQL Protocol
• Standard for communication between SPARQL services and clients
• http://www.w3.org/TR/sparql11-protocol/
SPARQL Basics
• RDF triple: Basic building block, of the form subject, predicate,
object. Example:
dbpedia:The_Beatles foaf:name "The Beatles" .
• RDF triple pattern: Contains one or more variables. Examples:
dbpedia:The_Beatles foaf:made ?album.
?album mo:track ?track .
?album ?p ?o .
• RDF quad pattern: Contains graph name: URI or variable.
Examples:
GRAPH <:g> {:s :p :o .}
GRAPH ?g {dbpedia:The_Beatles foaf:name ?o.}
SPARQL Basics
• RDF graph: Set of RDF assertions, manipulated as
a labeled directed graph.
• RDF dataset: collection of RDF graphs. It comprises:
• One default graph
• Zero or more named graphs
• SPARQL protocol client: HTTP client that sends requests for SPARQL
Protocol operations (queries or updates)
• SPARQL protocol service: HTTP server that services requests for
SPARQL Protocol operations
• SPARQL endpoint: The URI at which a SPARQL Protocol service listens
for requests from SPARQL clients
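As an illustration of the client side, the sketch below prepares a SPARQL Protocol query request with the Python standard library. It assumes the common "query via GET" form with a `query` parameter and a JSON results media type in the `Accept` header; the endpoint URL is the British Museum one used later in the exercises, and the query text is a generic placeholder. The request is only built, not sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

ENDPOINT = "http://collection.britishmuseum.org/sparql"
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"  # placeholder query


def build_query_request(endpoint, query):
    """Prepare (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask the service for SPARQL results in JSON.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


REQUEST = build_query_request(ENDPOINT, QUERY)
print(REQUEST.get_method(), REQUEST.full_url)
# Sending it would be: urllib.request.urlopen(REQUEST), which is left out here
# because it needs network access and a live endpoint.
```

In these terms the script is the SPARQL protocol client, the server answering at `ENDPOINT` is the SPARQL protocol service, and `ENDPOINT` itself is the SPARQL endpoint.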
Querying Linked Data with
SPARQL
SPARQL Query
Main idea: Pattern matching
• Queries describe sub-graphs of the queried graph
• Graph patterns are RDF graphs specified in Turtle syntax, which contain
variables (prefixed by either “?” or “$”)
• Sub-graphs that match the graph patterns yield a result
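The pattern-matching idea can be made concrete with a toy evaluator over an in-memory list of triples. It is a sketch of the principle only, not how a SPARQL engine is implemented; the `record:1` to `record:3` identifiers are hypothetical stand-ins for the MusicBrainz record URIs.

```python
DATA = [
    ("dbpedia:The_Beatles", "foaf:made", "record:1"),
    ("dbpedia:The_Beatles", "foaf:made", "record:2"),
    ("dbpedia:The_Beatles", "foaf:made", "record:3"),
    ("record:1", "dc:title", "Help!"),
    ("record:2", "dc:title", "Abbey Road"),
    ("record:3", "dc:title", "Let It Be"),
]


def is_var(term):
    """Variables are prefixed with '?', as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend 'binding' so that the pattern equals the triple, else None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def evaluate(patterns, data):
    """Return every variable binding that satisfies all triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in data:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


SOLUTIONS = evaluate(
    [("dbpedia:The_Beatles", "foaf:made", "?album"),
     ("?album", "dc:title", "?title")],
    DATA,
)
for row in SOLUTIONS:
    print(row["?album"], row["?title"])
```

Each solution is one sub-graph of the data that matches the whole graph pattern, reported as a row of variable bindings.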
[Figure: graph pattern dbpedia:The_Beatles foaf:made ?album]
SPARQL Query
Data:
dbpedia:The_Beatles is linked by foaf:made to three <http://musicbrainz.org/record/...> resources, whose dc:title values are "Help!", "Abbey Road" and "Let It Be".
Graph pattern:
dbpedia:The_Beatles foaf:made ?album .
Results:
?album
<http://musicbrainz.org...>
<http://musicbrainz.org...>
<http://musicbrainz.org...>
SPARQL Query
Data:
dbpedia:The_Beatles is linked by foaf:made to three <http://musicbrainz.org/record/...> resources, whose dc:title values are "Help!", "Abbey Road" and "Let It Be".
Graph pattern:
dbpedia:The_Beatles foaf:made ?album .
?album dc:title ?title .
Results:
?album ?title
<http://...> "Help!"
<http://...> "Abbey Road"
<http://...> "Let It Be"
SPARQL Query
Data:
dbpedia:The_Beatles is linked by foaf:made to a <http://musicbrainz.org/record/...> resource (a mo:Record) and to a <http://musicbrainz.org/track/...> resource (a mo:Track), both with dc:title "Help!"; the record is linked to the track by mo:track.
Graph pattern:
dbpedia:The_Beatles foaf:made ?album .
?album a mo:Record .
Results:
?album
<http://musicbrainz.org...>
SPARQL Query: Components
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT ?album
FROM <http://musicbrainz.org/20130302>
WHERE {
dbpedia:The_Beatles foaf:made ?album .
?album a mo:Record ; dc:title ?title
}
ORDER BY ?title
Prologue:
• Prefix definitions
• Subtly different from Turtle syntax - the final period is not used
SPARQL Query: Components
Query form:
• ASK, SELECT, DESCRIBE or CONSTRUCT
• SELECT retrieves variables and their bindings as a table
SPARQL Query: Components
Data set specification:
• This clause is optional
• FROM or FROM NAMED
• Indicates the sources for the data against which to find matches
SPARQL Query: Components
Query pattern:
• Defines patterns to match against the data
• Generalises Turtle with variables and keywords – N.B. final period optional
Solution modifier:
• Modify the result set
• ORDER BY, LIMIT or OFFSET re-organise rows;
• GROUP BY combines them
Query Forms
SPARQL supports different query forms:
• ASK tests whether or not a query pattern has a
solution. Returns true/false
• SELECT returns variables and their bindings
directly
• CONSTRUCT returns a single RDF graph specified
by a graph template
• DESCRIBE returns a single RDF graph containing
RDF data about the resources found
Query Form: ASK
• Namespaces are added with the ‘PREFIX’ directive
• Statement patterns that make up the graph are
specified between braces (“{}”)
Query: Is Paul McCartney a member of ‘The Beatles’?
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-ont: <http://dbpedia.org/ontology/>
PREFIX mo: <http://purl.org/ontology/mo/>
ASK WHERE { dbpedia:The_Beatles mo:member
dbpedia:Paul_McCartney . }
Results: true
Query: Is Elvis Presley a member of ‘The Beatles’?
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-ont: <http://dbpedia.org/ontology/>
PREFIX mo: <http://purl.org/ontology/mo/>
ASK WHERE { dbpedia:The_Beatles mo:member
dbpedia:Elvis_Presley . }
Results: false
Query Form: SELECT
• The solution modifier projection nominates which
components of the matches should be returned
• “*” means all components should be returned
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT ?album_name ?track_title
WHERE {
dbpedia:The_Beatles foaf:made ?album .
?album dc:title ?album_name ;
mo:track ?track .
?track dc:title ?track_title .}
Query: What albums and tracks did ‘The Beatles’ make?
Query Form: SELECT (2)
Filter expressions
• Different types of filters and functions may be used
• Filter: Comparison and logical operators
Query: Retrieve the albums and tracks recorded by ‘The Beatles’, where the
duration of the song is more than 300 secs. and less than 400 secs.
(durations are expressed in milliseconds)
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT ?album_name ?track_title ?duration
WHERE {
dbpedia:The_Beatles foaf:made ?album .
?album dc:title ?album_name ;
mo:track ?track .
?track dc:title ?track_title ;
mo:duration ?duration .
FILTER (?duration > 300000 && ?duration < 400000) }
Aggregates
• Calculate aggregate values: COUNT, SUM, MIN, MAX, AVG,
GROUP_CONCAT and SAMPLE
• Built around the GROUP BY operator
• Prune at group level (cf. FILTER) using HAVING
Query Form: SELECT (3)
Query: Retrieve the total duration of each album recorded by ‘The Beatles’,
keeping only the albums that last more than one hour (3,600,000 ms).
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT ?album (SUM(?track_duration) AS ?album_duration)
WHERE {
dbpedia:The_Beatles foaf:made ?album .
?album mo:track ?track .
?track mo:duration ?track_duration .
} GROUP BY ?album
HAVING (SUM(?track_duration) > 3600000)
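What GROUP BY, SUM and HAVING do to the matched rows can be mimicked on a small in-memory table. The rows below are hypothetical (album, track, duration in milliseconds) bindings, not real Beatles data, and the function name is an illustrative choice.

```python
from collections import defaultdict

# Hypothetical bindings of (?album, ?track, ?track_duration in ms).
ROWS = [
    ("album:A", "track:A1", 2_000_000),
    ("album:A", "track:A2", 1_900_000),
    ("album:B", "track:B1", 1_500_000),
    ("album:B", "track:B2", 1_000_000),
]


def album_durations(rows, min_total):
    """Sum track durations per album, keeping totals above min_total."""
    totals = defaultdict(int)
    for album, _track, duration in rows:
        totals[album] += duration  # GROUP BY ?album with SUM(?track_duration)
    # HAVING prunes whole groups, whereas FILTER would prune single rows.
    return {album: total for album, total in totals.items() if total > min_total}


RESULT = album_durations(ROWS, 3_600_000)
print(RESULT)
```

Only `album:A` survives, because its tracks add up to more than one hour while those of `album:B` do not.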
Exercise 9 – British Museum
• Find the British Museum Collection dataset.
• Find the related SPARQL endpoint
• Look at information about "The Rosetta Stone"
• http://collection.britishmuseum.org/sparql
Linked Data Publishing Platforms/Frameworks
• D2R Server: a tool for publishing relational databases as Linked Data
• Talis Platform: the Talis Platform provides Linked Data-compliant hosting
for content and RDF data
• Pubby: a Linked Data frontend for SPARQL Endpoints
• Paget: a framework for building Linked Data applications
• Linked Media Framework: a Linked Data server with updates and
semantic search
• PublishMyData: A Linked Data Publishing Platform run by Swirrl. RDF
data-hosting, Linked Data API, SPARQL endpoint and customisable
visualisations.

Exploring Linked Open Data

  • 1. Exploration, Visualization and Querying of Linked Open Data sources 2nd Keystone Training School - Keyword Search in Big Linked Data Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS), University of Santiago de Compostela (USC), Spain. Laura Po Department of Engineering «Enzo Ferrari» University of Modena and Reggio Emilia Italy
  • 3. Outline • Introduction to Linked Open Data • Searching for LOD datasets • Exploring a dataset • Visualization tools • Querying a SPARQL Endpoint MORNING SESSION AFTERNOON HANDS-ON SESSION
  • 4.
  • 5. Searching for LOD datasets • Portals that collects datasets • Datahub – a portal that collect datasets • DataPortals.org- a portal that maintains a list of open data portals in the world • International or national open data portals • EU Open Data Portal is the single point of access to a wide range of data held by EU public administrations at all levels of government, agencies and other bodies, that allows access in all 24 EU official languages • European Union Open Data portal – the Open Data portal for the European Commission and other institutions of the European Union. • Popular Datasets • Wikidata - a collaboratively-created linked dataset that acts as central storage for the structured data of its Wikimedia sister projects • DBpedia – a dataset containing extracted data from Wikipedia; it contains about 3.4 million concepts described by 1 billion triples, including abstracts in 11 different languages • GeoNames provides RDF descriptions of more than 7,500,000 geographical features worldwide. • FOAF – a dataset describing persons, their properties and relationships
  • 6. DataHub • Datahub collects more than 10,000 datasets • It is a data management platform from the Open Knowledge Foundation, based on the CKAN data management system. • CKAN is a tool for managing and publishing collections of data. It is used by national and local governments, research institutions, and other organisations which collect a lot of data.
  • 7. Search on Datahub • To find datasets, type any combination of search words (e.g. “health”, “transport”, etc) in the search box on any page. CKAN displays the first page of results for your search. You can: • View more pages of results • Repeat the search, altering some terms • Restrict the search to datasets with particular tags, data formats, etc using the filters in the left-hand column • If datasets are tagged by geographical area, it is also possible to run CKAN with an extension which allows searching and filtering of datasets by selecting an area on a map.
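The same search can also be issued programmatically. The sketch below builds a URL for CKAN's Action API `package_search` call using its `q`, `fq`, `rows` and `start` parameters. The base URL `https://ckan.example.org` is a placeholder, and the filter field names (`tags`, `res_format`) and the `+field:"value"` filter syntax are assumptions about how a given portal is configured, so treat this as an illustration rather than a portal-specific recipe.

```python
from urllib.parse import urlencode


def build_package_search_url(base_url, terms, tags=(), res_format=None,
                             rows=10, start=0):
    """Return a CKAN `package_search` URL for the given search words and filters."""
    params = {"q": " ".join(terms), "rows": rows, "start": start}
    # Filters play the role of the left-hand column in the web interface.
    filters = [f'+tags:"{tag}"' for tag in tags]
    if res_format:
        filters.append(f'+res_format:"{res_format}"')
    if filters:
        params["fq"] = " ".join(filters)
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Second page of results (10 per page) for "health transport", CSV resources only.
# The host name is hypothetical.
url = build_package_search_url(
    "https://ckan.example.org/", ["health", "transport"],
    tags=["europe"], res_format="CSV", start=10,
)
print(url)
```

Changing `start` corresponds to viewing more pages of results; changing `terms` or the filters corresponds to repeating the search with altered terms.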
  • 8. Exploring datasets • When you have found a dataset you are interested in and selected it, CKAN will display the dataset page. This includes • The name, description, and other information about the dataset • Links to and brief descriptions of each of the resources • The resource descriptions link to a dedicated page for each resource. This resource page includes information about the resource, and enables it to be downloaded. • Many types of resource can also be previewed directly on the resource page. .CSV and .XLS spreadsheets are previewed in a grid view, with map and graph views also available if the data is suitable. The resource page will also preview resources if they are common image types, PDF, or HTML. • The dataset page also has two other tabs: • Activity stream – see the history of recent changes to the dataset • Related items – see any links to web pages related to this dataset, or add your own links.
  • 9. Exercise 1 – Datahub • Find the DBpedia dataset in Datahub • Look at the possible ways the dataset can be accessed • Find datasets about Santiago • Can you find any interesting data sources? • How much information is given about these data? • How many formats and access points to the datasets are available?
  • 10. International or national open data portals • The European Data Portal harvests the metadata of Public Sector Information available on public data portals across European countries. Information regarding the provision of data and the benefits of re-using data is also included. • Public sector information is information held by the public sector. The Directive on the re-use of public sector information provides a common legal framework for a European market for government-held data.
  • 11. Improving the Accessibility and Value of OGD • The strategic objective of the European Data Portal is to improve the accessibility and increase the value of Open Government Data (OGD): • Accessibility: How to access this information? Where to find it? How to make it available in the first place? In domains, across domains, across countries? In what language? • Value: For what purpose and what economic gain? Societal gain? Democratic gain? In what format? What is the critical mass? • The European Data Portal addresses the whole data value chain: from data publishing to data re-use.
  • 12. A checklist for using Open Data Having access to data is a first step. Data is not an end in itself. Data can be used in different ways and for different purposes. Data can also be available with different licences, formats and quality. • Define your purpose: you might specify a topic, a service or an application of interest • Identify data labels: filter the data labels and metadata • Check openness: take a look at the licence information. Make sure a licence is available which allows you to use the data in the way that you intend (e.g. that commercial re-use is allowed if you develop a commercial application). After you have decided that a specific dataset is exactly what you are looking for: • Select the useful file format – you can probably choose to download the dataset in different file formats. Depending on your computer skills, you can choose the file type that is most appropriate. Most datasets are available in an open file format. • Check the data quality – check the date the file was last modified, and check whether information about the time period covered is provided.
  • 13. A checklist for using Open Data (continued) • Form • how has the data been processed? • is it in raw or summary form? • how will its form affect your analysis/product/application? • what syntactic (language) and semantic (meaning) transformations will you need to make? • is this compatible with other datasets you have? • Quality • how current is the data? • how regularly is it updated? • do you understand all the fields and their context? • for how long will it be published? what is the commitment by the publisher? • what do you know about the accuracy of the data? • how are missing data handled?
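Parts of the checklist (openness, file format, currency) can be checked automatically from a dataset's metadata. The sketch below assumes a CKAN-style metadata record with the fields `isopen`, `metadata_modified` and `resources[].format`; the sample record, the one-year freshness threshold and the function name are all hypothetical, and the remaining checklist items still require human judgement.

```python
from datetime import datetime, timedelta


def run_checklist(package, wanted_formats, max_age_days=365, today=None):
    """Return a pass/fail flag for three of the checklist items."""
    today = today or datetime.now()
    # Formats offered by the dataset's resources, normalised to upper case.
    offered = {res.get("format", "").upper() for res in package.get("resources", [])}
    modified = datetime.fromisoformat(package["metadata_modified"])
    return {
        "open_licence": bool(package.get("isopen")),
        "usable_format": bool(offered & {fmt.upper() for fmt in wanted_formats}),
        "recently_updated": today - modified <= timedelta(days=max_age_days),
    }


# Hypothetical metadata record, shaped like a CKAN dataset description.
sample = {
    "title": "Cycling routes",
    "isopen": True,
    "metadata_modified": "2016-06-01T10:00:00",
    "resources": [{"format": "CSV"}, {"format": "pdf"}],
}
report = run_checklist(sample, ["csv", "json"], today=datetime(2016, 7, 18))
print(report)
```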
  • 14. Exercise 2 – EDP • Choose one category on the EDP and find datasets describing one specific topic (for example, in the category of transport the topic could be cycling routes) General analysis • How many datasets are available? • What are the main datasets? Local analysis • How much information about your country is available? • How many formats and access points to the datasets are available?
  • 15. National open data portals • Notable examples of Open Data portals maintained by public administrations in Europe are: • France • opendata.paris.fr • www.data.gouv.fr • Italy • www.dati.piemonte.it • www.dati.gov.it • Netherlands • www.data.overheid.nl • UK • data.gov.uk
  • 16. Open data websites in Europe • International: publicdata.eu, data.un.org, data.worldbank.org • EU Member States: data.gov.be, opendata.government.bg, opendata.cz, portal.opendata.dk, govdata.de, opendata.ee, data.gov.ie, data.gov.gr, datos.gob.es, data.gouv.fr, data.gov.hr, dati.gov.it, data.gov.cy, opendata.gov.lt, data.public.lu, data.gov.mt, data.overheid.nl, data.gv.at, danepubliczne.gov.pl, dados.gov.pt, data.gov.ro, nio.gov.si/nio/, data.gov.sk, avoindata.fi, oppnadata.se, data.gov.uk
  • 17. Exercise 3 – Find national Open Data portals • Find the government open data portal from your member state …some suggestions: • Search in the list of EU member states open data websites • Search in DataCatalogs • Search in Google • How many portals that collect open data are available in your member state? • How many datasets are collected in the portals? • Have you already used some of these data?
  • 18. Ranking of the national open data datasets Global Open Data Index – an annual report that measures the state of open government data around the world. The goal is to provide a civil society audit of how governments actually publish data, with input and review from citizens and organisations around the world. • Topical experts review datasets from different countries, establish a baseline and track changes and trends in the open data world over time as the field evolves. Open Data Barometer – aims to uncover the impact of open data initiatives around the world. It analyses global trends, and provides comparative data on countries and regions via an in-depth methodology combining contextual data, technical assessments and secondary indicators to explore multiple dimensions of open data readiness, implementation and impact. • This is the second edition of the Open Data Barometer. The Open Data Barometer forms part of the World Wide Web Foundation’s work on common assessment methods for open data.
  • 19. Exercise 4 – Find the ranking • Using the information in the Open Data Barometer and the Global Open Data Index, find the ranking of the Open Data portals/initiatives of your member state Some questions • Do the National Maps have an open licence? • Is the Government Budget publicly available? • How do the government policies of your country compare to the mean of Europe and Central Asia? • How high/low is the impact of open data in your country?
  • 21. Exploring the Web of Data • Linked Data browsers – generic Linked Data browsers which allow users to start browsing in one data source and then navigate along links into related data sources • OpenLink Data Explorer: a Web browser extension, and a server-side component of the OpenLink Ajax Toolkit. • Marbles: tabular Linked Data browser supporting Fresnel. • Sigma: live views on the Web of Data. • Quick & Dirty RDF Browser: simple RDF browser, useful for checking that RDF or RDFa says what you intended. • Graphity Client: generic Linked Data browser and platform for building declarative SPARQL triplestore-backed Web applications. Apache license. • Linked Data mashups • Revyu by Tom Heath. Uses Linked Data from DBpedia to augment reviews, for instance with information about the director of a film. • DBpedia Mobile by Christian Becker and Chris Bizer. Combines Linked Data from DBpedia, the flickr wrapper, and Revyu. • Music Mashup by Yves Raimond. Combines Linked Data from various music-related data sources. • Linked Data search engines – crawl the Web of Data by following links between data sources and provide expressive query capabilities over aggregated data
  • 24. Linked Data Mashup DBPedia Mobile Pictures from revyu.com http://wiki.dbpedia.org/DBPediaMobile
  • 26. Linked Data Search Engines http://data.nytimes.com/schools/schools.html NYTimes
  • 29. Some Application Scenarios LinkedGeoData.org LinkedGeoData adds a spatial dimension to the Web of Data / Semantic Web. LinkedGeoData uses the information collected by the OpenStreetMap project and makes it available as an RDF knowledge base according to the Linked Data principles. It interlinks this data with other knowledge bases in the Linking Open Data initiative.
  • 30. Exercise 5 – OpenLink exploration – Facebook to Linked Data Transformation Examples • Install the OpenLink Data Explorer (ODE) extension for your browser (currently available for Firefox, Safari, Chrome, Opera, and Internet Explorer) • This extension allows you to explore the raw data and entity relationships that underlie the Web resources it processes. • Open your Facebook profile page (or another person's Facebook profile page) • Right-click (or Ctrl-click on Mac) on the page and then click on "View Page Description" to obtain a description of the resources available on the linked page • A description of the resource metadata available on the page is displayed. • More examples: if you want to explore further, look at the example page
  • 32. Visualization of Linked Data • Why is it important? • Currently the consumption of LOD is restricted to the Semantic Web community • Visual tools that provide a coherent and legible picture of the data also allow a non-technical audience • to obtain a good understanding of the data structure, • to compose queries, • to identify links between resources, • and to intuitively discover new pieces of information
  • 33. What is visualization • The visualization of information • Goals: • Effective communication of information • Clarity • Integrity (all the information) • Stimulate viewer engagement • Focus on effectiveness
  • 34. Why is visualization important? • With large datasets we need an efficient way to understand a vast amount of data • The human visual system is the highest-bandwidth channel to the human brain
  • 35. Why visualize data instead of providing a statistical analysis? http://en.wikipedia.org/wiki/Anscombe's_quartet • Anscombe's quartet: four datasets that have nearly identical summary statistics but look very different when plotted
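The point about Anscombe's quartet can be checked numerically. The sketch below computes the mean and sample variance of each of the four datasets with Python's standard `statistics` module; the values are the published quartet, while the variable names are only illustrative. The summary statistics come out almost identical, which is exactly why a plot is needed to tell the datasets apart.

```python
from statistics import mean, variance

# Anscombe's quartet: datasets I-III share the same x values.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X_4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X_4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# One row of summary statistics per dataset.
summary = {
    name: {
        "mean_x": mean(xs),
        "var_x": variance(xs),
        "mean_y": round(mean(ys), 2),
        "var_y": round(variance(ys), 2),
    }
    for name, (xs, ys) in QUARTET.items()
}

for name, stats in summary.items():
    print(name, stats)
```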
  • 36. Example of the Linked Data visualization process
  • 37. Heatmap visualization of Beatles releases
  • 39. LOD live LodLive project provides a demonstration of the use of Linked Data standards (RDF, SPARQL) to browse RDF resources. The application aims to spread linked data principles using a simple and friendly interface with reusable techniques. http://en.lodlive.it/ http://en.lodlive.it/?http://dbpedia.org/resource/Jules_Verne
  • 41. Exercise 6 – LodLive Use LodLive online (http://en.lodlive.it/) to explore DBpedia resources and search for Serena Williams • Who is she? • Where does she live? • Where is she currently listed as a champion (find and explore the "currentChampion" relation)? • Find the statistics and records associated with her, navigate to the Wikipedia page, and discover Serena's overall win rate in singles
  • 42. Visualbox Visualbox allows you to create visualizations based on Linked Open Data. The goal of Visualbox is to facilitate the creation of visualization without the need to learn Javascript libraries. You do need to know a bit of SPARQL and some notions of HTML though. Visualbox is a simplified version of LODSPeaKr, a framework to create Linked Data-based applications. http://orion.tw.rpi.edu/~agraves/mozfest/index.html
  • 43. Visualbox – some examples http://orion.tw.rpi.edu/~agraves/mozfest/action http://orion.tw.rpi.edu/~agraves/mozfest/firesock_test
  • 44. LODeX LODeX is a tool for producing a representative summary of a Linked Open Data (LOD) source starting from scratch, thus supporting users in exploring and understanding the contents of a dataset. LODeX extracts statistical indexes by querying the SPARQL endpoint of a LOD source and uses them to build the representative summary. Two online versions: • LODeX 2.0 (http://www.dbgroup.unimo.it/lodex2 ) includes the possibility to compose visual queries by selecting objects from the representative summary of a LOD source • LODeX Cluster (http://www.dbgroup.unimo.it/lodex2/testCluster ) provides a more concise schema for huge datasets
  • 45. LODeX Architecture Two main modules • Extraction & Summarization – Index Extraction (IE) – Post Processing (PP) • Visualization & Querying – Schema Summary Visualization – Query Orchestrator [Figure: architecture diagram with the components LOD Cloud, Endpoint URLs, LODeX Index Extraction, Statistical Indexes, LODeX Post-processing, Schema Summary (NoSQL store), Schema Summary Visualization, Query Orchestrator, SPARQL queries and query results.]
  • 46. The Schema Summary The Schema Summary is a pseudograph composed of: • C – classes (nodes) • P – properties (edges) and of additional elements and functions: • A – attributes associated with each class; each attribute represents the existence of a datatype property on the instances of the class • Σl – labels • l – labeling function • count – count function The Schema Summary is inferred from the distribution of the instances of a dataset.
  • 47. A running example [Figure: RDF graph split into intensional knowledge (the classes ex:Sector, foaf:Organization and foaf:Person, and the property ex:sector, typed as owl:ObjectProperty and labelled “sector”) and extensional knowledge (the instances sector1, organization1, organization2 and person1, linked by ex:sector and ex:ceo and described by dc:title “Energy”, ex:activity “Village electrification in the Pacific”, dbpedia:fax “+41331231”, foaf:firstName “Paolo” and foaf:lastName “Rossi”).] The information contained in the intensional knowledge can be incomplete or absent.
  • 48. Indexes needed to generate a Schema Summary These indexes belong to the extensional group of the Statistical Indexes [2]: • SC (Subject Class) contains the pairs (p,c) where p is an object property and c is its domain class. • SCl (Subject Class to literal) contains the pairs (p,c) where p is a datatype property and c is its domain class. • OC (Object Class) contains the pairs (p,c) where p is an object property and c is its range class. [Figure: the extensional part of the running example (extensional classes ex:Sector, foaf:Organization, foaf:Person and the instances sector1, organization1, organization2, person1 with their properties).]
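A minimal sketch of how the three indexes could be computed from instance data follows. It is not the LODeX implementation: the triples are a hand-written approximation of the running example (which organization carries which property is an assumption, since the figure is not fully recoverable), literals are marked with an explicit flag, and each count is taken to be the number of distinct instances of the class that appear as subject (SC, SCl) or object (OC) of the property.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# (subject, predicate, object, object_is_literal); assignment of properties
# to organization1/organization2 is assumed for illustration.
TRIPLES = [
    ("sector1", RDF_TYPE, "ex:Sector", False),
    ("sector1", "dc:title", "Energy", True),
    ("organization1", RDF_TYPE, "foaf:Organization", False),
    ("organization1", "ex:sector", "sector1", False),
    ("organization1", "ex:activity", "Village electrification in the Pacific", True),
    ("organization2", RDF_TYPE, "foaf:Organization", False),
    ("organization2", "ex:sector", "sector1", False),
    ("organization2", "dbpedia:fax", "+41331231", True),
    ("organization2", "ex:ceo", "person1", False),
    ("person1", RDF_TYPE, "foaf:Person", False),
    ("person1", "foaf:firstName", "Paolo", True),
    ("person1", "foaf:lastName", "Rossi", True),
]


def extract_indexes(triples):
    """Return (SC, SCl, OC) as dicts mapping (class, property) to a count."""
    classes_of = defaultdict(set)
    for s, p, o, _ in triples:
        if p == RDF_TYPE:
            classes_of[s].add(o)

    sc, scl, oc = defaultdict(set), defaultdict(set), defaultdict(set)
    for s, p, o, is_literal in triples:
        if p == RDF_TYPE:
            continue
        # Domain side: datatype properties go to SCl, object properties to SC.
        for cls in classes_of.get(s, ()):
            (scl if is_literal else sc)[(cls, p)].add(s)
        # Range side: only object properties have a class-typed object.
        if not is_literal:
            for cls in classes_of.get(o, ()):
                oc[(cls, p)].add(o)

    def count(index):
        return {key: len(members) for key, members in index.items()}

    return count(sc), count(scl), count(oc)


SC, SCL, OC = extract_indexes(TRIPLES)
print(SC, SCL, OC, sep="\n")
```

In practice these values would be gathered with SPARQL queries against the endpoint rather than from an in-memory list.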
  • 51. Schema Summary generation We use an algorithm that combines these indexes to produce a Schema Summary. Index values, given as (class, property, count): • SC: (foaf:Organization, ex:ceo, 1), (foaf:Organization, ex:sector, 2) • SCl: (foaf:Person, foaf:firstName, 1), (foaf:Person, foaf:lastName, 1), (ex:Sector, dc:title, 1), (foaf:Organization, ex:activity, 1), (foaf:Organization, dbpedia:fax, 1) • OC: (ex:Sector, ex:sector, 1), (foaf:Person, ex:ceo, 1)
  • 52. Schema Summary generation Combining the indexes yields the following Schema Summary: • Classes (nodes): foaf:Organization (2 instances), ex:Sector (1), foaf:Person (1) • Properties (edges): ex:sector (2) from foaf:Organization to ex:Sector; ex:ceo (1) from foaf:Organization to foaf:Person • Attributes: dc:title (1) on ex:Sector; foaf:firstName (1) and foaf:lastName (1) on foaf:Person; ex:activity (1) and dbpedia:fax (1) on foaf:Organization
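The slides do not spell out the combination algorithm, so the following is only one plausible sketch, not the LODeX algorithm. It takes the index values of the running example (with `foaf:Person` as the range class of `ex:ceo`), joins SC and OC on the property to obtain edges, and turns SCl entries into class attributes. Pairing every domain class with every range class of the same property is a simplifying assumption that can over-generate edges on real data.

```python
# Index values of the running example: (class, property) -> count.
SC = {("foaf:Organization", "ex:ceo"): 1, ("foaf:Organization", "ex:sector"): 2}
SCL = {
    ("foaf:Person", "foaf:firstName"): 1,
    ("foaf:Person", "foaf:lastName"): 1,
    ("ex:Sector", "dc:title"): 1,
    ("foaf:Organization", "ex:activity"): 1,
    ("foaf:Organization", "dbpedia:fax"): 1,
}
OC = {("ex:Sector", "ex:sector"): 1, ("foaf:Person", "ex:ceo"): 1}


def build_schema_summary(sc, scl, oc):
    """Return (nodes, edges, attributes) of a schema-summary-like graph."""
    # An edge links a domain class to a range class through a shared property.
    edges = [
        (src, prop, dst, n)
        for (src, prop), n in sc.items()
        for (dst, range_prop) in oc
        if prop == range_prop
    ]
    # Datatype properties become attributes of their domain class.
    attributes = {}
    for (cls, prop), n in scl.items():
        attributes.setdefault(cls, []).append((prop, n))
    nodes = {e[0] for e in edges} | {e[2] for e in edges} | set(attributes)
    return nodes, edges, attributes


nodes, edges, attributes = build_schema_summary(SC, SCL, OC)
for src, prop, dst, n in edges:
    print(f"{src} --{prop} ({n})--> {dst}")
```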
  • 53. Visualization & Querying • Schema Summary Visualization: the front end of the Web application, composed of three panels: • List of datasets indexed in LODeX • Schema Summary and query building panel • Refinement panel • Query Orchestrator: • It manages the interaction between the user and the GUI • It contains a SPARQL compiler able to compile the visual query into a SPARQL query
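To give an idea of what compiling a visual query into SPARQL can look like, here is a small sketch. The input structure (selected class nodes with chosen attributes, plus edges between nodes), the function name and the `http://example.org/` IRIs are hypothetical and do not describe LODeX's internal representation.

```python
def compile_visual_query(nodes, edges, limit=100):
    """Compile selected classes, attributes and links into a SPARQL SELECT.

    nodes: {variable: (class_iri, {attribute_variable: property_iri})}
    edges: [(source_variable, property_iri, target_variable)]
    """
    select_vars, patterns = [], []
    for var, (class_iri, attrs) in nodes.items():
        select_vars.append(f"?{var}")
        patterns.append(f"?{var} a <{class_iri}> .")
        for attr_var, prop_iri in attrs.items():
            select_vars.append(f"?{attr_var}")
            patterns.append(f"?{var} <{prop_iri}> ?{attr_var} .")
    for src, prop_iri, dst in edges:
        patterns.append(f"?{src} <{prop_iri}> ?{dst} .")
    body = "\n  ".join(patterns)
    return f"SELECT {' '.join(select_vars)}\nWHERE {{\n  {body}\n}}\nLIMIT {limit}"


FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/"  # hypothetical namespace for the ex: prefix

query = compile_visual_query(
    nodes={
        "org": (FOAF + "Organization", {}),
        "person": (FOAF + "Person", {"name": FOAF + "firstName"}),
    },
    edges=[("org", EX + "ceo", "person")],
)
print(query)
```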
  • 54. Schema Summary – Building a Visual Query
  • 56. Exercise 7 – LODeX Using LODeX (http://www.dbgroup.unimore.it/lodex2/), find the dataset about World War 1 • What is the name of the dataset? • How many classes does it have? How many properties does it have? Visualize and explore the LODeX schema summary of this dataset • How many instances does the class Water have? • What are the incoming properties of the class Municipality? Define a visual query that selects a Dataset and its creator. • What is the SPARQL query?
  • 57. Exercise 8 – Linked Clean Energy Data • Search for the Linked Clean Energy Data and navigate its schema summary • (http://www.dbgroup.unimore.it/lodex2/ok#!/schemaSummary/157) • Create a visual query that selects a Document and the associated Project Output. • For the Project Output, show the title and reference number. • Run the query and look at the results and the SPARQL query. • Try to run the same query on the SPARQL endpoint listed in DataHub for the Linked Clean Energy Data • (http://sparql.reeep.org/)
  • 59. Querying LOD datasets • SPARQL query • On a SPARQL endpoint • On a dump dataset • Visual tools
  • 60. Introduction to SPARQL • SPARQL Query • Declarative query language for RDF data • http://www.w3.org/TR/rdf-sparql-query/ • SPARQL Algebra • Formal semantics that defines how SPARQL queries are evaluated • http://www.w3.org/2001/sw/DataAccess/rq23/rq24-algebra.html • SPARQL Update • Declarative manipulation language for RDF data • http://www.w3.org/TR/sparql11-update/ • SPARQL Protocol • Standard for communication between SPARQL services and clients • http://www.w3.org/TR/sparql11-protocol/
  • 61. SPARQL Basics • RDF triple: basic building block, of the form subject, predicate, object. Example: dbpedia:The_Beatles foaf:name "The Beatles" . • RDF triple pattern: contains one or more variables. Examples: dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track . ?album ?p ?o . • RDF quad pattern: contains a graph name (URI or variable). Examples: GRAPH <:g> {:s :p :o .} GRAPH ?g {dbpedia:The_Beatles foaf:name ?o .}
  • 62. SPARQL Basics • RDF graph: set of RDF triples (assertions), manipulated as a labeled directed graph. • RDF dataset: collection of RDF graphs. It comprises: • One default graph • Zero or more named graphs • SPARQL protocol client: HTTP client that sends requests for SPARQL Protocol operations (queries or updates) • SPARQL protocol service: HTTP server that services requests for SPARQL Protocol operations • SPARQL endpoint: the URI at which a SPARQL Protocol service listens for requests from SPARQL clients
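The roles of client and endpoint can be illustrated with a few lines of standard-library Python. The sketch builds (but does not send) a SPARQL Protocol query request: an HTTP GET with the query text in the `query` parameter and an `Accept` header asking for JSON results. The DBpedia endpoint URL is used as an example; the function name is illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build an HTTP GET request carrying a SPARQL query to an endpoint."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask the service to return results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(
    "http://dbpedia.org/sparql",
    "ASK { <http://dbpedia.org/resource/The_Beatles> ?p ?o }",
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually contact the endpoint; that step is left out here so the example runs without network access.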
  • 63. Querying Linked Data with SPARQL
  • 64. SPARQL Query Main idea: pattern matching • Queries describe sub-graphs of the queried graph • Graph patterns are RDF graphs specified in Turtle syntax, which contain variables (prefixed by either “?” or “$”) • Sub-graphs that match the graph patterns yield a result Example graph pattern: dbpedia:The_Beatles foaf:made ?album
  • 65. SPARQL Query • Data: dbpedia:The_Beatles foaf:made three records, each identified by a URI of the form <http://musicbrainz.org/record/...>, with dc:title "Help!", "Let It Be" and "Abbey Road" respectively • Graph pattern: dbpedia:The_Beatles foaf:made ?album • Results (?album): <http://musicbrainz.org...>, <http://musicbrainz.org...>, <http://musicbrainz.org...>
  • 66. SPARQL Query • Data: dbpedia:The_Beatles foaf:made three records, each identified by a URI of the form <http://musicbrainz.org/record/...>, with dc:title "Help!", "Let It Be" and "Abbey Road" respectively • Graph pattern: dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?title • Results (?album, ?title): (<http://...>, "Help!"), (<http://...>, "Abbey Road"), (<http://...>, "Let It Be")
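The pattern-matching idea can be reproduced with a toy evaluator. The sketch below is not how a SPARQL engine is implemented; it only shows how triple patterns with variables are matched against data and how the bindings of one pattern constrain the next. The album identifiers `ex:album1` to `ex:album3` are hypothetical stand-ins for the record URIs, which are elided on the slides.

```python
DATA = [
    ("dbpedia:The_Beatles", "foaf:made", "ex:album1"),
    ("dbpedia:The_Beatles", "foaf:made", "ex:album2"),
    ("dbpedia:The_Beatles", "foaf:made", "ex:album3"),
    ("ex:album1", "dc:title", '"Help!"'),
    ("ex:album2", "dc:title", '"Abbey Road"'),
    ("ex:album3", "dc:title", '"Let It Be"'),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for pattern_term, triple_term in zip(pattern, triple):
        if pattern_term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(pattern_term, triple_term) != triple_term:
                return None
        elif pattern_term != triple_term:
            return None
    return extended


def evaluate(patterns, data):
    """Return all variable bindings that satisfy every triple pattern."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in data
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return solutions


results = evaluate(
    [("dbpedia:The_Beatles", "foaf:made", "?album"), ("?album", "dc:title", "?title")],
    DATA,
)
for row in results:
    print(row["?album"], row["?title"])
```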
  • 68. SPARQL Query: Components Prologue: • Prefix definitions • Subtly different from Turtle syntax – the final period is not used Example query: PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album FROM <http://musicbrainz.org/20130302> WHERE { dbpedia:The_Beatles foaf:made ?album . ?album a mo:Record ; dc:title ?title } ORDER BY ?title
  • 69. SPARQL Query: Components Query form: • ASK, SELECT, DESCRIBE or CONSTRUCT • SELECT retrieves variables and their bindings as a table In the example query: SELECT ?album
  • 70. SPARQL Query: Components Data set specification: • This clause is optional • FROM or FROM NAMED • Indicates the sources for the data against which to find matches In the example query: FROM <http://musicbrainz.org/20130302>
  • 71. SPARQL Query: Components Query pattern: • Defines patterns to match against the data • Generalises Turtle with variables and keywords – N.B. final period optional In the example query: WHERE { dbpedia:The_Beatles foaf:made ?album . ?album a mo:Record ; dc:title ?title }
  • 72. SPARQL Query: Components Solution modifier: • Modifies the result set • ORDER BY, LIMIT or OFFSET re-organise rows • GROUP BY combines them In the example query: ORDER BY ?title
  • 73. Query Forms SPARQL supports different query forms: • ASK tests whether or not a query pattern has a solution. Returns true or false • SELECT returns variables and their bindings directly • CONSTRUCT returns a single RDF graph specified by a graph template • DESCRIBE returns a single RDF graph containing RDF data about the resources found
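The difference between the SELECT and ASK forms is visible in what an endpoint sends back. Assuming the endpoint is asked for the standard SPARQL JSON results format, a SELECT answer carries `head.vars` and `results.bindings`, while an ASK answer carries a single `boolean`. The sketch below parses both; the sample documents and the function name are hand-written for illustration and are not taken from a real endpoint.

```python
import json


def parse_results(text):
    """Return a bool for ASK results, or a list of row tuples for SELECT results."""
    document = json.loads(text)
    if "boolean" in document:  # ASK
        return document["boolean"]
    variables = document["head"]["vars"]  # SELECT
    return [
        # A variable may be unbound in a row; represent that as None.
        tuple(row[var]["value"] if var in row else None for var in variables)
        for row in document["results"]["bindings"]
    ]


# Hypothetical responses.
select_response = """
{"head": {"vars": ["album", "title"]},
 "results": {"bindings": [
   {"album": {"type": "uri", "value": "http://example.org/album1"},
    "title": {"type": "literal", "value": "Help!"}},
   {"album": {"type": "uri", "value": "http://example.org/album2"}}
 ]}}
"""
ask_response = '{"head": {}, "boolean": true}'

rows = parse_results(select_response)
answer = parse_results(ask_response)
print(rows, answer)
```

CONSTRUCT and DESCRIBE return RDF graphs instead, so their results are parsed with an RDF parser rather than as a bindings table.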
  • 74. Query Form: ASK • Namespaces are added with the ‘PREFIX’ directive • Statement patterns that make up the graph are specified between braces (“{}”) Query: Is Paul McCartney a member of ‘The Beatles’? PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX dbpedia-ont: <http://dbpedia.org/ontology/> PREFIX mo: <http://purl.org/ontology/mo/> ASK WHERE { dbpedia:The_Beatles mo:member dbpedia:Paul_McCartney . } Results: true Query: Is Elvis Presley a member of ‘The Beatles’? PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX dbpedia-ont: <http://dbpedia.org/ontology/> PREFIX mo: <http://purl.org/ontology/mo/> ASK WHERE { dbpedia:The_Beatles mo:member dbpedia:Elvis_Presley . } Results: false
  • 75. Query Form: SELECT • The solution modifier projection nominates which components of the matches should be returned • “*” means all components should be returned Query: What albums and tracks did ‘The Beatles’ make? PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album_name ?track_title WHERE { dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?album_name ; mo:track ?track . ?track dc:title ?track_title . }
  • 76. Query Form: SELECT (2) Filter expressions • Different types of filters and functions may be used • Filter: comparison and logical operators Query: Retrieve the albums and tracks recorded by ‘The Beatles’ where the track lasts more than 300 seconds and less than 400 seconds (durations are expressed in milliseconds). PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album_name ?track_title ?duration WHERE { dbpedia:The_Beatles foaf:made ?album . ?album dc:title ?album_name ; mo:track ?track . ?track dc:title ?track_title ; mo:duration ?duration . FILTER (?duration > 300000 && ?duration < 400000) }
  • 77. Query Form: SELECT (3) Aggregates • Calculate aggregate values: COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT and SAMPLE • Built around the GROUP BY operator • Prune at group level (cf. FILTER) using HAVING Query: Retrieve the total duration of each album recorded by ‘The Beatles’, keeping only the albums that last more than one hour (3,600,000 ms). PREFIX dbpedia: <http://dbpedia.org/resource/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX mo: <http://purl.org/ontology/mo/> SELECT ?album (SUM(?track_duration) AS ?album_duration) WHERE { dbpedia:The_Beatles foaf:made ?album . ?album mo:track ?track . ?track mo:duration ?track_duration . } GROUP BY ?album HAVING (SUM(?track_duration) > 3600000)
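What GROUP BY, SUM and HAVING do in the aggregate query can be mimicked in plain Python. The sketch groups track rows by album, sums the durations per group, and then prunes whole groups with a threshold, which is the role HAVING plays. The album identifiers and durations are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (album, track duration in milliseconds) rows, i.e. the
# bindings of ?album and ?track_duration before grouping.
TRACK_ROWS = [
    ("ex:album1", 2_400_000),
    ("ex:album1", 1_500_000),
    ("ex:album2", 1_800_000),
    ("ex:album2", 900_000),
]


def album_durations(rows, min_total_ms):
    """Emulate GROUP BY ?album, SUM(?track_duration) and HAVING (SUM > min)."""
    totals = defaultdict(int)
    for album, duration in rows:  # GROUP BY + SUM
        totals[album] += duration
    # HAVING filters groups after aggregation, unlike FILTER, which
    # would test each track row individually.
    return {album: total for album, total in totals.items() if total > min_total_ms}


long_albums = album_durations(TRACK_ROWS, min_total_ms=3_600_000)
print(long_albums)
```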
  • 78. Exercise 9 – British Museum • Find the British Museum Collection dataset. • Find the related SPARQL endpoint • Look at information about "The Rosetta Stone" • http://collection.britishmuseum.org/sparql
  • 80. Linked Data Publishing Platforms/Frameworks • D2R Server: a tool for publishing relational databases as Linked Data • Talis Platform: the Talis Platform provides Linked Data-compliant hosting for content and RDF data • Pubby: a Linked Data frontend for SPARQL Endpoints • Paget: a framework for building Linked Data applications • Linked Media Framework: a Linked Data server with updates and semantic search • PublishMyData: A Linked Data Publishing Platform run by Swirrl. RDF data-hosting, Linked Data API, SPARQL endpoint and customisable visualisations.