One-hour SPARQL tutorial given as part of the "Practical Cross-Dataset Queries on the Web of Data" tutorial at WWW2012. Supported by the LATC FP7 Project: http://latc-project.eu/
SPARQL - Basic and Federated Queries
1. SPARQL - Query Basics and Federated Queries
Knud Möller, Talis Systems Ltd.
@knudmoeller
16 April 2012
Tutorial: Practical Cross-Dataset Queries on the Web of Data
WWW2012, Lyon, France
2. What is SPARQL?
• query language for RDF graphs (i.e., linked data)
• get specific information out of a dataset (or several datasets)
• “The SQL for the Web of Data”
3. History
• 2004: work started on SPARQL
• 2008: SPARQL 1.0 finished (W3C Rec)
• 2009: work started on SPARQL 1.1
• 2012 (now): SPARQL 1.1 almost finished
• soon: SPARQL 1.1 finished
10. SPARQL in a Nutshell
0 Prologue
1 query clause: What do I want to get back?
• SELECT: list of things
• DESCRIBE: RDF graph for one resource
• CONSTRUCT: custom RDF graph
• ASK: boolean value (true or false)
2 WHERE clause: Which part of the graph am I looking for?
• WHERE
• triple patterns
• graph patterns
3 Modifiers
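The four parts listed above can be seen together in one small query. This is only an illustrative sketch, not a query from the slides: it borrows the mo: and foaf: prefixes used on later slides, and the LIMIT value is an arbitrary choice.

```sparql
# 0 Prologue: prefix declarations
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo:   <http://purl.org/ontology/mo/>

# 1 Query clause: what do I want to get back?
SELECT DISTINCT ?artist_name

# 2 WHERE clause: which part of the graph am I looking for?
WHERE {
    ?artist a mo:MusicArtist ;
            foaf:name ?artist_name .
}

# 3 Modifiers: order the results and cut the list down
ORDER BY ?artist_name
LIMIT 10
```

Only the query clause and the WHERE clause are needed for a minimal query; the prologue and the modifiers can be left out.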
17. Example Data
• from music site Jamendo: http://www.jamendo.com
• linked data at DBTune: http://dbtune.org/jamendo/
• thousands of artists with albums, tracks, etc.
• additionally, some geographical data from Geonames: http://www.geonames.org/
55. ARQ
we will use the arq command line tool - see the ARQ Command Line Cheat Sheet:
arq --data data/jamendo.nt --query queries/query5.rq
56. Example Scenario
• Get all albums for a particular month (e.g., July 2007)
• List all Jamendo artists in France
• explore datasets and build final query bit by bit
• learn about different features of SPARQL along the way
• cannot cover every aspect of SPARQL! E.g., property paths are missing.
58. List all Classes
SELECT DISTINCT ?class
WHERE {
    ?s a ?class .
}
find all types of things in the dataset (classes)
--------------------------------------------------------
| class                                                |
========================================================
| <http://purl.org/ontology/mo/Playlist>               |
| <http://purl.org/ontology/mo/Signal>                 |
| <http://purl.org/ontology/mo/Lyrics>                 |
| <http://purl.org/ontology/mo/Track>                  |
| <http://www.w3.org/2006/time#Interval>               |
| <http://purl.org/ontology/mo/Torrent>                |
| <http://purl.org/ontology/mo/ED2K>                   |
| <http://purl.org/ontology/mo/Record>                 |
| <http://www.holygoat.co.uk/owl/redwood/0.1/tags/Tag> |
| <http://purl.org/ontology/mo/MusicArtist>            |
| <http://xmlns.com/foaf/0.1/Document>                 |
--------------------------------------------------------
61. Properties of Artists
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT DISTINCT ?pred
WHERE {
    ?artist a mo:MusicArtist ;
        ?pred ?obj .
}
all properties for the mo:MusicArtist class
-----------------------------------------------------
| pred                                              |
=====================================================
| <http://xmlns.com/foaf/0.1/name>                  |
| <http://xmlns.com/foaf/0.1/made>                  |
| <http://xmlns.com/foaf/0.1/img>                   |
| <http://xmlns.com/foaf/0.1/based_near>            |
| <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> |
| <http://xmlns.com/foaf/0.1/homepage>              |
| mo:biography                                      |
-----------------------------------------------------
63. Properties of Artists, ctd.
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT DISTINCT ?pred ?class
WHERE {
    ?artist a mo:MusicArtist ;
        ?pred ?obj .
    ?obj a ?class .
}
properties together with the classes of their objects
------------------------------------------------
| pred                             | class     |
================================================
| <http://xmlns.com/foaf/0.1/made> | mo:Record |
------------------------------------------------
only one result?
64. Properties of Artists, ctd.
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT DISTINCT ?pred ?class
WHERE {
    ?artist a mo:MusicArtist ;
        ?pred ?obj .
    OPTIONAL {
        ?obj a ?class .
    }
}
OPTIONAL statement
73. Problem: Wrong data format
• date/time related functions in SPARQL expect dates in xsd:dateTime format:
"2007-06-15T17:13:58"^^xsd:dateTime
• however, source data has dates as plain literals with wrong format:
"2007-06-15 17:13:58"
104. Et voilà - Les artistes de France!
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX geonames: <http://www.geonames.org/ontology#>
PREFIX mo:       <http://purl.org/ontology/mo/>
SELECT DISTINCT ?artist_name ?place_name
FROM <../data/jamendo-rdf/jamendo.nt>
FROM <../data/in_france.nt>
WHERE {
    ?artist a mo:MusicArtist ;
        foaf:based_near ?place ;
        foaf:name ?artist_name .
    ?place geonames:parentCountry <http://sws.geonames.org/3017382/> ;
        geonames:name ?place_name .
}
ORDER BY ?artist_name
--------------------------------------------------------------------------------------------------
| artist_name                                                | place_name                        |
==================================================================================================
| "#2 Orchestra"^^<http://www.w3.org/2001/XMLSchema#string>  | "Republic of France"              |
| "#Blockout"^^<http://www.w3.org/2001/XMLSchema#string>     | "Département des Yvelines"        |
| "#Dance 75#"^^<http://www.w3.org/2001/XMLSchema#string>    | "Paris"                           |
| "#NarNaoud#"^^<http://www.w3.org/2001/XMLSchema#string>    | "Département de la Gironde"       |
| "#ZedMeta#"^^<http://www.w3.org/2001/XMLSchema#string>     | "Paris"                           |
| "#Zorglups#"^^<http://www.w3.org/2001/XMLSchema#string>    | "Département de l'Essonne"        |
| "&ND"^^<http://www.w3.org/2001/XMLSchema#string>           | "Paris"                           |
| "(own+line)"^^<http://www.w3.org/2001/XMLSchema#string>    | "Republic of France"              |
| "* Q u i r y *"^^<http://www.w3.org/2001/XMLSchema#string> | "Paris"                           |
| "-=Kwada=-"^^<http://www.w3.org/2001/XMLSchema#string>     | "Département de la Haute-Garonne" |
| "-DEMO-"^^<http://www.w3.org/2001/XMLSchema#string>        | "Département du Nord"             |
106. One more thing...
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX geonames: <http://www.geonames.org/ontology#>
PREFIX mo:       <http://purl.org/ontology/mo/>
SELECT DISTINCT ?artist_name ?place_name
FROM <../data/jamendo-rdf/jamendo.nt>
FROM NAMED <../data/in_france.nt>
WHERE {
    ?artist a mo:MusicArtist ;
        foaf:based_near ?place ;
        foaf:name ?artist_name .
    GRAPH <../data/in_france.nt> {
        ?place geonames:parentCountry <http://sws.geonames.org/3017382/> ;
            geonames:name ?place_name .
    }
}
ORDER BY ?artist_name
111. One last thing...
PREFIX swrc: <http://swrc.ontoware.org/ontology#>
PREFIX org: <http://data.semanticweb.org/organization/>
SELECT ?affiliation ?conference_graph
WHERE {
    SERVICE <http://data.semanticweb.org/sparql> {
        GRAPH ?conference_graph {
            <http://data.semanticweb.org/person/richard-cyganiak> swrc:affiliation ?affiliation .
        }
    }
}
Editor's Notes
- “SQL for the Semantic Web” - not sure who actually said that first, but it’s being repeated a lot
- people who know a lot about SQL often say that SPARQL is actually not like SQL at all, but at least it looks a bit similar and uses similar keywords
- just some quick historical fun facts about SPARQL
- work started some time in 2004, and in 2008 SPARQL 1.0 was released as a W3C recommendation
- SPARQL 1.0 has all the basics, and most, or at least a lot, of what you’ll do is probably going to be covered by it
- SPARQL 1.1 adds a fair amount of new functionality, and a lot of neat syntactical features
- we are going to include quite a few things from SPARQL 1.1
- however, there are some (very interesting!) parts of SPARQL 1.1 that we won’t cover in this tutorial, such as updates or deletes
- throughout this tutorial, we’re going to talk about and use what are called “SPARQL endpoints”
- so, very briefly, here is what an endpoint is:
- in a nutshell, if we have some RDF, e.g., in a triple store or in any other data source that can provide RDF, then we can add a piece of software called a “SPARQL endpoint” or a “SPARQL processor”
- this endpoint acts as a query interface to the RDF data, in the sense that we can pass a query to it, and it will give a result back
- ok, just to give you a taste, I thought I’d jump right in and show you what a moderately complex query could look like
- I’m not going to go into detail now, but you can see that the query selects some things like artists and titles (probably of records)
- there are variables (indicated by the question marks)
- there are bits of the query that, if you know Turtle or N3 syntax, look like triples
- there is a FILTER with functions in it
- there is a command that seems to sort or order the query results
- ok, at this point, let’s just see what happens when I run this query on some data (run query “17_albums_2007.rq” in arq on Jamendo data)
- in this part of the tutorial, I’m going to explain in some detail how you use SPARQL, but here is how it works in a nutshell
- usually, a SPARQL query consists of (at least) two parts:
- a first part, which I’ll just call a “query clause”, and a second part, which is called the “WHERE clause”
- in the query clause, you say what you want to get, and in what form. There are four different types of queries which you can use (oral examples).
- this part of the query is usually quite simple (except for a CONSTRUCT clause, which can be as complex as you like)
- the WHERE clause is where the magic happens - here, you define the filter or the pattern that very precisely lets you pick out parts of the RDF graph
- you use what are called triple patterns and graph patterns to define this - I’m going to talk about this in a minute
- there can also be some other parts to a SPARQL query, like a prologue and solution modifiers, and we’ll get to those as well
- this is some Jamendo example data as a visual graph, with circles and boxes and arrows
- that’s the same data in Turtle syntax
- we’re getting closer to SPARQL already - SPARQL is using an almost identical syntax to Turtle
- and here is the same data again, only this time it’s shown one triple at a time
- (i.e., in N-Triples syntax)
- you’ll see in a minute why I am showing it this way
- SPARQL is a lot of things and has a lot of features
- but a lot of it comes down to triple and graph patterns
- because that is how you pick and select parts of an RDF graph to answer your query
- these patterns are what you put in your WHERE clause, which the SPARQL engine then tries to match against your RDF data
- so, the bit in bold here is a graph pattern
- in the next few slides, I’m going to show you in more detail how these patterns work
- so, what are triple patterns?
- in the easiest case, it’s just a triple, like this one here
- triple pattern with no variables
- since there are no variables, this pattern can match at most one triple in our data
- (if that triple is in there)
- triple pattern with one variable
- each of these triples is one result binding
- triple pattern with two variables
- by the way - I should say that the names of these variables don’t matter. You could call them anything you like.
- obviously, with three variables, every triple matches, so you get everything back
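As a small sketch of the three-variable case: the pattern below matches every triple in the data, so adding a LIMIT (the value 10 is an arbitrary choice) is a sensible precaution on a dataset with thousands of artists. The variable names ?s, ?p and ?o are just a convention and could be anything.

```sparql
# Three variables: every triple in the data matches this pattern.
SELECT ?s ?p ?o
WHERE {
    ?s ?p ?o .
}
# Without a LIMIT, the whole dataset would be returned.
LIMIT 10
```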
- before I now go through a complete scenario with the Jamendo data and some Geonames data, I’ll quickly look at the practical question of “how does one actually interact with a SPARQL endpoint”
- something that is very useful to do once you start working with a new dataset is to get an overview of what is in there
- you can do that with a couple of very simple SPARQL SELECT queries
- for example, we could very easily ask for all types of things (i.e., classes) in our dataset
- we just use a triple pattern with the “a” abbreviation for “rdf:type” in the WHERE clause, and SELECT the binding for the ?class variable
- note that we are adding the DISTINCT keyword to the SELECT clause, to prevent duplicates (there could very well be duplicates, as there could be many triples matching the pattern with the same binding for ?class)
- we get a list of classes back. Let’s look closer at the MusicArtist class
- since we’re interested in artists, it helps to know which properties are used for artists in the dataset
- we can find out about that in a similar way, by using a simple SELECT query
- we’re also adding a prefix definition at the top (in the “prologue”), so that we can abbreviate the URI of MusicArtist
- now, it would be even more interesting if we could know what types of things these predicates link to ...
- here we’re adding a third triple pattern to our graph pattern - the type of whatever is bound to the ?obj variable
- looking at the result here, we can see that the foaf:made property links to things of type mo:Record. That’s good, because remember we want to find all albums for a particular month.
- however, note that we only get one result!
- why is that? - there are no type definitions for the objects of most of the artist properties in the dataset
- since the SPARQL processor is trying to match the whole graph pattern, triples with those properties don’t match
- we can use the OPTIONAL keyword to add parts to the pattern that don’t have to match - they are, as the name implies, optional
- if the optional pattern matches, the matching triples are added to the result
- if not, we still get a result
- we can do one more exploration step, just like before, only this time we’re looking for properties of records instead of artists
- you’ll notice two things:
- I have added some more prefix definitions in the prologue - we don’t actually need them in the query, but the arq tool picks them up for the results as well, which makes them look a little less cluttered and more readable
- also, you’ll notice that the mo:available_as property actually links to objects of three different types, which is why it shows up three times in the results table
- however, because the results are not ordered in any way, this is a bit hard to see at first
- we can use the ORDER BY keyword in the results modifier part after the WHERE clause to order the results based on one (or more) variables
- so, in this case, we are ordering the results alphabetically in ascending order by the ?pred variable (the URI, not the prefix!), and then by the ?class variable (in descending order)
- ascending order is default, so you don’t actually have to specify that
- now, remember we want to find records from a particular month. We can see here that records have a dc:date property, so we can hopefully use that to get what we want.
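The slide with this query is not part of this transcript, so the following is only a reconstruction from the description above, not the original query. It assumes the same exploration pattern as for artists, applied to mo:Record, with the ordering the notes describe.

```sparql
PREFIX mo: <http://purl.org/ontology/mo/>

SELECT DISTINCT ?pred ?class
WHERE {
    ?record a mo:Record ;
            ?pred ?obj .
    # objects without a type still produce a row, with ?class unbound
    OPTIONAL {
        ?obj a ?class .
    }
}
# ASC is the default and could be omitted; DESC has to be spelled out
ORDER BY ASC(?pred) DESC(?class)
```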
- SPARQL 1.1 offers some nice functions to deal with dates, but they only work with properly formatted xsd:dateTime literals
- unfortunately, however, in the source data we have plain literals with a slightly wrong format
- what can we do? It would be great if we could create the correct RDF data from the existing, slightly wrong data
- luckily, SPARQL has another query form which lets us create RDF graphs!
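As a rough sketch of the kind of date functions meant here: YEAR and MONTH are SPARQL 1.1 functions that take an xsd:dateTime value. The query below assumes the dates are already proper xsd:dateTime literals; with the plain literals from the source data, the FILTER expression would produce an error for every row, so nothing would match. The dc: namespace IRI and the dc:title property are assumptions for illustration and are not shown in this transcript.

```sparql
PREFIX dc:   <http://purl.org/dc/elements/1.1/>   # assumed namespace for dc:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo:   <http://purl.org/ontology/mo/>

SELECT ?artist_name ?title ?date
WHERE {
    ?artist a mo:MusicArtist ;
            foaf:name ?artist_name ;
            foaf:made ?record .
    ?record a mo:Record ;
            dc:title ?title ;                      # assumed property
            dc:date ?date .
    # keep only records dated July 2007; needs ?date to be an xsd:dateTime
    FILTER (YEAR(?date) = 2007 && MONTH(?date) = 7)
}
ORDER BY ?date
```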
- in a CONSTRUCT clause, you build triple and graph patterns in the same way as you do in the WHERE clause
- instead of a list of variable bindings - like for SELECT - you now get an RDF graph as the result of the query, with the variables bound to the values in the matching parts of the source graph
- in this query, there are several things going on:
- first of all, I use REPLACE to do a simple string search and replace in the original ?date value - I replace spaces with a “T”. REPLACE is one of the many functions that SPARQL offers to test or manipulate values
- another one is STRDT, which I use to assign the datatype xsd:dateTime to the new value
- to make sure I can use this new value in my CONSTRUCT clause, I’m using the BIND AS keyword. In general, you use BIND AS to create some new or derived values that were not part of the source data. Here, I’m introducing the new variable ?date_fixed, which I want to bind to the date in the correct format. This new variable will be added to the result.
- finally, in the CONSTRUCT clause, I define the triple pattern for the output. What happens is that for every matching pattern from the WHERE clause, I will create this new pattern. The only thing that is different between the two is actually the object, where I use the ?date_fixed variable
- you can, however, create any RDF you want in a CONSTRUCT
- ok, the final output of this query is an RDF graph like this
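The slide for this query is not included in this transcript. What follows is a minimal reconstruction from the description above, not the original query: the dc: namespace IRI is an assumption, and restricting the pattern to mo:Record is a choice made for illustration.

```sparql
PREFIX dc:  <http://purl.org/dc/elements/1.1/>    # assumed namespace for dc:
PREFIX mo:  <http://purl.org/ontology/mo/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# output: one triple per match, with the corrected date as object
CONSTRUCT {
    ?record dc:date ?date_fixed .
}
WHERE {
    ?record a mo:Record ;
            dc:date ?date .
    # "2007-06-15 17:13:58" becomes "2007-06-15T17:13:58"^^xsd:dateTime
    BIND (STRDT(REPLACE(STR(?date), " ", "T"), xsd:dateTime) AS ?date_fixed)
}
```

The BIND has to come after the triple patterns that bind ?date, because it can only use variables that already have a value at that point in the group.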
- the previous query only gave us the records and their dates - this would be enough for our use case, but let’s assume for a moment that we want to create a fixed copy of the complete dataset
- we could do that by constructing a second graph without any dce:date statements
- if we join that with the previous graph, we have our complete, fixed dataset
- to do this, let me introduce the FILTER command. You add it to the WHERE clause in order to filter the results. You specify a condition that is checked every time the graph pattern matches. The result is only kept if the FILTER expression is also true.
- in this case, the pattern in the WHERE clause is the infamous ?s ?p ?o pattern, so it matches every triple in the source data. Without a filter, this would give us the complete dataset back.
- however, the FILTER, which takes a boolean expression, defines a condition that is only met if the value of ?p is not dce:date. In other words, every time we come across a triple specifying a date, we remove it from the list of results. The end result is a copy of the original dataset, minus all dce:date triples.
- these were two examples of CONSTRUCT queries - there will be a lot more, and more detailed ones, in the schema mapping part in the afternoon
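As a rough sketch, the filtering query described above could be written like this; the binding of the dce: prefix to the Dublin Core elements namespace is an assumption.

```sparql
PREFIX dce: <http://purl.org/dc/elements/1.1/>

# Copy every triple of the source data except the date statements.
CONSTRUCT { ?s ?p ?o }
WHERE {
  ?s ?p ?o .
  # Keep a match only if its predicate is not dce:date.
  FILTER (?p != dce:date)
}
```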
- great! Now we have everything we need to formulate our query of “give me all albums made in July 2007”
- this query doesn’t introduce anything new, but uses features that we already know
- we create a graph pattern that gets us all instances of mo:Record, as well as their ?release_date and ?title.
- we add a filter to say that we only want to have albums where the year of the release date is 2007, and the month is July. For this, we use two of the date functions that were introduced in SPARQL 1.1
- actually, we only did the last few steps of creating new data because we wanted to use these functions, which only work on xsd:dateTime literals.
- finally, we order the results by release date
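One way such a query might be phrased is sketched here. YEAR and MONTH are the SPARQL 1.1 date functions; the use of dce:title for the album title and the prefix bindings are assumptions.

```sparql
PREFIX mo:  <http://purl.org/ontology/mo/>
PREFIX dce: <http://purl.org/dc/elements/1.1/>

SELECT ?record ?title ?release_date
WHERE {
  ?record a mo:Record ;
          dce:date  ?release_date ;
          dce:title ?title .
  # Both functions expect an xsd:dateTime literal.
  FILTER (YEAR(?release_date) = 2007 && MONTH(?release_date) = 7)
}
ORDER BY ?release_date
```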
- ok, remember we also wanted to find out all artists in France?
- well, it turns out that the Jamendo dataset already has part of the answer for that - the foaf:based_near property. Let’s see what that gives us:
- a list of geonames URIs
- unfortunately, the Jamendo dataset doesn’t have any other information about these URIs, so we cannot know what they refer to, let alone which of them are in France!
- therefore, we need to query the geonames dataset itself
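A query along these lines would list the locations; that artists are typed as mo:MusicArtist is an assumption about the dataset.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo:   <http://purl.org/ontology/mo/>

# All distinct locations that artists declare via foaf:based_near.
SELECT DISTINCT ?location
WHERE {
  ?artist a mo:MusicArtist ;
          foaf:based_near ?location .
}
```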
- now, geonames doesn’t have its own, authoritative SPARQL endpoint, but there are several other ones out there
- I’m going to use the Kasabi one.
- Factforge contains several different datasets behind the same endpoint (and is a little bit slower)
- to query a remote SPARQL endpoint, we can use the SERVICE keyword in the WHERE clause
- it indicates that the graph patterns that follow should be evaluated against the external endpoint, rather than against local data
- the query itself is very simple - we just want to explore geonames to get all the properties of geonames resources
- some interesting properties that we get are the names and parent countries of places
- you see the apikey parameter in the service URI - that’s something specific to Kasabi. Most SPARQL endpoints won’t require an API key
- you can use the key here to experiment, it’s for the “latc-tutorial” user
- however, for serious use you should sign up yourself and get your own API key
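To illustrate the shape of such an exploratory query, here is a sketch. The endpoint IRI is a hypothetical placeholder, not the Kasabi URI from the slide, and the LIMIT value is an arbitrary choice.

```sparql
# List the properties used on resources at the remote endpoint.
SELECT DISTINCT ?property
WHERE {
  # Hypothetical endpoint IRI; substitute a real one, including
  # any API key parameter the service requires.
  SERVICE <http://example.org/geonames/sparql> {
    ?place ?property ?value .
  }
}
LIMIT 50
```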
- we can easily combine local and remote sources, as you can see here
- this is where things become interesting!
- part of the graph pattern is queried locally, the other part over the remote endpoint
- we get the artist’s location locally from the Jamendo dataset, and then we get the name of that location from the remote SPARQL endpoint
- note the LIMIT statement at the end of the query - if I didn’t do that, the query would time out
- when querying data, you always have to be a little bit careful - it’s easy to write a query that takes a very long time to execute, that lets the processor run out of memory, etc.
- especially when querying remote services
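A combined local and remote query might look roughly like this. The endpoint IRI is a hypothetical placeholder, geonames:name is assumed to be the naming property, and the LIMIT value is arbitrary.

```sparql
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX geonames: <http://www.geonames.org/ontology#>

SELECT ?artist ?location_name
WHERE {
  # Matched against the local Jamendo data.
  ?artist foaf:based_near ?location .
  # Matched against the remote endpoint (hypothetical IRI).
  SERVICE <http://example.org/geonames/sparql> {
    ?location geonames:name ?location_name .
  }
}
LIMIT 10
```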
- before we can ask which artists are in France, we need to know the identifier for France
- for this, we need to query the remote endpoint again
- we ask for resources which have a geonames:featureCode A.PCLI (i.e., they are a country)
- then we filter the results with a regular expression
- et voilà, France is <http://sws.geonames.org/3017382/>!
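The lookup could be sketched as follows. The endpoint IRI is a hypothetical placeholder, the feature code is written as a full IRI in the geonames ontology namespace (an assumption about how it is modelled), and the exact regular expression on the slide is not known.

```sparql
PREFIX geonames: <http://www.geonames.org/ontology#>

SELECT ?country ?name
WHERE {
  SERVICE <http://example.org/geonames/sparql> {
    # A.PCLI marks countries.
    ?country geonames:featureCode <http://www.geonames.org/ontology#A.PCLI> ;
             geonames:name ?name .
    # Case-insensitive match on the country name.
    FILTER (REGEX(?name, "^France$", "i"))
  }
}
```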
- so, now that we have the identifier for France, we can do a first attempt to query for artists in France
- this is great!
- however, this actually only gives us those artists who directly use France, the country, as their location
- this means we are missing those who specify places like Paris or Lyon as their location - places which are located within France!
- we want places which have France as geonames:parentCountry
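This first attempt can be sketched as a purely local query, with the France URI from the notes used directly as the object:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Only matches artists whose location is the country resource itself.
SELECT ?artist
WHERE {
  ?artist foaf:based_near <http://sws.geonames.org/3017382/> .
}
```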
- we can get this information from geonames, so we do a federated query again
- notice I’m limiting the result to 10, otherwise it would simply take way too long, or even time out
- even with the LIMIT it takes ages for this query to finish (several minutes)
- this illustrates that, while it’s great to have online SPARQL endpoints, for many purposes they’re not feasible
- what we can do in a case like that is extract a slice of data from the endpoint, and run the query again locally
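A sketch of the federated version, again with a hypothetical placeholder for the endpoint IRI:

```sparql
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX geonames: <http://www.geonames.org/ontology#>

SELECT ?artist ?place
WHERE {
  # Local: where is the artist based?
  ?artist foaf:based_near ?place .
  # Remote (hypothetical endpoint IRI): is that place in France?
  SERVICE <http://example.org/geonames/sparql> {
    ?place geonames:parentCountry <http://sws.geonames.org/3017382/> .
  }
}
LIMIT 10
```

The slow execution mentioned in the notes is plausible for a query of this shape, since the engine may have to ask the remote endpoint about each local location or pull all French places across the network.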
- we extract data about all places in France and their names from geonames
- this is a very practical approach to handling external datasets, which might otherwise be too large to query efficiently (especially in a federated scenario)
- the result is written to a file
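The extraction step might be expressed as a CONSTRUCT over the remote endpoint, as sketched here. The endpoint IRI is a hypothetical placeholder, and how the resulting graph is saved to a file depends on the SPARQL tool used.

```sparql
PREFIX geonames: <http://www.geonames.org/ontology#>

# Build a local slice: every place in France, with its name.
CONSTRUCT {
  ?place geonames:parentCountry <http://sws.geonames.org/3017382/> ;
         geonames:name ?name .
}
WHERE {
  SERVICE <http://example.org/geonames/sparql> {
    ?place geonames:parentCountry <http://sws.geonames.org/3017382/> ;
           geonames:name ?name .
  }
}
```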
- this is our final query, giving us all artists in Jamendo who are located in France!
- I’m introducing one new thing here, the FROM keyword, which you can use to load different data sources into the default graph
- here we are loading two local files, but you can just as well point to a URI on the Web
- notice that, because <http://sws.geonames.org/3017382/> geonames:parentCountry <http://sws.geonames.org/3017382/>, the query also matches artists who give France itself as their location
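A sketch of how the final query might be put together. The two file names are hypothetical, and how relative file references in FROM are resolved depends on the SPARQL engine.

```sparql
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX geonames: <http://www.geonames.org/ontology#>

SELECT DISTINCT ?artist ?place_name
# Both sources are merged into the default graph (hypothetical file names).
FROM <jamendo.rdf>
FROM <geonames_france.rdf>
WHERE {
  ?artist foaf:based_near ?place .
  ?place geonames:parentCountry <http://sws.geonames.org/3017382/> ;
         geonames:name ?place_name .
}
```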
- so far, we have always assumed that everything is in the same big RDF graph
- that’s what is usually called the “default graph” in SPARQL
- however, SPARQL knows about named graphs
- we can load data into a named graph, rather than the default one, and then reference that graph in the query
- e.g., here, we load the Geonames slice into a named graph with FROM NAMED (the name is its URI)
- in the WHERE clause, we can then restrict part of the query to that graph
- why would we want to do this? A SPARQL engine could be optimised for this, and the query might run faster
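The named-graph variant could be sketched like this, using the GRAPH keyword to restrict part of the pattern. The file names are hypothetical, and resolution of relative references depends on the engine.

```sparql
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX geonames: <http://www.geonames.org/ontology#>

SELECT DISTINCT ?artist ?place_name
FROM <jamendo.rdf>                  # default graph (hypothetical file)
FROM NAMED <geonames_france.rdf>    # named graph (hypothetical file)
WHERE {
  # Matched against the default graph.
  ?artist foaf:based_near ?place .
  # Matched only against the named graph.
  GRAPH <geonames_france.rdf> {
    ?place geonames:parentCountry <http://sws.geonames.org/3017382/> ;
           geonames:name ?place_name .
  }
}
```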