SlideShare ist ein Scribd-Unternehmen logo
1 von 338
Downloaden Sie, um offline zu lesen
Federation and Navigation in SPARQL 1.1
Jorge P´erez
Assistant Professor
Department of Computer Science
Universidad de Chile
Outline
Basics of SPARQL
Syntax and Semantics of SPARQL 1.0
What is new in SPARQL 1.1
Federation: SERVICE operator
Syntax and Semantics
Evaluation of SERVICE queries
Navigation: Property Paths
Navigating graphs with regular expressions
The history of paths (in SPARQL 1.1 specification)
Evaluation procedures and complexity
SPARQL query language for RDF
SPARQL query language for RDF
RDF Graph:
fmeza@utalca.cl
:name
:email
:phone
:name
:friendOf rgarrido@utalca.cl:email
Federico Meza
35-446928
Ruth Garrido
URI2
URI1
RDF-triples: (URI2, :email, rgarrido@utalca.cl)
SPARQL query language for RDF
RDF Graph:
fmeza@utalca.cl
:name
:email
:phone
:name
:friendOf rgarrido@utalca.cl:email
Federico Meza
35-446928
Ruth Garrido
URI2
URI1
RDF-triples: (URI2, :email, rgarrido@utalca.cl)
SPARQL Query:
SELECT ?N
WHERE
{
?X :name ?N .
}
SPARQL query language for RDF
RDF Graph:
fmeza@utalca.cl
:name
:email
:phone
:name
:friendOf rgarrido@utalca.cl:email
Federico Meza
35-446928
Ruth Garrido
URI2
URI1
RDF-triples: (URI2, :email, rgarrido@utalca.cl)
SPARQL Query:
SELECT ?N ?E
WHERE
{
?X :name ?N .
?X :email ?E .
}
SPARQL query language for RDF
RDF Graph:
fmeza@utalca.cl
:name
:email
:phone
:name
:friendOf rgarrido@utalca.cl:email
Federico Meza
35-446928
Ruth Garrido
URI2
URI1
RDF-triples: (URI2, :email, rgarrido@utalca.cl)
SPARQL Query:
SELECT ?N ?E
WHERE
{
?X :name ?N .
?X :email ?E .
?X :friendOf ?Y . ?Y :name "Ruth Garrido"
}
An example of an RDF graph to query: DBLP
inAMW:SarmaUW09 :Jeffrey D. Ullman
:Anish Das Sarma
:Jennifer Widom
inAMW:2009
"Schema Design for ..."
dc:creator
dc:creator
dc:creator
dct:partOf
dc:title
swrc:series
conf:amw
<http://purl.org/dc/terms/>
: <http://dblp.l3s.de/d2r/resource/authors/>
conf: <http://dblp.l3s.de/d2r/resource/conferences/>
inAMW: <http://dblp.l3s.de/d2r/resource/publications/conf/amw/>
swrc: <http://swrc.ontoware.org/ontology#>
dc:
dct:
<http://purl.org/dc/elements/1.1/>
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
WHERE
{
}
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
WHERE
{
?Paper dc:creator ?Author .
}
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
}
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
}
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
}
A SPARQL query consists of a:
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
}
A SPARQL query consists of a:
Head: Processing of the variables
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC
SELECT ?Author
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
}
A SPARQL query consists of a:
Head: Processing of the variables
Body: Pattern matching expression
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC, and their Web
pages if this information is available:
SELECT ?Author ?WebPage
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
OPTIONAL {
?Author foaf:homePage ?WebPage . }
}
SPARQL: A Simple RDF Query Language
Example: Authors that have published in ISWC, and their Web
pages if this information is available:
SELECT ?Author ?WebPage
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
OPTIONAL {
?Author foaf:homePage ?WebPage . }
}
But things can become more complex...
Interesting features of pattern
matching on graphs SELECT ?X1 ?X2 ...
{ P1 .
P2 }
But things can become more complex...
Interesting features of pattern
matching on graphs
◮ Grouping
SELECT ?X1 ?X2 ...
{{ P1 .
P2 }
{ P3 .
P4 }
}
But things can become more complex...
Interesting features of pattern
matching on graphs
◮ Grouping
◮ Optional parts
SELECT ?X1 ?X2 ...
{{ P1 .
P2
OPTIONAL { P5 } }
{ P3 .
P4
OPTIONAL { P7 } }
}
But things can become more complex...
Interesting features of pattern
matching on graphs
◮ Grouping
◮ Optional parts
◮ Nesting
SELECT ?X1 ?X2 ...
{{ P1 .
P2
OPTIONAL { P5 } }
{ P3 .
P4
OPTIONAL { P7
OPTIONAL { P8 } } }
}
But things can become more complex...
Interesting features of pattern
matching on graphs
◮ Grouping
◮ Optional parts
◮ Nesting
◮ Union of patterns
SELECT ?X1 ?X2 ...
{{{ P1 .
P2
OPTIONAL { P5 } }
{ P3 .
P4
OPTIONAL { P7
OPTIONAL { P8 } } }
}
UNION
{ P9 }}
But things can become more complex...
Interesting features of pattern
matching on graphs
◮ Grouping
◮ Optional parts
◮ Nesting
◮ Union of patterns
◮ Filtering
◮ ...
◮ + several new features in
the upcoming version:
federation, navigation
SELECT ?X1 ?X2 ...
{{{ P1 .
P2
OPTIONAL { P5 } }
{ P3 .
P4
OPTIONAL { P7
OPTIONAL { P8 } } }
}
UNION
{ P9
FILTER ( R ) }}
But things can become more complex...
Interesting features of pattern
matching on graphs
◮ Grouping
◮ Optional parts
◮ Nesting
◮ Union of patterns
◮ Filtering
◮ ...
◮ + several new features in
the upcoming version:
federation, navigation
SELECT ?X1 ?X2 ...
{{{ P1 .
P2
OPTIONAL { P5 } }
{ P3 .
P4
OPTIONAL { P7
OPTIONAL { P8 } } }
}
UNION
{ P9
FILTER ( R ) }}
What is the (formal) meaning of a general SPARQL query?
Outline
Basics of SPARQL
Syntax and Semantics of SPARQL 1.0
What is new in SPARQL 1.1
Federation: SERVICE operator
Syntax and Semantics
Evaluation of SERVICE queries
Navigation: Property Paths
Navigating graphs with regular expressions
The history of paths (in SPARQL 1.1 specification)
Evaluation procedures and complexity
RDF triples and graphs
Subject Object
Predicate
LB
U
U UB
U : set of URIs
B : set of blank nodes
L : set of literals
RDF triples and graphs
Subject Object
Predicate
LB
U
U UB
U : set of URIs
B : set of blank nodes
L : set of literals
(s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L) is called an RDF triple
RDF triples and graphs
Subject Object
Predicate
LB
U
U UB
U : set of URIs
B : set of blank nodes
L : set of literals
(s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L) is called an RDF triple
A set of RDF triples is called an RDF graph
RDF triples and graphs
Subject Object
Predicate
LB
U
U UB
U : set of URIs
B : set of blank nodes
L : set of literals
(s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L) is called an RDF triple
A set of RDF triples is called an RDF graph
In this talk, we do not consider blank nodes
◮ (s, p, o) ∈ U × U × (U ∪ L) is called an RDF triple
A standard algebraic syntax
◮ Triple patterns: just RDF triples + variables (from a set V )
?X :name "john" (?X, name, john)
A standard algebraic syntax
◮ Triple patterns: just RDF triples + variables (from a set V )
?X :name "john" (?X, name, john)
◮ Graph patterns: full parenthesized algebra
original SPARQL syntax algebraic syntax
{ P1 . P2 } ( P1 AND P2 )
A standard algebraic syntax
◮ Triple patterns: just RDF triples + variables (from a set V )
?X :name "john" (?X, name, john)
◮ Graph patterns: full parenthesized algebra
original SPARQL syntax algebraic syntax
{ P1 . P2 } ( P1 AND P2 )
{ P1 OPTIONAL { P2 }} ( P1 OPT P2 )
A standard algebraic syntax
◮ Triple patterns: just RDF triples + variables (from a set V )
?X :name "john" (?X, name, john)
◮ Graph patterns: full parenthesized algebra
original SPARQL syntax algebraic syntax
{ P1 . P2 } ( P1 AND P2 )
{ P1 OPTIONAL { P2 }} ( P1 OPT P2 )
{ P1 } UNION { P2 } ( P1 UNION P2 )
A standard algebraic syntax
◮ Triple patterns: just RDF triples + variables (from a set V )
?X :name "john" (?X, name, john)
◮ Graph patterns: full parenthesized algebra
original SPARQL syntax algebraic syntax
{ P1 . P2 } ( P1 AND P2 )
{ P1 OPTIONAL { P2 }} ( P1 OPT P2 )
{ P1 } UNION { P2 } ( P1 UNION P2 )
{ P1 FILTER ( R ) } ( P1 FILTER R )
A standard algebraic syntax (cont.)
◮ Explicit precedence/association
Example
{ t1
t2
OPTIONAL { t3 }
OPTIONAL { t4 }
t5
}
( ( ( ( t1 AND t2 ) OPT t3 ) OPT t4 ) AND t5 )
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms
µ : V → U ∪ L
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms
µ : V → U ∪ L
Given a mapping µ and a triple pattern t:
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms
µ : V → U ∪ L
Given a mapping µ and a triple pattern t:
◮ µ(t): triple obtained from t replacing variables according to µ
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms
µ : V → U ∪ L
Given a mapping µ and a triple pattern t:
◮ µ(t): triple obtained from t replacing variables according to µ
Example
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms
µ : V → U ∪ L
Given a mapping µ and a triple pattern t:
◮ µ(t): triple obtained from t replacing variables according to µ
Example
µ = {?X → R1, ?Y → R2, ?Name → john}
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms
µ : V → U ∪ L
Given a mapping µ and a triple pattern t:
◮ µ(t): triple obtained from t replacing variables according to µ
Example
µ = {?X → R1, ?Y → R2, ?Name → john}
t = (?X, name, ?Name)
Mappings: building block for the semantics
Definition
A mapping is a partial function from variables to RDF terms
µ : V → U ∪ L
Given a mapping µ and a triple pattern t:
◮ µ(t): triple obtained from t replacing variables according to µ
Example
µ = {?X → R1, ?Y → R2, ?Name → john}
t = (?X, name, ?Name)
µ(t) = (R1, name, john)
The semantics of triple patterns
Definition
The evaluation of triple patter t over a graph G, denoted by t G ,
is the set of all mappings µ such that:
The semantics of triple patterns
Definition
The evaluation of triple patter t over a graph G, denoted by t G ,
is the set of all mappings µ such that:
◮ dom(µ) is exactly the set of variables occurring in t
The semantics of triple patterns
Definition
The evaluation of triple patter t over a graph G, denoted by t G ,
is the set of all mappings µ such that:
◮ dom(µ) is exactly the set of variables occurring in t
◮ µ(t) ∈ G
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(?X, name, ?N) G
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(?X, name, ?N) G
µ1 = {?X → R1, ?N → john}
µ2 = {?X → R2, ?N → paul}
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(?X, name, ?N) G
µ1 = {?X → R1, ?N → john}
µ2 = {?X → R2, ?N → paul}
(?X, email, ?E) G
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(?X, name, ?N) G
µ1 = {?X → R1, ?N → john}
µ2 = {?X → R2, ?N → paul}
(?X, email, ?E) G
µ = {?X → R1, ?E → J@ed.ex}
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(?X, name, ?N) G
?X ?N
µ1 R1 john
µ2 R2 paul
(?X, email, ?E) G
?X ?E
µ R1 J@ed.ex
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(R1, webPage, ?W ) G
(R2, name, paul) G
(R3, name, ringo) G
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(R1, webPage, ?W ) G
(R2, name, paul) G
(R3, name, ringo) G
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(R1, webPage, ?W ) G
(R2, name, paul) G
(R3, name, ringo) G
Example
G
(R1, name, john)
(R1, email, J@ed.ex)
(R2, name, paul)
(R1, webPage, ?W ) G
(R2, name, paul) G
µ∅ = { }
(R3, name, ringo) G
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
µ1 ∪ µ2 is also a mapping.
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
µ1 ∪ µ2 is also a mapping.
Example
?X ?Y ?U ?V
µ1 R1 john
µ2 R1 J@edu.ex
µ3 P@edu.ex R2
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
µ1 ∪ µ2 is also a mapping.
Example
?X ?Y ?U ?V
µ1 R1 john
µ2 R1 J@edu.ex
µ3 P@edu.ex R2
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
µ1 ∪ µ2 is also a mapping.
Example
?X ?Y ?U ?V
µ1 R1 john
µ2 R1 J@edu.ex
µ3 P@edu.ex R2
µ1 ∪ µ2 R1 john J@edu.ex
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
µ1 ∪ µ2 is also a mapping.
Example
?X ?Y ?U ?V
µ1 R1 john
µ2 R1 J@edu.ex
µ3 P@edu.ex R2
µ1 ∪ µ2 R1 john J@edu.ex
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
µ1 ∪ µ2 is also a mapping.
Example
?X ?Y ?U ?V
µ1 R1 john
µ2 R1 J@edu.ex
µ3 P@edu.ex R2
µ1 ∪ µ2 R1 john J@edu.ex
µ1 ∪ µ3 R1 john P@edu.ex R2
Compatible mappings: mappings that can be merged.
Definition
The mappings µ1, µ2 are compatibles iff they agree
in their shared variables:
◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
µ1 ∪ µ2 is also a mapping.
Example
?X ?Y ?U ?V
µ1 R1 john
µ2 R1 J@edu.ex
µ3 P@edu.ex R2
µ1 ∪ µ2 R1 john J@edu.ex
µ1 ∪ µ3 R1 john P@edu.ex R2
µ∅ = { } is compatible with every mapping.
Sets of mappings and operations
Let M1 and M2 be sets of mappings:
Definition
Join: M1 ⋊⋉ M2
◮ {µ1 ∪ µ2 | µ1 ∈ M1, µ2 ∈ M2, and µ1, µ2 are compatibles}
◮ extending mappings in M1 with compatible mappings in M2
will be used to define AND
Sets of mappings and operations
Let M1 and M2 be sets of mappings:
Definition
Join: M1 ⋊⋉ M2
◮ {µ1 ∪ µ2 | µ1 ∈ M1, µ2 ∈ M2, and µ1, µ2 are compatibles}
◮ extending mappings in M1 with compatible mappings in M2
will be used to define AND
Definition
Union: M1 ∪ M2
◮ {µ | µ ∈ M1 or µ ∈ M2}
◮ mappings in M1 plus mappings in M2 (the usual set union)
will be used to define UNION
Sets of mappings and operations
Definition
Difference: M1 M2
◮ {µ ∈ M1 | for all µ′ ∈ M2, µ and µ′ are not compatibles}
◮ mappings in M1 that cannot be extended with mappings in M2
Sets of mappings and operations
Definition
Difference: M1 M2
◮ {µ ∈ M1 | for all µ′ ∈ M2, µ and µ′ are not compatibles}
◮ mappings in M1 that cannot be extended with mappings in M2
Definition
Left outer join: M1 M2 = (M1 ⋊⋉ M2) ∪ (M1 M2)
◮ extension of mappings in M1 with compatible mappings in M2
◮ plus the mappings in M1 that cannot be extended.
will be used to define OPT
Semantics of general graph patterns
Definition
Given a graph G the evaluation of a pattern is recursively defined
the base case is the evaluation of a triple pattern.
Semantics of general graph patterns
Definition
Given a graph G the evaluation of a pattern is recursively defined
◮ (P1 AND P2) G = P1 G ⋊⋉ P2 G
the base case is the evaluation of a triple pattern.
Semantics of general graph patterns
Definition
Given a graph G the evaluation of a pattern is recursively defined
◮ (P1 AND P2) G = P1 G ⋊⋉ P2 G
◮ (P1 UNION P2) G = P1 G ∪ P2 G
the base case is the evaluation of a triple pattern.
Semantics of general graph patterns
Definition
Given a graph G the evaluation of a pattern is recursively defined
◮ (P1 AND P2) G = P1 G ⋊⋉ P2 G
◮ (P1 UNION P2) G = P1 G ∪ P2 G
◮ (P1 OPT P2) G = P1 G P2 G
the base case is the evaluation of a triple pattern.
Example (AND)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) AND (?X, email, ?E)) G
Example (AND)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) AND (?X, email, ?E)) G
(?X, name, ?N) G ⋊⋉ (?X, email, ?E) G
Example (AND)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) AND (?X, email, ?E)) G
(?X, name, ?N) G ⋊⋉ (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
Example (AND)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) AND (?X, email, ?E)) G
(?X, name, ?N) G ⋊⋉ (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
?X ?E
µ4 R1 J@ed.ex
µ5 R3 R@ed.ex
Example (AND)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) AND (?X, email, ?E)) G
(?X, name, ?N) G ⋊⋉ (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
⋊⋉
?X ?E
µ4 R1 J@ed.ex
µ5 R3 R@ed.ex
Example (AND)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) AND (?X, email, ?E)) G
(?X, name, ?N) G ⋊⋉ (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
⋊⋉
?X ?E
µ4 R1 J@ed.ex
µ5 R3 R@ed.ex
?X ?N ?E
µ1 ∪ µ4 R1 john J@ed.ex
µ3 ∪ µ5 R3 ringo R@ed.ex
Example (OPT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) OPT (?X, email, ?E)) G
Example (OPT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) OPT (?X, email, ?E)) G
(?X, name, ?N) G (?X, email, ?E) G
Example (OPT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) OPT (?X, email, ?E)) G
(?X, name, ?N) G (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
Example (OPT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) OPT (?X, email, ?E)) G
(?X, name, ?N) G (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
?X ?E
µ4 R1 J@ed.ex
µ5 R3 R@ed.ex
Example (OPT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) OPT (?X, email, ?E)) G
(?X, name, ?N) G (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
?X ?E
µ4 R1 J@ed.ex
µ5 R3 R@ed.ex
Example (OPT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) OPT (?X, email, ?E)) G
(?X, name, ?N) G (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
?X ?E
µ4 R1 J@ed.ex
µ5 R3 R@ed.ex
?X ?N ?E
µ1 ∪ µ4 R1 john J@ed.ex
µ3 ∪ µ5 R3 ringo R@ed.ex
µ2 R2 paul
Example (OPT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) OPT (?X, email, ?E)) G
(?X, name, ?N) G (?X, email, ?E) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
?X ?E
µ4 R1 J@ed.ex
µ5 R3 R@ed.ex
?X ?N ?E
µ1 ∪ µ4 R1 john J@ed.ex
µ3 ∪ µ5 R3 ringo R@ed.ex
µ2 R2 paul
Example (UNION)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, email, ?Info) UNION (?X, webPage, ?Info)) G
Example (UNION)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, email, ?Info) UNION (?X, webPage, ?Info)) G
(?X, email, ?Info) G ∪ (?X, webPage, ?Info) G
Example (UNION)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, email, ?Info) UNION (?X, webPage, ?Info)) G
(?X, email, ?Info) G ∪ (?X, webPage, ?Info) G
?X ?Info
µ1 R1 J@ed.ex
µ2 R3 R@ed.ex
Example (UNION)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, email, ?Info) UNION (?X, webPage, ?Info)) G
(?X, email, ?Info) G ∪ (?X, webPage, ?Info) G
?X ?Info
µ1 R1 J@ed.ex
µ2 R3 R@ed.ex
?X ?Info
µ3 R3 www.ringo.com
Example (UNION)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, email, ?Info) UNION (?X, webPage, ?Info)) G
(?X, email, ?Info) G ∪ (?X, webPage, ?Info) G
?X ?Info
µ1 R1 J@ed.ex
µ2 R3 R@ed.ex
∪
?X ?Info
µ3 R3 www.ringo.com
Example (UNION)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, email, ?Info) UNION (?X, webPage, ?Info)) G
(?X, email, ?Info) G ∪ (?X, webPage, ?Info) G
?X ?Info
µ1 R1 J@ed.ex
µ2 R3 R@ed.ex
∪
?X ?Info
µ3 R3 www.ringo.com
?X ?Info
µ1 R1 J@ed.ex
µ2 R3 R@ed.ex
µ3 R3 www.ringo.com
Boolean filter expressions (value constraints)
In filter expressions we consider
◮ the equality = among variables and RDF terms
◮ a unary predicate bound
◮ boolean combinations (∧, ∨, ¬)
A mapping µ satisfies
◮ ?X = c if µ(?X) = c
◮ ?X =?Y if µ(?X) = µ(?Y )
◮ bound(?X) if µ is defined in ?X, i.e. ?X ∈ dom(µ)
Satisfaction of value constraints
◮ If P is a graph pattern and R is a value constraint then
(P FILTER R) is also a graph pattern.
Satisfaction of value constraints
◮ If P is a graph pattern and R is a value constraint then
(P FILTER R) is also a graph pattern.
Definition
Given a graph G
◮ (P FILTER R) G = {µ ∈ P G | µ satisfies R}
i.e. mappings in the evaluation of P that satisfy R.
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
?N = ringo ∨ ?N = paul
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G
?X ?N
µ1 R1 john
µ2 R2 paul
µ3 R3 ringo
?N = ringo ∨ ?N = paul
?X ?N
µ2 R2 paul
µ3 R3 ringo
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G
?X ?N ?E
µ1 ∪ µ4 R1 john J@ed.ex
µ3 ∪ µ5 R3 ringo R@ed.ex
µ2 R2 paul
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G
?X ?N ?E
µ1 ∪ µ4 R1 john J@ed.ex
µ3 ∪ µ5 R3 ringo R@ed.ex
µ2 R2 paul
¬ bound(?E)
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G
?X ?N ?E
µ1 ∪ µ4 R1 john J@ed.ex
µ3 ∪ µ5 R3 ringo R@ed.ex
µ2 R2 paul
¬ bound(?E)
?X ?N
µ2 R2 paul
Example (FILTER)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G
?X ?N ?E
µ1 ∪ µ4 R1 john J@ed.ex
µ3 ∪ µ5 R3 ringo R@ed.ex
µ2 R2 paul
¬ bound(?E)
?X ?N
µ2 R2 paul
(a non-monotonic query)
Why do we need/want to formalize SPARQL
A formalization is beneficial
◮ clarifying corner cases
◮ helping in the implementation process
◮ providing solid foundations (we can actually prove properties!)
The evaluation decision problem
Evaluation problem for SPARQL patterns
Input: A mapping µ, an RDF graph G
a graph pattern P
Output: Is the mapping µ in the evaluation of pattern P
over the RDF graph G
Brief complexity-theory reminder
P
Brief complexity-theory reminder
NP
P
Brief complexity-theory reminder
PSPACE
NP
P
Complexity of the evaluation problem
Theorem (PAG09)
For patterns using only the AND operator,
the evaluation problem is in P
Complexity of the evaluation problem
Theorem (PAG09)
For patterns using only the AND operator,
the evaluation problem is in P
Theorem (SML10)
For patterns using AND and UNION operators,
the evaluation problem is NP-complete.
Complexity of the evaluation problem
Theorem (PAG09)
For patterns using only the AND operator,
the evaluation problem is in P
Theorem (SML10)
For patterns using AND and UNION operators,
the evaluation problem is NP-complete.
Theorem (PAG09,SML10)
For general patterns that include OPT operator,
the evaluation problem is PSPACE-complete.
Complexity of the evaluation problem
Theorem (PAG09)
For patterns using only the AND operator,
the evaluation problem is in P
Theorem (SML10)
For patterns using AND and UNION operators,
the evaluation problem is NP-complete.
Theorem (PAG09,SML10)
For general patterns that include OPT operator,
the evaluation problem is PSPACE-complete.
Good news: evaluation in P if the query is fixed (data complexity)
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
if a variable occurs
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
↑
if a variable occurs inside B
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
↑ ↑ ↑
if a variable occurs inside B and anywhere outside the OPT,
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
↑ ⇑ ↑ ↑
if a variable occurs inside B and anywhere outside the OPT, then
the variable must also occur inside A.
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
↑ ⇑ ↑ ↑
if a variable occurs inside B and anywhere outside the OPT, then
the variable must also occur inside A.
Example
(?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john)
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
↑ ⇑ ↑ ↑
if a variable occurs inside B and anywhere outside the OPT, then
the variable must also occur inside A.
Example
(?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john)
↑
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
↑ ⇑ ↑ ↑
if a variable occurs inside B and anywhere outside the OPT, then
the variable must also occur inside A.
Example
(?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john)
↑ ↑
Well–designed patterns
Can we find a natural fragment with better complexity?
Definition
An AND-OPT pattern is well–designed iff for every OPT in the
pattern
( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )
↑ ⇑ ↑ ↑
if a variable occurs inside B and anywhere outside the OPT, then
the variable must also occur inside A.
Example
(?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john)
↑ ↑
A bit more on complexity...
PSPACE
NP
P
A bit more on complexity...
PSPACE
coNPNP
P
Evaluation of well-designed patterns is in coNP-complete
Theorem (PAG09)
For AND-OPT well–designed graph patterns
the evaluation problem is coNP-complete
Evaluation of well-designed patterns is in coNP-complete
Theorem (PAG09)
For AND-OPT well–designed graph patterns
the evaluation problem is coNP-complete
Well-designed patterns also allow to study static analysis:
Theorem (LPPS12)
Equivalence of well-designed SPARQL patterns is in NP
SELECT (a.k.a. projection)
Besides graph patterns, SPARQL 1.0 allow result forms
the most simple is SELECT
Definition
A SELECT query is an expression
(SELECT W P)
where P is a graph pattern and W is a set of variables, or *
SELECT (a.k.a. projection)
Besides graph patterns, SPARQL 1.0 allow result forms
the most simple is SELECT
Definition
A SELECT query is an expression
(SELECT W P)
where P is a graph pattern and W is a set of variables, or *
The evaluation of a SELECT query against G is
◮ (SELECT W P) G = {µ|W
| µ ∈ P G }
where µ|W
is the restriction of µ to domain W .
◮ (SELECT * P) G = P G
Example (SELECT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(SELECT {?N, ?E} ((?X, name, ?N) AND (?X, email, ?E))) G
Example (SELECT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(SELECT {?N, ?E} ((?X, name, ?N) AND (?X, email, ?E))) G
SELECT{?N, ?E}
?X ?N ?E
µ1 R1 john J@ed.ex
µ2 R3 ringo R@ed.ex
Example (SELECT)
G :
(R1, name, john) (R2, name, paul) (R3, name, ringo)
(R1, email, J@ed.ex) (R3, email, R@ed.ex)
(R3, webPage, www.ringo.com)
(SELECT {?N, ?E} ((?X, name, ?N) AND (?X, email, ?E))) G
SELECT{?N, ?E}
?X ?N ?E
µ1 R1 john J@ed.ex
µ2 R3 ringo R@ed.ex
?N ?E
µ1|{?N,?E}
john J@ed.ex
µ2|{?N,?E}
ringo R@ed.ex
SPARQL 1.1 introduces several new features
In SPARQL 1.1:
◮ (SELECT W P) can be used as any other graph pattern
(ASK P) can be used as a constraint in FILTER
⇒ sub-queries
◮ Aggregations via ORDER-BY plus COUNT, SUM, etc.
◮ More important for us: Federation and Navigation
Outline
Basics of SPARQL
Syntax and Semantics of SPARQL 1.0
What is new in SPARQL 1.1
Federation: SERVICE operator
Syntax and Semantics
Evaluation of SERVICE queries
Navigation: Property Paths
Navigating graphs with regular expressions
The history of paths (in SPARQL 1.1 specification)
Evaluation procedures and complexity
SPARQL endpoints and the SERVICE operator
◮ SPARQL endpoints are services that accepts HTTP requests
asking for SPARQL queries
◮ http://www.w3.org/wiki/SparqlEndpoints lists some
◮ SPARQL 1.1 allows to mix local and remote queries to
endpoints via the SERVICE operator
SPARQL endpoints and the SERVICE operator
◮ SPARQL endpoints are services that accepts HTTP requests
asking for SPARQL queries
◮ http://www.w3.org/wiki/SparqlEndpoints lists some
◮ SPARQL 1.1 allows to mix local and remote queries to
endpoints via the SERVICE operator
?SELECT ?Author
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
}
SPARQL endpoints and the SERVICE operator
◮ SPARQL endpoints are services that accepts HTTP requests
asking for SPARQL queries
◮ http://www.w3.org/wiki/SparqlEndpoints lists some
◮ SPARQL 1.1 allows to mix local and remote queries to
endpoints via the SERVICE operator
?SELECT ?Author ?Place
WHERE
{
?Paper dc:creator ?Author .
?Paper dct:partOf ?Conf .
?Conf swrc:series conf:iswc .
SERVICE dbpedia:service
{ ?Author dbpedia:birthPlace ?Place . }
}
Syntax of SERVICE
Syntax
If c ∈ U, ?X is a variable, and P is a graph pattern then
Syntax of SERVICE
Syntax
If c ∈ U, ?X is a variable, and P is a graph pattern then
◮ (SERVICE c P)
◮ (SERVICE ?X P)
are SERVICE graph patterns.
Syntax of SERVICE
Syntax
If c ∈ U, ?X is a variable, and P is a graph pattern then
◮ (SERVICE c P)
◮ (SERVICE ?X P)
are SERVICE graph patterns.
◮ SERVICE graph pattern are included recursively in the algebra
of graph patterns
◮ Variables are included in the SERVICE syntax to allow
dynamic choosing of endpoints
Semantics of SERVICE
We assume the existence of a partial function
ep : U → RDF graphs
intuitively, ep(c) is the (default) graph associated
to the SPARQL endpoint defined by c
Semantics of SERVICE
We assume the existence of a partial function
ep : U → RDF graphs
intuitively, ep(c) is the (default) graph associated
to the SPARQL endpoint defined by c
Definition
Given an RDF graph G, an element c ∈ U, and a graph pattern P
(SERVICE c P) G =
Semantics of SERVICE
We assume the existence of a partial function
ep : U → RDF graphs
intuitively, ep(c) is the (default) graph associated
to the SPARQL endpoint defined by c
Definition
Given an RDF graph G, an element c ∈ U, and a graph pattern P
(SERVICE c P) G =
P ep(c) if c ∈ dom(ep)
Semantics of SERVICE
We assume the existence of a partial function
ep : U → RDF graphs
intuitively, ep(c) is the (default) graph associated
to the SPARQL endpoint defined by c
Definition
Given an RDF graph G, an element c ∈ U, and a graph pattern P
(SERVICE c P) G =
P ep(c) if c ∈ dom(ep)
{µ∅} otherwise
Semantics of SERVICE
We assume the existence of a partial function
ep : U → RDF graphs
intuitively, ep(c) is the (default) graph associated
to the SPARQL endpoint defined by c
Definition
Given an RDF graph G, an element c ∈ U, and a graph pattern P
(SERVICE c P) G =
P ep(c) if c ∈ dom(ep)
{µ∅} otherwise
but the interesting case is when the endpoint is a variable...
Semantics of SERVICE
Definition
Given an RDF graph G, a variable ?X, and a graph pattern P
(SERVICE ?X P) G =
Semantics of SERVICE
Definition
Given an RDF graph G, a variable ?X, and a graph pattern P
(SERVICE ?X P) G = P ep(c)
Semantics of SERVICE
Definition
Given an RDF graph G, a variable ?X, and a graph pattern P
(SERVICE ?X P) G = P ep(c) ⋊⋉ {{?X → c}}
Semantics of SERVICE
Definition
Given an RDF graph G, a variable ?X, and a graph pattern P
(SERVICE ?X P) G =
c∈dom(ep)
P ep(c) ⋊⋉ {{?X → c}}
Semantics of SERVICE
Definition
Given an RDF graph G, a variable ?X, and a graph pattern P
(SERVICE ?X P) G =
c∈dom(ep)
P ep(c) ⋊⋉ {{?X → c}}
Semantics of SERVICE
Definition
Given an RDF graph G, a variable ?X, and a graph pattern P
(SERVICE ?X P) G =
c∈dom(ep)
P ep(c) ⋊⋉ {{?X → c}}
can we effectively evaluate a SERVICE query?
SERVICE example
Some queries/patterns can be safely evaluated:
SERVICE example
Some queries/patterns can be safely evaluated:
◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)))
SERVICE example
Some queries/patterns can be safely evaluated:
◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)))
◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y ))
SERVICE example
Some queries/patterns can be safely evaluated:
◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)))
◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y ))
In both cases, the SERVICE variable is controlled by the data in
the initial graph
SERVICE example
Some queries/patterns can be safely evaluated:
◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)))
◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y ))
In both cases, the SERVICE variable is controlled by the data in
the initial graph
There is natural way of evaluating the query:
SERVICE example
Some queries/patterns can be safely evaluated:
◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)))
◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y ))
In both cases, the SERVICE variable is controlled by the data in
the initial graph
There is natural way of evaluating the query:
◮ Evaluate first (?X, service address, ?Y )
SERVICE example
Some queries/patterns can be safely evaluated:
◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)))
◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y ))
In both cases, the SERVICE variable is controlled by the data in
the initial graph
There is natural way of evaluating the query:
◮ Evaluate first (?X, service address, ?Y )
◮ only for the obtained mappings evaluate the SERVICE on
endpoint µ(?Y )
Unbounded SERVICE queries
What about this pattern?
(?X, service description, ?Z) UNION (?X, service address, ?Y )
AND (SERVICE ?Y (?N, email, ?E))
Unbounded SERVICE queries
What about this pattern?
(?X, service description, ?Z) UNION (?X, service address, ?Y )
AND (SERVICE ?Y (?N, email, ?E))
:-(
Unbounded SERVICE queries
What about this pattern?
(?X, service description, ?Z) UNION (?X, service address, ?Y )
AND (SERVICE ?Y (?N, email, ?E))
:-(
Idea: force the SERVICE variable to be bound in every solution.
Boundedness
Definition
A variable ?X is bound in graph pattern P if
for every graph G and every µ ∈ P G it holds that:
◮ ?X ∈ dom(µ), and
◮ µ(?X) is a value in G.
Boundedness
Definition
A variable ?X is bound in graph pattern P if
for every graph G and every µ ∈ P G it holds that:
◮ ?X ∈ dom(µ), and
◮ µ(?X) is a value in G.
We only need a procedure to ensure that every variable mentioned
in SERVICE is bounded!
Boundedness
Definition
A variable ?X is bound in graph pattern P if
for every graph G and every µ ∈ P G it holds that:
◮ ?X ∈ dom(µ), and
◮ µ(?X) is a value in G.
We only need a procedure to ensure that every variable mentioned
in SERVICE is bounded! Oh wait...
Boundedness
Definition
A variable ?X is bound in graph pattern P if
for every graph G and every µ ∈ P G it holds that:
◮ ?X ∈ dom(µ), and
◮ µ(?X) is a value in G.
We only need a procedure to ensure that every variable mentioned
in SERVICE is bounded! Oh wait...
Theorem (BAC11)
The problem of checking if a variable is bound in a graph pattern
is undecidable.
Boundedness
Definition
A variable ?X is bound in graph pattern P if
for every graph G and every µ ∈ P G it holds that:
◮ ?X ∈ dom(µ), and
◮ µ(?X) is a value in G.
We only need a procedure to ensure that every variable mentioned
in SERVICE is bounded! Oh wait...
Theorem (BAC11)
The problem of checking if a variable is bound in a graph pattern
is undecidable.
:-(
Undecidability of boundedness
Proof idea
◮ From [AG08]: it is undecidable to check if a SPARQL pattern
P is satisfiable (if P G = ∅ for some G).
Undecidability of boundedness
Proof idea
◮ From [AG08]: it is undecidable to check if a SPARQL pattern
P is satisfiable (if P G = ∅ for some G).
◮ Assume P does not mention ?X, and let
Q = ((?X, ?Y , ?Z) UNION P):
Undecidability of boundedness
Proof idea
◮ From [AG08]: it is undecidable to check if a SPARQL pattern
P is satisfiable (if P G = ∅ for some G).
◮ Assume P does not mention ?X, and let
Q = ((?X, ?Y , ?Z) UNION P):
?X is bound in Q iff P is not satisfiable.
Undecidability of boundedness
Proof idea
◮ From [AG08]: it is undecidable to check if a SPARQL pattern
P is satisfiable (if P G = ∅ for some G).
◮ Assume P does not mention ?X, and let
Q = ((?X, ?Y , ?Z) UNION P):
?X is bound in Q iff P is not satisfiable.
Undecidable: the SPARQL engine cannot check for boundedness...
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
◮ if P = (SELECT W P1), then
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)
◮ if P is a SERVICE pattern, then
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)
◮ if P is a SERVICE pattern, then SB(P) = ∅
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)
◮ if P is a SERVICE pattern, then SB(P) = ∅
Proposition (BAC11)
If ?X ∈ SB(P) then ?X is bound in P.
A sufficient condition for boundedness
Definition [BAC11]
The set of strongly bounded variables in a pattern P, denoted by
SB(P) is defined recursively as follows.
◮ if P is a triple pattern t, then SB(P) = var(t)
◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)
◮ if P is a SERVICE pattern, then SB(P) = ∅
Proposition (BAC11)
If ?X ∈ SB(P) then ?X is bound in P.
(SB(P) can be efficiently computed)
(Strongly) boundedness is not enough
Are we happy now?
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E)) .
(Strongly) boundedness is not enough
Are we happy now?
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E)) .
◮ ?Y is not bound in P1 (nor strongly bound)
(Strongly) boundedness is not enough
Are we happy now?
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E)) .
◮ ?Y is not bound in P1 (nor strongly bound)
◮ nevertheless there is a natural evaluation of this pattern
(Strongly) boundedness is not enough
Are we happy now?
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E)) .
◮ ?Y is not bound in P1 (nor strongly bound)
◮ nevertheless there is a natural evaluation of this pattern
(?Y is bounded in the important part!)
(Strongly) boundedness is not enough
Are we happy now?
(Strongly) boundedness is not enough
Are we happy now?
P2 = (?U1, related with, ?U2) AND
SERVICE ?U1 (?N, email, ?E) OPT
(SERVICE ?U2 (?N, phone, ?F)) .
(Strongly) boundedness is not enough
Are we happy now?
P2 = (?U1, related with, ?U2) AND
SERVICE ?U1 (?N, email, ?E) OPT
(SERVICE ?U2 (?N, phone, ?F)) .
◮ ?U1 and ?U2 are strongly bounded, but
(Strongly) boundedness is not enough
Are we happy now?
P2 = (?U1, related with, ?U2) AND
SERVICE ?U1 (?N, email, ?E) OPT
(SERVICE ?U2 (?N, phone, ?F)) .
◮ ?U1 and ?U2 are strongly bounded, but
◮ can we effectively evaluate this query?
(?U2 is unbounded in the important part!)
(Strongly) boundedness is not enough
Are we happy now?
P2 = (?U1, related with, ?U2) AND
SERVICE ?U1 (?N, email, ?E) OPT
(SERVICE ?U2 (?N, phone, ?F)) .
◮ ?U1 and ?U2 are strongly bounded, but
◮ can we effectively evaluate this query?
(?U2 is unbounded in the important part!)
we need to define what the important part is...
Parse tree of a pattern
We need first to formalize the tree of subexpressions of a pattern
(?Y , a, ?Z) UNION (?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))
Parse tree of a pattern
We need first to formalize the tree of subexpressions of a pattern
(?Y , a, ?Z) UNION (?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))
u6 : (?Y , a, ?Z)
u1 : ((?Y , a, ?Z) UNION ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))))
u2 : (?Y , a, ?Z) u3 : ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z)))
u4 : (?X, b, c) u5 : (SERVICE ?X (?Y , a, ?Z))
Parse tree of a pattern
We need first to formalize the tree of subexpressions of a pattern
(?Y , a, ?Z) UNION (?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))
u6 : (?Y , a, ?Z)
u1 : ((?Y , a, ?Z) UNION ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))))
u2 : (?Y , a, ?Z) u3 : ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z)))
u4 : (?X, b, c) u5 : (SERVICE ?X (?Y , a, ?Z))
We denote by T (P) the tree of subexpressions of P.
Service-boundedness
Notion of boundedness considering the important part
Definition
A pattern P is service-bound if for every node u in T (P) with label
(SERVICE ?X P1) it holds that:
Service-boundedness
Notion of boundedness considering the important part
Definition
A pattern P is service-bound if for every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is
bound in P2, and
Service-boundedness
Notion of boundedness considering the important part
Definition
A pattern P is service-bound if for every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is
bound in P2, and
2. P1 is service-bound.
Service-boundedness
Notion of boundedness considering the important part
Definition
A pattern P is service-bound if for every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is
bound in P2, and
2. P1 is service-bound.
Unfortunately... (and not so surprisingly anymore)
Service-boundedness
Notion of boundedness considering the important part
Definition
A pattern P is service-bound if for every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is
bound in P2, and
2. P1 is service-bound.
Unfortunately... (and not so surprisingly anymore)
Theorem (BAC11)
Checking if a pattern is service-bound is undecidable.
Service-boundedness
Notion of boundedness considering the important part
Definition
A pattern P is service-bound if for every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is
bound in P2, and
2. P1 is service-bound.
Unfortunately... (and not so surprisingly anymore)
Theorem (BAC11)
Checking if a pattern is service-bound is undecidable.
Exercise: prove the theorem.
Service-safeness
We need a decidable sufficient condition.
Idea: replace bound by SB(·) in the previous notion.
Service-safeness
We need a decidable sufficient condition.
Idea: replace bound by SB(·) in the previous notion.
Definition
A pattern P is service-safe if of every node u in T (P) with label
(SERVICE ?X P1) it holds that:
Service-safeness
We need a decidable sufficient condition.
Idea: replace bound by SB(·) in the previous notion.
Definition
A pattern P is service-safe if of every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t.
?X ∈ SB(P2), and
Service-safeness
We need a decidable sufficient condition.
Idea: replace bound by SB(·) in the previous notion.
Definition
A pattern P is service-safe if of every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t.
?X ∈ SB(P2), and
2. P1 is service-safe.
Service-safeness
We need a decidable sufficient condition.
Idea: replace bound by SB(·) in the previous notion.
Definition
A pattern P is service-safe if of every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t.
?X ∈ SB(P2), and
2. P1 is service-safe.
Proposition (BAC11)
If P is service-safe the it is service-bound.
Service-safeness
We need a decidable sufficient condition.
Idea: replace bound by SB(·) in the previous notion.
Definition
A pattern P is service-safe if of every node u in T (P) with label
(SERVICE ?X P1) it holds that:
1. There exists an ancestor of u in T (P) with label P2 s.t.
?X ∈ SB(P2), and
2. P1 is service-safe.
Proposition (BAC11)
If P is service-safe the it is service-bound.
We finally have our desired decidable condition.
Applying service-safeness
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E))
Applying service-safeness
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E))
is service-safe.
Applying service-safeness
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E))
is service-safe.
P2 = (?U1, related with, ?U2) AND
SERVICE ?U1 (?N, email, ?E) OPT
(SERVICE ?U2 (?N, phone, ?F))
Applying service-safeness
P1 = (?X, service description, ?Z) UNION
(?X, service address, ?Y ) AND
(SERVICE ?Y (?N, email, ?E))
is service-safe.
P2 = (?U1, related with, ?U2) AND
SERVICE ?U1 (?N, email, ?E) OPT
(SERVICE ?U2 (?N, phone, ?F))
is not service-safe.
Closing words on SERVICE
SERVICE: several interesting research question
◮ Optimization (rewriting, reordering, containment?)
◮ Cost analysis: in practice we cannot query all that we may
want
◮ Different endpoints provide different completeness certificates
(regarding the data they return)
◮ Several implementations challenges
Outline
Basics of SPARQL
Syntax and Semantics of SPARQL 1.0
What is new in SPARQL 1.1
Federation: SERVICE operator
Syntax and Semantics
Evaluation of SERVICE queries
Navigation: Property Paths
Navigating graphs with regular expressions
The history of paths (in SPARQL 1.1 specification)
Evaluation procedures and complexity
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
query: is city B reachable from A by a sequence of connections?
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
query: is city B reachable from A by a sequence of connections?
◮ Known fact: SPARQL 1.0 cannot express this query!
◮ Follows easily from locality of FO-logic
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
query: is city B reachable from A by a sequence of connections?
◮ Known fact: SPARQL 1.0 cannot express this query!
◮ Follows easily from locality of FO-logic
You (should) already know that Datalog can express this query.
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
query: is city B reachable from A by a sequence of connections?
◮ Known fact: SPARQL 1.0 cannot express this query!
◮ Follows easily from locality of FO-logic
You (should) already know that Datalog can express this query.
We can consider a new predicate reach and the program
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
query: is city B reachable from A by a sequence of connections?
◮ Known fact: SPARQL 1.0 cannot express this query!
◮ Follows easily from locality of FO-logic
You (should) already know that Datalog can express this query.
We can consider a new predicate reach and the program
(?X, reach, ?Y ) ← (?X, connected, ?Y )
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
query: is city B reachable from A by a sequence of connections?
◮ Known fact: SPARQL 1.0 cannot express this query!
◮ Follows easily from locality of FO-logic
You (should) already know that Datalog can express this query.
We can consider a new predicate reach and the program
(?X, reach, ?Y ) ← (?X, connected, ?Y )
(?X, reach, ?Y ) ← (?X, reach, ?Z), (?Z, connected, ?Y )
SPARQL 1.0 has very limited navigational capabilities
Assume a graph with cities and connections with RDF triples like:
(C1, connected, C2)
query: is city B reachable from A by a sequence of connections?
◮ Known fact: SPARQL 1.0 cannot express this query!
◮ Follows easily from locality of FO-logic
You (should) already know that Datalog can express this query.
We can consider a new predicate reach and the program
(?X, reach, ?Y ) ← (?X, connected, ?Y )
(?X, reach, ?Y ) ← (?X, reach, ?Z), (?Z, connected, ?Y )
but do we want to integrate Datalog and SPARQL?
SPARQL 1.1 provides an alternative way for navigating
URI2
:email
seba@puc.cl
:name
:email
:phone
:name
:friendOf juan@utexas.edu
URI1
Seba
446928888
Juan
Claudio
:name
:name
Maria
URI0
:friendOf
URI3
:friendOf
SPARQL 1.1 provides an alternative way for navigating
URI2
:email
seba@puc.cl
:name
:email
:phone
:name
:friendOf juan@utexas.edu
URI1
Seba
446928888
Juan
Claudio
:name
:name
Maria
URI0
:friendOf
URI3
:friendOf
SELECT ?X
WHERE
{
?X :friendOf ?Y .
?Y :name "Maria" .
}
SPARQL 1.1 provides an alternative way for navigating
URI2
:email
seba@puc.cl
:name
:email
:phone
:name
:friendOf juan@utexas.edu
URI1
Seba
446928888
Juan
Claudio
:name
:name
Maria
URI0
:friendOf
URI3
:friendOf
SELECT ?X
WHERE
{
?X :friendOf ?Y .
?Y :name "Maria" .
}
SPARQL 1.1 provides an alternative way for navigating
URI2
:email
seba@puc.cl
:name
:email
:phone
:name
:friendOf juan@utexas.edu
URI1
Seba
446928888
Juan
Claudio
:name
:name
Maria
URI0
:friendOf
URI3
:friendOf
SELECT ?X
WHERE
{
?X (:friendOf)* ?Y .
?Y :name "Maria" .
}
SPARQL 1.1 provides an alternative way for navigating
URI2
:email
seba@puc.cl
:name
:email
:phone
:name
:friendOf juan@utexas.edu
URI1
Seba
446928888
Juan
Claudio
:name
:name
Maria
URI0
:friendOf
URI3
:friendOf
SELECT ?X
WHERE
{
?X (:friendOf)* ?Y .
?Y :name "Maria" .
}
SPARQL 1.1 provides an alternative way for navigating
URI2
:email
seba@puc.cl
:name
:email
:phone
:name
:friendOf juan@utexas.edu
URI1
Seba
446928888
Juan
Claudio
:name
:name
Maria
URI0
:friendOf
URI3
:friendOf
SELECT ?X
WHERE
{
?X (:friendOf)* ?Y . ← SPARQL 1.1 property path
?Y :name "Maria" .
}
SPARQL 1.1 provides an alternative way for navigating
URI2
:email
seba@puc.cl
:name
:email
:phone
:name
:friendOf juan@utexas.edu
URI1
Seba
446928888
Juan
Claudio
:name
:name
Maria
URI0
:friendOf
URI3
:friendOf
SELECT ?X
WHERE
{
?X (:friendOf)* ?Y . ← SPARQL 1.1 property path
?Y :name "Maria" .
}
Idea: navigate RDF graphs using regular expressions
General navigation using regular expressions
Regular expressions define sets of strings using
◮ concatenation: /
◮ disjunction: |
◮ Kleene star: *
Example
Consider strings composed of symbols a and b
a/(b)*/a
defines strings of the form abbb · · · bbba.
General navigation using regular expressions
Regular expressions define sets of strings using
◮ concatenation: /
◮ disjunction: |
◮ Kleene star: *
Example
Consider strings composed of symbols a and b
a/(b)*/a
defines strings of the form abbb · · · bbba.
Idea: use regular expressions to define paths
◮ a path p satisfies a regular expression r if the string composed
of the sequence of edges of p satisfies expression r
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
{ ?X Paris }
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
{ ?X (:train|:plane)* Paris }
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
{ ?X (:train|:plane)*/:bus Paris }
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }
Exercise: cities that reach Paris with an even number of
connections
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }
Exercise: cities that reach Paris with an even number of
connections
Mixing regular expressions and SPARQL operators
gives interesting expressive power:
Persons in my professional network that attended the same school
Interesting navigational queries
◮ RDF graph with :father, :mother edges:
ancestors of John
{ John (:father|:mother)* ?X }
◮ Connections between cities via :train, :bus, :plane
Cities that reach Paris with exactly one :bus connection
{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }
Exercise: cities that reach Paris with an even number of
connections
Mixing regular expressions and SPARQL operators
gives interesting expressive power:
Persons in my professional network that attended the same school
{ ?X (:conn)* ?Y .
?X (:conn)* ?Z .
?Y :sameSchool ?Z }
As always, we need a (formal) semantics
◮ Regular expressions in SPARQL queries seem reasonable
◮ We need to agree in the meaning of these new queries
As always, we need a (formal) semantics
◮ Regular expressions in SPARQL queries seem reasonable
◮ We need to agree in the meaning of these new queries
A bit of history on W3C standardization of property paths:
◮ Mid 2010: W3C defines an informal semantics for paths
As always, we need a (formal) semantics
◮ Regular expressions in SPARQL queries seem reasonable
◮ We need to agree in the meaning of these new queries
A bit of history on W3C standardization of property paths:
◮ Mid 2010: W3C defines an informal semantics for paths
◮ Late 2010: several discussions on possible drawbacks on the
non-standard definition by the W3C
As always, we need a (formal) semantics
◮ Regular expressions in SPARQL queries seem reasonable
◮ We need to agree in the meaning of these new queries
A bit of history on W3C standardization of property paths:
◮ Mid 2010: W3C defines an informal semantics for paths
◮ Late 2010: several discussions on possible drawbacks on the
non-standard definition by the W3C
◮ Early 2011: first formal semantics by the W3C
As always, we need a (formal) semantics
◮ Regular expressions in SPARQL queries seem reasonable
◮ We need to agree in the meaning of these new queries
A bit of history on W3C standardization of property paths:
◮ Mid 2010: W3C defines an informal semantics for paths
◮ Late 2010: several discussions on possible drawbacks on the
non-standard definition by the W3C
◮ Early 2011: first formal semantics by the W3C
◮ Late 2011: empirical and theoretical study on SPARQL 1.1
property paths showing unfeasibility of evaluation
As always, we need a (formal) semantics
◮ Regular expressions in SPARQL queries seem reasonable
◮ We need to agree in the meaning of these new queries
A bit of history on W3C standardization of property paths:
◮ Mid 2010: W3C defines an informal semantics for paths
◮ Late 2010: several discussions on possible drawbacks on the
non-standard definition by the W3C
◮ Early 2011: first formal semantics by the W3C
◮ Late 2011: empirical and theoretical study on SPARQL 1.1
property paths showing unfeasibility of evaluation
◮ Mid 2012: semantics change to overcome the raised issues
As always, we need a (formal) semantics
◮ Regular expressions in SPARQL queries seem reasonable
◮ We need to agree in the meaning of these new queries
A bit of history on W3C standardization of property paths:
◮ Mid 2010: W3C defines an informal semantics for paths
◮ Late 2010: several discussions on possible drawbacks on the
non-standard definition by the W3C
◮ Early 2011: first formal semantics by the W3C
◮ Late 2011: empirical and theoretical study on SPARQL 1.1
property paths showing unfeasibility of evaluation
◮ Mid 2012: semantics change to overcome the raised issues
The following experimental study is based on [ACP12].
SPARQL 1.1 implementations
had a poor performance
Data:
◮ cliques (complete graphs) of different size
◮ from 2 nodes (87 bytes) to 13 nodes (970 bytes)
:p
:a0
:a1
:a3
:p
:p
:p
:p
:p
:a2
RDF clique with 4 nodes (127 bytes)
SPARQL 1.1 implementations
had a poor performance
1
10
100
1000
2 4 6 8 10 12 14 16
ARQ
+ + + + + + +
+
+
+
+
+
RDFQ
× × × ×
×
×
×
×
×
×
KGram
∗ ∗ ∗ ∗ ∗
∗
∗
∗
∗
∗
Sesame
[ACP12]
SELECT * WHERE { :a0 (:p)* :a1 }
Poor performance with real Web data of small size
Data:
◮ Social Network data given by foaf:knows links
◮ Crawled from Axel Polleres’ foaf document (3 steps)
◮ Different documents, deleting some nodes
foaf:knows
axel:me
ivan:me
bizer:chris
richard:cygri
...
· · ·
andreas:ah
· · ·
· · ·
Poor performance with real Web data of small size
SELECT * WHERE { axel:me (foaf:knows)* ?x }
Poor performance with real Web data of small size
SELECT * WHERE { axel:me (foaf:knows)* ?x }
Input ARQ RDFQ Kgram Sesame
9.2KB 5.13 75.70 313.37 –
10.9KB 8.20 325.83 – –
11.4KB 65.87 – – –
13.2KB 292.43 – – –
14.8KB – – – –
17.2KB – – – –
20.5KB – – – –
25.8KB – – – – [ACP12]
(time in seconds, timeout = 1hr)
Poor performance with real Web data of small size
SELECT * WHERE { axel:me (foaf:knows)* ?x }
Input ARQ RDFQ Kgram Sesame
9.2KB 5.13 75.70 313.37 –
10.9KB 8.20 325.83 – –
11.4KB 65.87 – – –
13.2KB 292.43 – – –
14.8KB – – – –
17.2KB – – – –
20.5KB – – – –
25.8KB – – – – [ACP12]
(time in seconds, timeout = 1hr)
Is this a problem of these particular implementations?
This is a problem of the specification
[ACP12]
Any implementation that follows SPARQL 1.1 standard
(as of January 2012) is doomed to show the same behavior
This is a problem of the specification
[ACP12]
Any implementation that follows SPARQL 1.1 standard
(as of January 2012) is doomed to show the same behavior
The main sources of complexity is counting
This is a problem of the specification
[ACP12]
Any implementation that follows SPARQL 1.1 standard
(as of January 2012) is doomed to show the same behavior
The main sources of complexity is counting
Impact on W3C standard:
◮ Standard semantics of SPARQL 1.1 property paths
was changed in July 2012 to overcome these issues
SPARQL 1.1 property paths match regular expressions
but also count
:a
:b
:c
:p
:p
:p
:p
:d
SELECT ?X
WHERE { :a (:p)* ?X }
SPARQL 1.1 property paths match regular expressions
but also count
:a
:b
:c
:p
:p
:p
:p
:d
SELECT ?X
WHERE { :a (:p)* ?X }
?X
:a
:b
:c
:d
:d
SPARQL 1.1 property paths match regular expressions
but also count
:a
:b
:c
:p
:p
:p
:p
:d
SELECT ?X
WHERE { :a (:p)* ?X }
?X
:a
:b
:c
:d
:d
SPARQL 1.1 property paths match regular expressions
but also count
:p:a
:b
:c
:p
:p
:p
:p
:d
SELECT ?X
WHERE { :a (:p)* ?X }
?X
:a
:b
:c
:d
:d
SPARQL 1.1 property paths match regular expressions
but also count
:p:a
:b
:c
:p
:p
:p
:p
:d
SELECT ?X
WHERE { :a (:p)* ?X }
?X
:a
:b
:c
:d
:d
:c
:d
SPARQL 1.1 property paths match regular expressions
but also count
:p:a
:b
:c
:p
:p
:p
:p
:d
SELECT ?X
WHERE { :a (:p)* ?X }
?X
:a
:b
:c
:d
:d
:c
:d
But what if we have cycles?
SPARQL 1.1 document provides a special procedure
to handle cycles (and make the count)
Evaluation of path*
“the algorithm extends the multiset of results by one application of path.
If a node has been visited for path, it is not a candidate for another step.
A node can be visited multiple times if different paths visit it.”
SPARQL 1.1 Last Call (Jan 2012)
SPARQL 1.1 document provides a special procedure
to handle cycles (and make the count)
Evaluation of path*
“the algorithm extends the multiset of results by one application of path.
If a node has been visited for path, it is not a candidate for another step.
A node can be visited multiple times if different paths visit it.”
SPARQL 1.1 Last Call (Jan 2012)
◮ W3C document provides a procedure (ArbitraryLengthPath)
◮ This procedure was formalized in [ACP12]
Counting the number of solutions...
Data: Clique of size n
{ :a0 (:p)* :a1 }
every solution is a copy of the empty mapping µ∅ (| | in ARQ)
Counting the number of solutions...
Data: Clique of size n
{ :a0 (:p)* :a1 }
n # Sol.
9 13,700
10 109,601
11 986,410
12 9,864,101
13 –
every solution is a copy of the empty mapping µ∅ (| | in ARQ)
Counting the number of solutions...
Data: Clique of size n
{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 }
n # Sol.
9 13,700
10 109,601
11 986,410
12 9,864,101
13 –
every solution is a copy of the empty mapping µ∅ (| | in ARQ)
Counting the number of solutions...
Data: Clique of size n
{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 }
n # Sol.
9 13,700
10 109,601
11 986,410
12 9,864,101
13 –
n # Sol
2 1
3 6
4 305
5 418,576
6 –
every solution is a copy of the empty mapping µ∅ (| | in ARQ)
Counting the number of solutions...
Data: Clique of size n
{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } { :a0 (((:p)*)*)* :a1 }
n # Sol.
9 13,700
10 109,601
11 986,410
12 9,864,101
13 –
n # Sol
2 1
3 6
4 305
5 418,576
6 –
every solution is a copy of the empty mapping µ∅ (| | in ARQ)
Counting the number of solutions...
Data: Clique of size n
{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } { :a0 (((:p)*)*)* :a1 }
n # Sol.
9 13,700
10 109,601
11 986,410
12 9,864,101
13 –
n # Sol
2 1
3 6
4 305
5 418,576
6 –
n # Sol.
2 1
3 42
4 –
every solution is a copy of the empty mapping µ∅ (| | in ARQ)
More on counting the number of solutions...
Data: foaf links crawled from the Web
{ axel:me (foaf:knows)* ?x }
More on counting the number of solutions...
Data: foaf links crawled from the Web
{ axel:me (foaf:knows)* ?x }
File # URIs # Sol. Output Size
9.2KB 38 29,817 2MB
10.9KB 43 122,631 8.4MB
11.4KB 47 1,739,331 120MB
13.2KB 52 8,511,943 587MB
14.8KB 54 – –
More on counting the number of solutions...
Data: foaf links crawled from the Web
{ axel:me (foaf:knows)* ?x }
File # URIs # Sol. Output Size
9.2KB 38 29,817 2MB
10.9KB 43 122,631 8.4MB
11.4KB 47 1,739,331 120MB
13.2KB 52 8,511,943 587MB
14.8KB 54 – –
What is really happening here?
More on counting the number of solutions...
Data: foaf links crawled from the Web
{ axel:me (foaf:knows)* ?x }
File # URIs # Sol. Output Size
9.2KB 38 29,817 2MB
10.9KB 43 122,631 8.4MB
11.4KB 47 1,739,331 120MB
13.2KB 52 8,511,943 587MB
14.8KB 54 – –
What is really happening here?
Theory can help!
A bit more on complexity classes...
Complexity can be measured by using counting-complexity classes
NP #P
Sat: is a propositional CountSat: how many assignments
formula satisfiable? satisfy a propositional formula?
A bit more on complexity classes...
Complexity can be measured by using counting-complexity classes
NP #P
Sat: is a propositional CountSat: how many assignments
formula satisfiable? satisfy a propositional formula?
Formally
A function f (·) on strings is in #P if there exists a
polynomial-time non-deterministic TM M such that
f (x) = number of accepting computations of M with input x
A bit more on complexity classes...
Complexity can be measured by using counting-complexity classes
NP #P
Sat: is a propositional CountSat: how many assignments
formula satisfiable? satisfy a propositional formula?
Formally
A function f (·) on strings is in #P if there exists a
polynomial-time non-deterministic TM M such that
f (x) = number of accepting computations of M with input x
◮ CountSat is #P-complete
Counting problem for property paths
CountW3C
Input: RDF graph G
Property path triple { :a path :b }
Output: Count the number of solutions of { :a path :b } over G
(according to the semantics proposed by W3C)
The complexity of property paths is intractable
Theorem (ACP12)
CountW3C is outside #P
The complexity of property paths is intractable
Theorem (ACP12)
CountW3C is outside #P
CountW3C is hard to solve even if P = NP
A doubly exponential lower bound for counting
◮ Let paths be a property path of the form
(· · · ((:p)*)*)· · ·)*
with s nested stars
A doubly exponential lower bound for counting
◮ Let paths be a property path of the form
(· · · ((:p)*)*)· · ·)*
with s nested stars
◮ Let Kn be a clique with n nodes
A doubly exponential lower bound for counting
◮ Let paths be a property path of the form
(· · · ((:p)*)*)· · ·)*
with s nested stars
◮ Let Kn be a clique with n nodes
◮ Let CountClique(s, n) be the number of solutions of
{ :a0 paths :a1 } over Kn
A doubly exponential lower bound for counting
◮ Let paths be a property path of the form
(· · · ((:p)*)*)· · ·)*
with s nested stars
◮ Let Kn be a clique with n nodes
◮ Let CountClique(s, n) be the number of solutions of
{ :a0 paths :a1 } over Kn
Lemma (ACP12)
CountClique(s, n) ≥ (n − 2)!(n−1)s−1
A doubly exponential lower bound for counting
◮ Let paths be a property path of the form
(· · · ((:p)*)*)· · ·)*
with s nested stars
◮ Let Kn be a clique with n nodes
◮ Let CountClique(s, n) be the number of solutions of
{ :a0 paths :a1 } over Kn
Lemma (ACP12)
CountClique(s, n) ≥ (n − 2)!(n−1)s−1
In [ACP12]: A recursive formula for calculating CountClique(s, n)
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 –
7 –
8 –
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 –
7 –
8 –
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 –
7 –
8 –
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 –
7 –
8 –
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 –
7 –
8 –
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 – ← 28 × 109
7 –
8 –
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 – ← 28 × 109
7 – ← 144 × 1015
8 –
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 – ← 28 × 109
7 – ← 144 × 1015
8 – ← 79 × 1024
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 – ← 28 × 109
7 – ← 144 × 1015
8 – ← 79 × 1024
79 Yottabytes for the answer over a file of 379 bytes
We can explain the experimental results
CountClique(s, n) allows to fill in the blanks
{ :a0 ((:p)*)* :a1 }
n # Sol.
2 1
3 6
4 305
5 418,576
6 – ← 28 × 109
7 – ← 144 × 1015
8 – ← 79 × 1024
79 Yottabytes for the answer over a file of 379 bytes
1 Yottabyte > the estimated capacity of all digital storage in the world
Data complexity of property path is still intractable
Common assumption in Databases:
◮ queries are much smaller than data sources
Data complexity of property path is still intractable
Common assumption in Databases:
◮ queries are much smaller than data sources
Data complexity
◮ measure the complexity considering the query fixed
Data complexity of property path is still intractable
Common assumption in Databases:
◮ queries are much smaller than data sources
Data complexity
◮ measure the complexity considering the query fixed
◮ Data complexity of SQL, XPath, SPARQL 1.0, etc.
are all polynomial
Data complexity of property path is still intractable
Common assumption in Databases:
◮ queries are much smaller than data sources
Data complexity
◮ measure the complexity considering the query fixed
◮ Data complexity of SQL, XPath, SPARQL 1.0, etc.
are all polynomial
Theorem (ACP12)
Data complexity of CountW3C is #P-complete
Data complexity of property path is still intractable
Common assumption in Databases:
◮ queries are much smaller than data sources
Data complexity
◮ measure the complexity considering the query fixed
◮ Data complexity of SQL, XPath, SPARQL 1.0, etc.
are all polynomial
Theorem (ACP12)
Data complexity of CountW3C is #P-complete
Corollary
SPARQL 1.1 query evaluation (as of January 2012)
is intractable in Data Complexity
Existential semantics: a possible alternative
Possible solution
Do not count
Just check whether there exists a path
satisfying the property path expression
Existential semantics: a possible alternative
Possible solution
Do not count
Just check whether there exists a path
satisfying the property path expression
Years of experiences (theory and practice) in:
◮ Graph Databases
◮ XML
◮ SPARQL 1.0 (PSPARQL, Gleen)
+ equivalent regular expressions giving equivalent results
Existential semantics: decision problems
Input: RDF graph G
Property path triple { :a path :b }
ExistsPath
Question: Is there a path from :a to :b in G satisfying
the regular expression path?
ExistsW3C
Question: Is the number of solutions of { :a path :b } over G
greater than 0 (according to W3C semantics)?
Evaluating existential paths is tractable
Theorem (well-known result)
ExistsPath can be solved in O(|G| × |path|)
Can be proved by using automata theory:
Evaluating existential paths is tractable
Theorem (well-known result)
ExistsPath can be solved in O(|G| × |path|)
Can be proved by using automata theory:
1. consider G as an NFA with :a initial state and :b final state
Evaluating existential paths is tractable
Theorem (well-known result)
ExistsPath can be solved in O(|G| × |path|)
Can be proved by using automata theory:
1. consider G as an NFA with :a initial state and :b final state
2. construct from path an NFA Apath
Evaluating existential paths is tractable
Theorem (well-known result)
ExistsPath can be solved in O(|G| × |path|)
Can be proved by using automata theory:
1. consider G as an NFA with :a initial state and :b final state
2. construct from path an NFA Apath
3. construct the product automaton G × Apath:
Evaluating existential paths is tractable
Theorem (well-known result)
ExistsPath can be solved in O(|G| × |path|)
Can be proved by using automata theory:
1. consider G as an NFA with :a initial state and :b final state
2. construct from path an NFA Apath
3. construct the product automaton G × Apath:
◮ whenever (x, r, y) ∈ G and (p, r, q) is a transition in Apath
add a transition ((x, p), r, (y, q)) to G × Apath
Evaluating existential paths is tractable
Theorem (well-known result)
ExistsPath can be solved in O(|G| × |path|)
Can be proved by using automata theory:
1. consider G as an NFA with :a initial state and :b final state
2. construct from path an NFA Apath
3. construct the product automaton G × Apath:
◮ whenever (x, r, y) ∈ G and (p, r, q) is a transition in Apath
add a transition ((x, p), r, (y, q)) to G × Apath
4. check if we can go from (:a, q0) to (:b, qf ) in G × Apath
Evaluating existential paths is tractable
Theorem (well-known result)
ExistsPath can be solved in O(|G| × |path|)
Can be proved by using automata theory:
1. consider G as an NFA with :a initial state and :b final state
2. construct from path an NFA Apath
3. construct the product automaton G × Apath:
◮ whenever (x, r, y) ∈ G and (p, r, q) is a transition in Apath
add a transition ((x, p), r, (y, q)) to G × Apath
4. check if we can go from (:a, q0) to (:b, qf ) in G × Apath
with q0 initial state of Apath and qf some final state of Apath
Relationship between ExistsPath and ExistsW3C
Theorem (ACP12)
ExistsPath and ExistsW3C are equivalent decision problems
Relationship between ExistsPath and ExistsW3C
Theorem (ACP12)
ExistsPath and ExistsW3C are equivalent decision problems
Corollary (ACP12)
ExistsW3C can be solved in O(|G| × |path|)
So there are possibilities for optimization
SPARQL includes an operator to eliminate duplicates (DISTINCT)
So there are possibilities for optimization
SPARQL includes an operator to eliminate duplicates (DISTINCT)
Corollary
Property path queries with SELECT DISTINCT
can be efficiently evaluated
So there are possibilities for optimization
SPARQL includes an operator to eliminate duplicates (DISTINCT)
Corollary
Property path queries with SELECT DISTINCT
can be efficiently evaluated
And we can also use DISTINCT over general queries
Theorem
SELECT DISTINCT SPARQL 1.1 queries are tractable in Data Complexity
SPARQL 1.1 implementations
do not take advantage of SELECT DISTINCT
SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }
Input ARQ RDFQ Kgram Sesame Psparql Gleen
SPARQL 1.1 implementations
do not take advantage of SELECT DISTINCT
SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }
Input ARQ RDFQ Kgram Sesame Psparql Gleen
SPARQL 1.1 implementations
do not take advantage of SELECT DISTINCT
SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }
Input ARQ RDFQ Kgram Sesame Psparql Gleen
9.2KB 2.24 47.31 2.37 – 0.29 1.39
10.9KB 2.60 204.95 6.43 – 0.30 1.32
11.4KB 6.88 3222.47 80.73 – 0.30 1.34
13.2KB 24.42 – 394.61 – 0.31 1.38
14.8KB – – – – 0.33 1.38
17.2KB – – – – 0.35 1.42
20.5KB – – – – 0.44 1.50
25.8KB – – – – 0.45 1.52
SPARQL 1.1 implementations
do not take advantage of SELECT DISTINCT
SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }
Input ARQ RDFQ Kgram Sesame Psparql Gleen
9.2KB 2.24 47.31 2.37 – 0.29 1.39
10.9KB 2.60 204.95 6.43 – 0.30 1.32
11.4KB 6.88 3222.47 80.73 – 0.30 1.34
13.2KB 24.42 – 394.61 – 0.31 1.38
14.8KB – – – – 0.33 1.38
17.2KB – – – – 0.35 1.42
20.5KB – – – – 0.44 1.50
25.8KB – – – – 0.45 1.52
Optimization possibilities can remain hidden
in a complicated specification
New semantics for property paths (July 2012)
◮ Paths constructed only from / and | should be counted
◮ As soon as * is used, duplicates are eliminated (from that part
of the expression)
New semantics for property paths (July 2012)
◮ Paths constructed only from / and | should be counted
◮ As soon as * is used, duplicates are eliminated (from that part
of the expression)
For example, consider path1, path2, and path3 not using * then
when evaluating
{ :a path1/(path2)*/path3 :b }
one should (intuitively):
New semantics for property paths (July 2012)
◮ Paths constructed only from / and | should be counted
◮ As soon as * is used, duplicates are eliminated (from that part
of the expression)
For example, consider path1, path2, and path3 not using * then
when evaluating
{ :a path1/(path2)*/path3 :b }
one should (intuitively):
◮ consider all the paths from :a to some intermediate :c1
satisfying path1
New semantics for property paths (July 2012)
◮ Paths constructed only from / and | should be counted
◮ As soon as * is used, duplicates are eliminated (from that part
of the expression)
For example, consider path1, path2, and path3 not using * then
when evaluating
{ :a path1/(path2)*/path3 :b }
one should (intuitively):
◮ consider all the paths from :a to some intermediate :c1
satisfying path1
◮ check if there exists :c2 reachable from :c1 following
(path2)*
New semantics for property paths (July 2012)
◮ Paths constructed only from / and | should be counted
◮ As soon as * is used, duplicates are eliminated (from that part
of the expression)
For example, consider path1, path2, and path3 not using * then
when evaluating
{ :a path1/(path2)*/path3 :b }
one should (intuitively):
◮ consider all the paths from :a to some intermediate :c1
satisfying path1
◮ check if there exists :c2 reachable from :c1 following
(path2)*
◮ consider all the paths from :c2 to :b satisfying path3
New semantics for property paths (July 2012)
◮ Paths constructed only from / and | should be counted
◮ As soon as * is used, duplicates are eliminated (from that part
of the expression)
For example, consider path1, path2, and path3 not using * then
when evaluating
{ :a path1/(path2)*/path3 :b }
one should (intuitively):
◮ consider all the paths from :a to some intermediate :c1
satisfying path1
◮ check if there exists :c2 reachable from :c1 following
(path2)*
◮ consider all the paths from :c2 to :b satisfying path3
◮ make the count (to produce copies)
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
◮ to have different operators for counting (e.g. . and ||)
so the user can decide.
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
◮ to have different operators for counting (e.g. . and ||)
so the user can decide.
◮ the honest approach: just make the count and output
infininty in the presence of cycles
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
◮ to have different operators for counting (e.g. . and ||)
so the user can decide.
◮ the honest approach: just make the count and output
infininty in the presence of cycles
◮ count the number of simple paths
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
◮ to have different operators for counting (e.g. . and ||)
so the user can decide.
◮ the honest approach: just make the count and output
infininty in the presence of cycles
◮ count the number of simple paths
◮ does someone from the audience have another in mind?
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
◮ to have different operators for counting (e.g. . and ||)
so the user can decide.
◮ the honest approach: just make the count and output
infininty in the presence of cycles
◮ count the number of simple paths
◮ does someone from the audience have another in mind?
In all the above cases you have to decide what exactly you count:
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
◮ to have different operators for counting (e.g. . and ||)
so the user can decide.
◮ the honest approach: just make the count and output
infininty in the presence of cycles
◮ count the number of simple paths
◮ does someone from the audience have another in mind?
In all the above cases you have to decide what exactly you count:
◮ the number of paths satisfying the expression?
◮ the number of ways that the expr can be satisfied? (W3C)
Is the new semantics the right one?
W3C definitely wants to count paths.
Are there more reasonable alternatives?
◮ to have different operators for counting (e.g. . and ||)
so the user can decide.
◮ the honest approach: just make the count and output
infininty in the presence of cycles
◮ count the number of simple paths
◮ does someone from the audience have another in mind?
In all the above cases you have to decide what exactly you count:
◮ the number of paths satisfying the expression?
◮ the number of ways that the expr can be satisfied? (W3C)
(they are not the same! consider for example (a|a) )
Several work to do on navigational queries for graphs!
Other lines of research with open questions:
◮ Nested regular expressions and nSPARQL [PAG09,BPR12]
◮ Queries that can output paths (e.g. ECRPQs [BLHW10])
◮ More complexity results on counting paths [LM12]
What is the right language (and the right semantics)
for navigating RDF graphs?
Outline
Basics of SPARQL
Syntax and Semantics of SPARQL 1.0
What is new in SPARQL 1.1
Federation: SERVICE operator
Syntax and Semantics
Evaluation of SERVICE queries
Navigation: Property Paths
Navigating graphs with regular expressions
The history of paths (in SPARQL 1.1 specification)
Evaluation procedures and complexity
Concluding remarks
◮ Federation and Navigation are fundamental features in
SPARQL 1.1
◮ They need formalization and (serious) study
Concluding remarks
◮ Federation and Navigation are fundamental features in
SPARQL 1.1
◮ They need formalization and (serious) study
◮ Do not runaway from Theory! it can really help (and has
helped) to understand the implications of design decisions
Federation and Navigation in SPARQL 1.1
Federation and Navigation in SPARQL 1.1
Federation and Navigation in SPARQL 1.1
Federation and Navigation in SPARQL 1.1

Weitere ähnliche Inhalte

Was ist angesagt?

From SQL to SPARQL
From SQL to SPARQLFrom SQL to SPARQL
From SQL to SPARQLGeorge Roth
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphsandyseaborne
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)andyseaborne
 
Twinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolTwinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolLeigh Dodds
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialAdonisDamian
 
Learning Commonalities in RDF
Learning Commonalities in RDFLearning Commonalities in RDF
Learning Commonalities in RDFSara EL HASSAD
 
A year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CA year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CIvan Herman
 
Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014
Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014 Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014
Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014 eswcsummerschool
 
New Concepts: Timespan and Place (March 2020)
New Concepts: Timespan and Place (March 2020)New Concepts: Timespan and Place (March 2020)
New Concepts: Timespan and Place (March 2020)ALAeLearningSolutions
 
Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02eswcsummerschool
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031kwangsub kim
 
A Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF ProcessingA Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF Processinglucianb
 
SPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine QueriesSPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine QueriesBasil Ell
 
[Master Thesis]: SPARQL Query Rewriting with Paths
[Master Thesis]: SPARQL Query Rewriting with Paths[Master Thesis]: SPARQL Query Rewriting with Paths
[Master Thesis]: SPARQL Query Rewriting with PathsAbdullah Abbas
 
Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]
Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]
Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]Goran S. Milovanovic
 

Was ist angesagt? (19)

From SQL to SPARQL
From SQL to SPARQLFrom SQL to SPARQL
From SQL to SPARQL
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)
 
Twinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolTwinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query Tool
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorial
 
Learning Commonalities in RDF
Learning Commonalities in RDFLearning Commonalities in RDF
Learning Commonalities in RDF
 
A year on the Semantic Web @ W3C
A year on the Semantic Web @ W3CA year on the Semantic Web @ W3C
A year on the Semantic Web @ W3C
 
Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014
Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014 Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014
Tutorial: SPARQL 1.0 - I. Fundulaki - ESWC SS 2014
 
4 sw architectures and sparql
4 sw architectures and sparql4 sw architectures and sparql
4 sw architectures and sparql
 
New Concepts: Timespan and Place (March 2020)
New Concepts: Timespan and Place (March 2020)New Concepts: Timespan and Place (March 2020)
New Concepts: Timespan and Place (March 2020)
 
Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031
 
A Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF ProcessingA Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF Processing
 
SPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine QueriesSPARQL Query Verbalization for Explaining Semantic Search Engine Queries
SPARQL Query Verbalization for Explaining Semantic Search Engine Queries
 
[Master Thesis]: SPARQL Query Rewriting with Paths
[Master Thesis]: SPARQL Query Rewriting with Paths[Master Thesis]: SPARQL Query Rewriting with Paths
[Master Thesis]: SPARQL Query Rewriting with Paths
 
Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]
Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]
Introduction to R for Data Science :: Session 5 [Data Structuring: Strings in R]
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
 
RDF, linked data and semantic web
RDF, linked data and semantic webRDF, linked data and semantic web
RDF, linked data and semantic web
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 

Ähnlich wie Federation and Navigation in SPARQL 1.1

Efficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF DatabasesEfficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF DatabasesAlexandra Roatiș
 
Learning Commonalities in RDF and SPARQL
Learning Commonalities in RDF and SPARQLLearning Commonalities in RDF and SPARQL
Learning Commonalities in RDF and SPARQLSara EL HASSAD
 
Programming in Vinyl (BayHac 2014)
Programming in Vinyl (BayHac 2014)Programming in Vinyl (BayHac 2014)
Programming in Vinyl (BayHac 2014)jonsterling
 
The Semantics of SPARQL
The Semantics of SPARQLThe Semantics of SPARQL
The Semantics of SPARQLOlaf Hartig
 
RSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF StreamsRSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF Streamskeski
 
SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)Thomas Francart
 
EKAW - Triple Pattern Fragments
EKAW - Triple Pattern FragmentsEKAW - Triple Pattern Fragments
EKAW - Triple Pattern FragmentsRuben Taelman
 
Towards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression DerivativesTowards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression DerivativesJose Emilio Labra Gayo
 
SPARQL Query Containment with ShEx Constraints
SPARQL Query Containment with ShEx ConstraintsSPARQL Query Containment with ShEx Constraints
SPARQL Query Containment with ShEx ConstraintsAbdullah Abbas
 
Validating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape ExpressionsValidating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape ExpressionsJose Emilio Labra Gayo
 
Federated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of DataFederated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of DataMuhammad Saleem
 
LarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC
 
Linking the world with Python and Semantics
Linking the world with Python and SemanticsLinking the world with Python and Semantics
Linking the world with Python and SemanticsTatiana Al-Chueyr
 

Ähnlich wie Federation and Navigation in SPARQL 1.1 (20)

SWT Lecture Session 3 - SPARQL
SWT Lecture Session 3 - SPARQLSWT Lecture Session 3 - SPARQL
SWT Lecture Session 3 - SPARQL
 
Incomplete Information in RDF
Incomplete Information in RDFIncomplete Information in RDF
Incomplete Information in RDF
 
Sparql
SparqlSparql
Sparql
 
Optimizing SPARQL Queries with SHACL.pdf
Optimizing SPARQL Queries with SHACL.pdfOptimizing SPARQL Queries with SHACL.pdf
Optimizing SPARQL Queries with SHACL.pdf
 
Efficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF DatabasesEfficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF Databases
 
Sparql
SparqlSparql
Sparql
 
Learning Commonalities in RDF and SPARQL
Learning Commonalities in RDF and SPARQLLearning Commonalities in RDF and SPARQL
Learning Commonalities in RDF and SPARQL
 
Programming in Vinyl (BayHac 2014)
Programming in Vinyl (BayHac 2014)Programming in Vinyl (BayHac 2014)
Programming in Vinyl (BayHac 2014)
 
The Semantics of SPARQL
The Semantics of SPARQLThe Semantics of SPARQL
The Semantics of SPARQL
 
RSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF StreamsRSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF Streams
 
SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)
 
EKAW - Triple Pattern Fragments
EKAW - Triple Pattern FragmentsEKAW - Triple Pattern Fragments
EKAW - Triple Pattern Fragments
 
Towards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression DerivativesTowards an RDF Validation Language based on Regular Expression Derivatives
Towards an RDF Validation Language based on Regular Expression Derivatives
 
SPARQL Query Containment with ShEx Constraints
SPARQL Query Containment with ShEx ConstraintsSPARQL Query Containment with ShEx Constraints
SPARQL Query Containment with ShEx Constraints
 
Validating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape ExpressionsValidating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape Expressions
 
Federated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of DataFederated SPARQL query processing over the Web of Data
Federated SPARQL query processing over the Web of Data
 
LarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data Model
 
Making Topicmaps SPARQL
Making Topicmaps SPARQLMaking Topicmaps SPARQL
Making Topicmaps SPARQL
 
SPARQL
SPARQLSPARQL
SPARQL
 
Linking the world with Python and Semantics
Linking the world with Python and SemanticsLinking the world with Python and Semantics
Linking the world with Python and Semantics
 

Mehr von net2-project

Random Manhattan Indexing
Random Manhattan IndexingRandom Manhattan Indexing
Random Manhattan Indexingnet2-project
 
Extracting Information for Context-aware Meeting Preparation
Extracting Information for Context-aware Meeting PreparationExtracting Information for Context-aware Meeting Preparation
Extracting Information for Context-aware Meeting Preparationnet2-project
 
Vector spaces for information extraction - Random Projection Example
Vector spaces for information extraction - Random Projection ExampleVector spaces for information extraction - Random Projection Example
Vector spaces for information extraction - Random Projection Examplenet2-project
 
Borders of Decidability in Verification of Data-Centric Dynamic Systems
Borders of Decidability in Verification of Data-Centric Dynamic SystemsBorders of Decidability in Verification of Data-Centric Dynamic Systems
Borders of Decidability in Verification of Data-Centric Dynamic Systemsnet2-project
 
Exchanging OWL 2 QL Knowledge Bases
Exchanging OWL 2 QL Knowledge BasesExchanging OWL 2 QL Knowledge Bases
Exchanging OWL 2 QL Knowledge Basesnet2-project
 
Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...
Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...
Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...net2-project
 
Extending DBpedia (LOD) using WikiTables
Extending DBpedia (LOD) using WikiTablesExtending DBpedia (LOD) using WikiTables
Extending DBpedia (LOD) using WikiTablesnet2-project
 
Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...
Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...
Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...net2-project
 
Managing Social Communities
Managing Social CommunitiesManaging Social Communities
Managing Social Communitiesnet2-project
 
Data Exchange over RDF
Data Exchange over RDFData Exchange over RDF
Data Exchange over RDFnet2-project
 
Exchanging more than Complete Data
Exchanging more than Complete DataExchanging more than Complete Data
Exchanging more than Complete Datanet2-project
 
Exchanging More than Complete Data
Exchanging More than Complete DataExchanging More than Complete Data
Exchanging More than Complete Datanet2-project
 
Exchanging More than Complete Data
Exchanging More than Complete DataExchanging More than Complete Data
Exchanging More than Complete Datanet2-project
 
Answer-set programming
Answer-set programmingAnswer-set programming
Answer-set programmingnet2-project
 
Evolving web, evolving search
Evolving web, evolving searchEvolving web, evolving search
Evolving web, evolving searchnet2-project
 
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)net2-project
 

Mehr von net2-project (17)

Random Manhattan Indexing
Random Manhattan IndexingRandom Manhattan Indexing
Random Manhattan Indexing
 
Extracting Information for Context-aware Meeting Preparation
Extracting Information for Context-aware Meeting PreparationExtracting Information for Context-aware Meeting Preparation
Extracting Information for Context-aware Meeting Preparation
 
Vector spaces for information extraction - Random Projection Example
Vector spaces for information extraction - Random Projection ExampleVector spaces for information extraction - Random Projection Example
Vector spaces for information extraction - Random Projection Example
 
Borders of Decidability in Verification of Data-Centric Dynamic Systems
Borders of Decidability in Verification of Data-Centric Dynamic SystemsBorders of Decidability in Verification of Data-Centric Dynamic Systems
Borders of Decidability in Verification of Data-Centric Dynamic Systems
 
Exchanging OWL 2 QL Knowledge Bases
Exchanging OWL 2 QL Knowledge BasesExchanging OWL 2 QL Knowledge Bases
Exchanging OWL 2 QL Knowledge Bases
 
Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...
Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...
Mining Semi-structured Data: Understanding Web-tables – Building a Taxonomy f...
 
Extending DBpedia (LOD) using WikiTables
Extending DBpedia (LOD) using WikiTablesExtending DBpedia (LOD) using WikiTables
Extending DBpedia (LOD) using WikiTables
 
Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...
Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...
Tailoring Temporal Description Logics for Reasoning over Temporal Conceptual ...
 
Managing Social Communities
Managing Social CommunitiesManaging Social Communities
Managing Social Communities
 
Data Exchange over RDF
Data Exchange over RDFData Exchange over RDF
Data Exchange over RDF
 
Exchanging more than Complete Data
Exchanging more than Complete DataExchanging more than Complete Data
Exchanging more than Complete Data
 
Exchanging More than Complete Data
Exchanging More than Complete DataExchanging More than Complete Data
Exchanging More than Complete Data
 
Exchanging More than Complete Data
Exchanging More than Complete DataExchanging More than Complete Data
Exchanging More than Complete Data
 
Answer-set programming
Answer-set programmingAnswer-set programming
Answer-set programming
 
Evolving web, evolving search
Evolving web, evolving searchEvolving web, evolving search
Evolving web, evolving search
 
XSPARQL Tutorial
XSPARQL TutorialXSPARQL Tutorial
XSPARQL Tutorial
 
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
 

Kürzlich hochgeladen

How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationRosabel UA
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsManeerUddin
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptshraddhaparab530
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 

Kürzlich hochgeladen (20)

How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translation
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture hons
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.ppt
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 

Federation and Navigation in SPARQL 1.1

  • 1. Federation and Navigation in SPARQL 1.1 Jorge P´erez Assistant Professor Department of Computer Science Universidad de Chile
  • 2. Outline Basics of SPARQL Syntax and Semantics of SPARQL 1.0 What is new in SPARQL 1.1 Federation: SERVICE operator Syntax and Semantics Evaluation of SERVICE queries Navigation: Property Paths Navigating graphs with regular expressions The history of paths (in SPARQL 1.1 specification) Evaluation procedures and complexity
  • 4. SPARQL query language for RDF RDF Graph: fmeza@utalca.cl :name :email :phone :name :friendOf rgarrido@utalca.cl:email Federico Meza 35-446928 Ruth Garrido URI2 URI1 RDF-triples: (URI2, :email, rgarrido@utalca.cl)
  • 5. SPARQL query language for RDF RDF Graph: fmeza@utalca.cl :name :email :phone :name :friendOf rgarrido@utalca.cl:email Federico Meza 35-446928 Ruth Garrido URI2 URI1 RDF-triples: (URI2, :email, rgarrido@utalca.cl) SPARQL Query: SELECT ?N WHERE { ?X :name ?N . }
  • 6. SPARQL query language for RDF RDF Graph: fmeza@utalca.cl :name :email :phone :name :friendOf rgarrido@utalca.cl:email Federico Meza 35-446928 Ruth Garrido URI2 URI1 RDF-triples: (URI2, :email, rgarrido@utalca.cl) SPARQL Query: SELECT ?N ?E WHERE { ?X :name ?N . ?X :email ?E . }
  • 7. SPARQL query language for RDF RDF Graph: fmeza@utalca.cl :name :email :phone :name :friendOf rgarrido@utalca.cl:email Federico Meza 35-446928 Ruth Garrido URI2 URI1 RDF-triples: (URI2, :email, rgarrido@utalca.cl) SPARQL Query: SELECT ?N ?E WHERE { ?X :name ?N . ?X :email ?E . ?X :friendOf ?Y . ?Y :name "Ruth Garrido" }
  • 8. An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :Jeffrey D. Ullman :Anish Das Sarma :Jennifer Widom inAMW:2009 "Schema Design for ..." dc:creator dc:creator dc:creator dct:partOf dc:title swrc:series conf:amw <http://purl.org/dc/terms/> : <http://dblp.l3s.de/d2r/resource/authors/> conf: <http://dblp.l3s.de/d2r/resource/conferences/> inAMW: <http://dblp.l3s.de/d2r/resource/publications/conf/amw/> swrc: <http://swrc.ontoware.org/ontology#> dc: dct: <http://purl.org/dc/elements/1.1/>
  • 9. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC
  • 10. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author
  • 11. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author WHERE { }
  • 12. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author WHERE { ?Paper dc:creator ?Author . }
  • 13. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . }
  • 14. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . }
  • 15. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . } A SPARQL query consists of a:
  • 16. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . } A SPARQL query consists of a: Head: Processing of the variables
  • 17. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC SELECT ?Author WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . } A SPARQL query consists of a: Head: Processing of the variables Body: Pattern matching expression
  • 18. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC, and their Web pages if this information is available: SELECT ?Author ?WebPage WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . OPTIONAL { ?Author foaf:homePage ?WebPage . } }
  • 19. SPARQL: A Simple RDF Query Language Example: Authors that have published in ISWC, and their Web pages if this information is available: SELECT ?Author ?WebPage WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . OPTIONAL { ?Author foaf:homePage ?WebPage . } }
  • 20. But things can become more complex... Interesting features of pattern matching on graphs SELECT ?X1 ?X2 ... { P1 . P2 }
  • 21. But things can become more complex... Interesting features of pattern matching on graphs ◮ Grouping SELECT ?X1 ?X2 ... {{ P1 . P2 } { P3 . P4 } }
  • 22. But things can become more complex... Interesting features of pattern matching on graphs ◮ Grouping ◮ Optional parts SELECT ?X1 ?X2 ... {{ P1 . P2 OPTIONAL { P5 } } { P3 . P4 OPTIONAL { P7 } } }
  • 23. But things can become more complex... Interesting features of pattern matching on graphs ◮ Grouping ◮ Optional parts ◮ Nesting SELECT ?X1 ?X2 ... {{ P1 . P2 OPTIONAL { P5 } } { P3 . P4 OPTIONAL { P7 OPTIONAL { P8 } } } }
  • 24. But things can become more complex... Interesting features of pattern matching on graphs ◮ Grouping ◮ Optional parts ◮ Nesting ◮ Union of patterns SELECT ?X1 ?X2 ... {{{ P1 . P2 OPTIONAL { P5 } } { P3 . P4 OPTIONAL { P7 OPTIONAL { P8 } } } } UNION { P9 }}
  • 25. But things can become more complex... Interesting features of pattern matching on graphs ◮ Grouping ◮ Optional parts ◮ Nesting ◮ Union of patterns ◮ Filtering ◮ ... ◮ + several new features in the upcoming version: federation, navigation SELECT ?X1 ?X2 ... {{{ P1 . P2 OPTIONAL { P5 } } { P3 . P4 OPTIONAL { P7 OPTIONAL { P8 } } } } UNION { P9 FILTER ( R ) }}
  • 26. But things can become more complex... Interesting features of pattern matching on graphs ◮ Grouping ◮ Optional parts ◮ Nesting ◮ Union of patterns ◮ Filtering ◮ ... ◮ + several new features in the upcoming version: federation, navigation SELECT ?X1 ?X2 ... {{{ P1 . P2 OPTIONAL { P5 } } { P3 . P4 OPTIONAL { P7 OPTIONAL { P8 } } } } UNION { P9 FILTER ( R ) }} What is the (formal) meaning of a general SPARQL query?
  • 27. Outline Basics of SPARQL Syntax and Semantics of SPARQL 1.0 What is new in SPARQL 1.1 Federation: SERVICE operator Syntax and Semantics Evaluation of SERVICE queries Navigation: Property Paths Navigating graphs with regular expressions The history of paths (in SPARQL 1.1 specification) Evaluation procedures and complexity
  • 28. RDF triples and graphs Subject Object Predicate LB U U UB U : set of URIs B : set of blank nodes L : set of literals
  • 29. RDF triples and graphs Subject Object Predicate LB U U UB U : set of URIs B : set of blank nodes L : set of literals (s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L) is called an RDF triple
  • 30. RDF triples and graphs Subject Object Predicate LB U U UB U : set of URIs B : set of blank nodes L : set of literals (s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L) is called an RDF triple A set of RDF triples is called an RDF graph
  • 31. RDF triples and graphs Subject Object Predicate LB U U UB U : set of URIs B : set of blank nodes L : set of literals (s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L) is called an RDF triple A set of RDF triples is called an RDF graph In this talk, we do not consider blank nodes ◮ (s, p, o) ∈ U × U × (U ∪ L) is called an RDF triple
  • 32. A standard algebraic syntax ◮ Triple patterns: just RDF triples + variables (from a set V ) ?X :name "john" (?X, name, john)
  • 33. A standard algebraic syntax ◮ Triple patterns: just RDF triples + variables (from a set V ) ?X :name "john" (?X, name, john) ◮ Graph patterns: full parenthesized algebra original SPARQL syntax algebraic syntax { P1 . P2 } ( P1 AND P2 )
  • 34. A standard algebraic syntax ◮ Triple patterns: just RDF triples + variables (from a set V ) ?X :name "john" (?X, name, john) ◮ Graph patterns: full parenthesized algebra original SPARQL syntax algebraic syntax { P1 . P2 } ( P1 AND P2 ) { P1 OPTIONAL { P2 }} ( P1 OPT P2 )
  • 35. A standard algebraic syntax ◮ Triple patterns: just RDF triples + variables (from a set V ) ?X :name "john" (?X, name, john) ◮ Graph patterns: full parenthesized algebra original SPARQL syntax algebraic syntax { P1 . P2 } ( P1 AND P2 ) { P1 OPTIONAL { P2 }} ( P1 OPT P2 ) { P1 } UNION { P2 } ( P1 UNION P2 )
  • 36. A standard algebraic syntax ◮ Triple patterns: just RDF triples + variables (from a set V ) ?X :name "john" (?X, name, john) ◮ Graph patterns: full parenthesized algebra original SPARQL syntax algebraic syntax { P1 . P2 } ( P1 AND P2 ) { P1 OPTIONAL { P2 }} ( P1 OPT P2 ) { P1 } UNION { P2 } ( P1 UNION P2 ) { P1 FILTER ( R ) } ( P1 FILTER R )
  • 37. A standard algebraic syntax (cont.) ◮ Explicit precedence/association Example { t1 t2 OPTIONAL { t3 } OPTIONAL { t4 } t5 } ( ( ( ( t1 AND t2 ) OPT t3 ) OPT t4 ) AND t5 )
  • 38. Mappings: building block for the semantics Definition A mapping is a partial function from variables to RDF terms µ : V → U ∪ L
  • 39. Mappings: building block for the semantics Definition A mapping is a partial function from variables to RDF terms µ : V → U ∪ L Given a mapping µ and a triple pattern t:
  • 40. Mappings: building block for the semantics Definition A mapping is a partial function from variables to RDF terms µ : V → U ∪ L Given a mapping µ and a triple pattern t: ◮ µ(t): triple obtained from t replacing variables according to µ
  • 41. Mappings: building block for the semantics Definition A mapping is a partial function from variables to RDF terms µ : V → U ∪ L Given a mapping µ and a triple pattern t: ◮ µ(t): triple obtained from t replacing variables according to µ Example
  • 42. Mappings: building block for the semantics Definition A mapping is a partial function from variables to RDF terms µ : V → U ∪ L Given a mapping µ and a triple pattern t: ◮ µ(t): triple obtained from t replacing variables according to µ Example µ = {?X → R1, ?Y → R2, ?Name → john}
  • 43. Mappings: building block for the semantics Definition A mapping is a partial function from variables to RDF terms µ : V → U ∪ L Given a mapping µ and a triple pattern t: ◮ µ(t): triple obtained from t replacing variables according to µ Example µ = {?X → R1, ?Y → R2, ?Name → john} t = (?X, name, ?Name)
  • 44. Mappings: building block for the semantics Definition A mapping is a partial function from variables to RDF terms µ : V → U ∪ L Given a mapping µ and a triple pattern t: ◮ µ(t): triple obtained from t replacing variables according to µ Example µ = {?X → R1, ?Y → R2, ?Name → john} t = (?X, name, ?Name) µ(t) = (R1, name, john)
  • 45. The semantics of triple patterns Definition The evaluation of triple patter t over a graph G, denoted by t G , is the set of all mappings µ such that:
  • 46. The semantics of triple patterns Definition The evaluation of triple patter t over a graph G, denoted by t G , is the set of all mappings µ such that: ◮ dom(µ) is exactly the set of variables occurring in t
  • 47. The semantics of triple patterns Definition The evaluation of triple patter t over a graph G, denoted by t G , is the set of all mappings µ such that: ◮ dom(µ) is exactly the set of variables occurring in t ◮ µ(t) ∈ G
  • 48. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (?X, name, ?N) G
  • 49. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (?X, name, ?N) G µ1 = {?X → R1, ?N → john} µ2 = {?X → R2, ?N → paul}
  • 50. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (?X, name, ?N) G µ1 = {?X → R1, ?N → john} µ2 = {?X → R2, ?N → paul} (?X, email, ?E) G
  • 51. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (?X, name, ?N) G µ1 = {?X → R1, ?N → john} µ2 = {?X → R2, ?N → paul} (?X, email, ?E) G µ = {?X → R1, ?E → J@ed.ex}
  • 52. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (?X, name, ?N) G ?X ?N µ1 R1 john µ2 R2 paul (?X, email, ?E) G ?X ?E µ R1 J@ed.ex
  • 53. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (R1, webPage, ?W ) G (R2, name, paul) G (R3, name, ringo) G
  • 54. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (R1, webPage, ?W ) G (R2, name, paul) G (R3, name, ringo) G
  • 55. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (R1, webPage, ?W ) G (R2, name, paul) G (R3, name, ringo) G
  • 56. Example G (R1, name, john) (R1, email, J@ed.ex) (R2, name, paul) (R1, webPage, ?W ) G (R2, name, paul) G µ∅ = { } (R3, name, ringo) G
  • 57. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables:
  • 58. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2).
  • 59. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2). µ1 ∪ µ2 is also a mapping.
  • 60. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2). µ1 ∪ µ2 is also a mapping. Example ?X ?Y ?U ?V µ1 R1 john µ2 R1 J@edu.ex µ3 P@edu.ex R2
  • 61. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2). µ1 ∪ µ2 is also a mapping. Example ?X ?Y ?U ?V µ1 R1 john µ2 R1 J@edu.ex µ3 P@edu.ex R2
  • 62. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2). µ1 ∪ µ2 is also a mapping. Example ?X ?Y ?U ?V µ1 R1 john µ2 R1 J@edu.ex µ3 P@edu.ex R2 µ1 ∪ µ2 R1 john J@edu.ex
  • 63. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2). µ1 ∪ µ2 is also a mapping. Example ?X ?Y ?U ?V µ1 R1 john µ2 R1 J@edu.ex µ3 P@edu.ex R2 µ1 ∪ µ2 R1 john J@edu.ex
  • 64. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2). µ1 ∪ µ2 is also a mapping. Example ?X ?Y ?U ?V µ1 R1 john µ2 R1 J@edu.ex µ3 P@edu.ex R2 µ1 ∪ µ2 R1 john J@edu.ex µ1 ∪ µ3 R1 john P@edu.ex R2
  • 65. Compatible mappings: mappings that can be merged. Definition The mappings µ1, µ2 are compatibles iff they agree in their shared variables: ◮ µ1(?X) = µ2(?X) for every ?X ∈ dom(µ1) ∩ dom(µ2). µ1 ∪ µ2 is also a mapping. Example ?X ?Y ?U ?V µ1 R1 john µ2 R1 J@edu.ex µ3 P@edu.ex R2 µ1 ∪ µ2 R1 john J@edu.ex µ1 ∪ µ3 R1 john P@edu.ex R2 µ∅ = { } is compatible with every mapping.
  • 66. Sets of mappings and operations Let M1 and M2 be sets of mappings: Definition Join: M1 ⋊⋉ M2 ◮ {µ1 ∪ µ2 | µ1 ∈ M1, µ2 ∈ M2, and µ1, µ2 are compatibles} ◮ extending mappings in M1 with compatible mappings in M2 will be used to define AND
  • 67. Sets of mappings and operations Let M1 and M2 be sets of mappings: Definition Join: M1 ⋊⋉ M2 ◮ {µ1 ∪ µ2 | µ1 ∈ M1, µ2 ∈ M2, and µ1, µ2 are compatibles} ◮ extending mappings in M1 with compatible mappings in M2 will be used to define AND Definition Union: M1 ∪ M2 ◮ {µ | µ ∈ M1 or µ ∈ M2} ◮ mappings in M1 plus mappings in M2 (the usual set union) will be used to define UNION
  • 68. Sets of mappings and operations Definition Difference: M1 M2 ◮ {µ ∈ M1 | for all µ′ ∈ M2, µ and µ′ are not compatibles} ◮ mappings in M1 that cannot be extended with mappings in M2
  • 69. Sets of mappings and operations Definition Difference: M1 M2 ◮ {µ ∈ M1 | for all µ′ ∈ M2, µ and µ′ are not compatibles} ◮ mappings in M1 that cannot be extended with mappings in M2 Definition Left outer join: M1 M2 = (M1 ⋊⋉ M2) ∪ (M1 M2) ◮ extension of mappings in M1 with compatible mappings in M2 ◮ plus the mappings in M1 that cannot be extended. will be used to define OPT
  • 70. Semantics of general graph patterns Definition Given a graph G the evaluation of a pattern is recursively defined the base case is the evaluation of a triple pattern.
  • 71. Semantics of general graph patterns Definition Given a graph G the evaluation of a pattern is recursively defined ◮ (P1 AND P2) G = P1 G ⋊⋉ P2 G the base case is the evaluation of a triple pattern.
  • 72. Semantics of general graph patterns Definition Given a graph G the evaluation of a pattern is recursively defined ◮ (P1 AND P2) G = P1 G ⋊⋉ P2 G ◮ (P1 UNION P2) G = P1 G ∪ P2 G the base case is the evaluation of a triple pattern.
  • 73. Semantics of general graph patterns Definition Given a graph G the evaluation of a pattern is recursively defined ◮ (P1 AND P2) G = P1 G ⋊⋉ P2 G ◮ (P1 UNION P2) G = P1 G ∪ P2 G ◮ (P1 OPT P2) G = P1 G P2 G the base case is the evaluation of a triple pattern.
  • 74. Example (AND) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) AND (?X, email, ?E)) G
  • 75. Example (AND) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) AND (?X, email, ?E)) G (?X, name, ?N) G ⋊⋉ (?X, email, ?E) G
  • 76. Example (AND) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) AND (?X, email, ?E)) G (?X, name, ?N) G ⋊⋉ (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo
  • 77. Example (AND) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) AND (?X, email, ?E)) G (?X, name, ?N) G ⋊⋉ (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ?X ?E µ4 R1 J@ed.ex µ5 R3 R@ed.ex
  • 78. Example (AND) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) AND (?X, email, ?E)) G (?X, name, ?N) G ⋊⋉ (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ⋊⋉ ?X ?E µ4 R1 J@ed.ex µ5 R3 R@ed.ex
  • 79. Example (AND) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) AND (?X, email, ?E)) G (?X, name, ?N) G ⋊⋉ (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ⋊⋉ ?X ?E µ4 R1 J@ed.ex µ5 R3 R@ed.ex ?X ?N ?E µ1 ∪ µ4 R1 john J@ed.ex µ3 ∪ µ5 R3 ringo R@ed.ex
  • 80. Example (OPT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) OPT (?X, email, ?E)) G
  • 81. Example (OPT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) OPT (?X, email, ?E)) G (?X, name, ?N) G (?X, email, ?E) G
  • 82. Example (OPT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) OPT (?X, email, ?E)) G (?X, name, ?N) G (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo
  • 83. Example (OPT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) OPT (?X, email, ?E)) G (?X, name, ?N) G (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ?X ?E µ4 R1 J@ed.ex µ5 R3 R@ed.ex
  • 84. Example (OPT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) OPT (?X, email, ?E)) G (?X, name, ?N) G (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ?X ?E µ4 R1 J@ed.ex µ5 R3 R@ed.ex
  • 85. Example (OPT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) OPT (?X, email, ?E)) G (?X, name, ?N) G (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ?X ?E µ4 R1 J@ed.ex µ5 R3 R@ed.ex ?X ?N ?E µ1 ∪ µ4 R1 john J@ed.ex µ3 ∪ µ5 R3 ringo R@ed.ex µ2 R2 paul
  • 86. Example (OPT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) OPT (?X, email, ?E)) G (?X, name, ?N) G (?X, email, ?E) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ?X ?E µ4 R1 J@ed.ex µ5 R3 R@ed.ex ?X ?N ?E µ1 ∪ µ4 R1 john J@ed.ex µ3 ∪ µ5 R3 ringo R@ed.ex µ2 R2 paul
  • 87. Example (UNION) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, email, ?Info) UNION (?X, webPage, ?Info)) G
  • 88. Example (UNION) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, email, ?Info) UNION (?X, webPage, ?Info)) G (?X, email, ?Info) G ∪ (?X, webPage, ?Info) G
  • 89. Example (UNION) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, email, ?Info) UNION (?X, webPage, ?Info)) G (?X, email, ?Info) G ∪ (?X, webPage, ?Info) G ?X ?Info µ1 R1 J@ed.ex µ2 R3 R@ed.ex
  • 90. Example (UNION) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, email, ?Info) UNION (?X, webPage, ?Info)) G (?X, email, ?Info) G ∪ (?X, webPage, ?Info) G ?X ?Info µ1 R1 J@ed.ex µ2 R3 R@ed.ex ?X ?Info µ3 R3 www.ringo.com
  • 91. Example (UNION) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, email, ?Info) UNION (?X, webPage, ?Info)) G (?X, email, ?Info) G ∪ (?X, webPage, ?Info) G ?X ?Info µ1 R1 J@ed.ex µ2 R3 R@ed.ex ∪ ?X ?Info µ3 R3 www.ringo.com
  • 92. Example (UNION) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, email, ?Info) UNION (?X, webPage, ?Info)) G (?X, email, ?Info) G ∪ (?X, webPage, ?Info) G ?X ?Info µ1 R1 J@ed.ex µ2 R3 R@ed.ex ∪ ?X ?Info µ3 R3 www.ringo.com ?X ?Info µ1 R1 J@ed.ex µ2 R3 R@ed.ex µ3 R3 www.ringo.com
  • 93. Boolean filter expressions (value constraints) In filter expressions we consider ◮ the equality = among variables and RDF terms ◮ a unary predicate bound ◮ boolean combinations (∧, ∨, ¬) A mapping µ satisfies ◮ ?X = c if µ(?X) = c ◮ ?X =?Y if µ(?X) = µ(?Y ) ◮ bound(?X) if µ is defined in ?X, i.e. ?X ∈ dom(µ)
  • 94. Satisfaction of value constraints ◮ If P is a graph pattern and R is a value constraint then (P FILTER R) is also a graph pattern.
  • 95. Satisfaction of value constraints ◮ If P is a graph pattern and R is a value constraint then (P FILTER R) is also a graph pattern. Definition Given a graph G ◮ (P FILTER R) G = {µ ∈ P G | µ satisfies R} i.e. mappings in the evaluation of P that satisfy R.
  • 96. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G
  • 97. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo
  • 98. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ?N = ringo ∨ ?N = paul
  • 99. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) ((?X, name, ?N) FILTER (?N = ringo ∨ ?N = paul)) G ?X ?N µ1 R1 john µ2 R2 paul µ3 R3 ringo ?N = ringo ∨ ?N = paul ?X ?N µ2 R2 paul µ3 R3 ringo
  • 100. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G
  • 101. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G ?X ?N ?E µ1 ∪ µ4 R1 john J@ed.ex µ3 ∪ µ5 R3 ringo R@ed.ex µ2 R2 paul
  • 102. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G ?X ?N ?E µ1 ∪ µ4 R1 john J@ed.ex µ3 ∪ µ5 R3 ringo R@ed.ex µ2 R2 paul ¬ bound(?E)
  • 103. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G ?X ?N ?E µ1 ∪ µ4 R1 john J@ed.ex µ3 ∪ µ5 R3 ringo R@ed.ex µ2 R2 paul ¬ bound(?E) ?X ?N µ2 R2 paul
  • 104. Example (FILTER) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (((?X, name, ?N) OPT (?X, email, ?E)) FILTER ¬ bound(?E)) G ?X ?N ?E µ1 ∪ µ4 R1 john J@ed.ex µ3 ∪ µ5 R3 ringo R@ed.ex µ2 R2 paul ¬ bound(?E) ?X ?N µ2 R2 paul (a non-monotonic query)
  • 105. Why do we need/want to formalize SPARQL A formalization is beneficial ◮ clarifying corner cases ◮ helping in the implementation process ◮ providing solid foundations (we can actually prove properties!)
  • 106. The evaluation decision problem Evaluation problem for SPARQL patterns Input: A mapping µ, an RDF graph G a graph pattern P Output: Is the mapping µ in the evaluation of pattern P over the RDF graph G
  • 110. Complexity of the evaluation problem Theorem (PAG09) For patterns using only the AND operator, the evaluation problem is in P
  • 111. Complexity of the evaluation problem Theorem (PAG09) For patterns using only the AND operator, the evaluation problem is in P Theorem (SML10) For patterns using AND and UNION operators, the evaluation problem is NP-complete.
  • 112. Complexity of the evaluation problem Theorem (PAG09) For patterns using only the AND operator, the evaluation problem is in P Theorem (SML10) For patterns using AND and UNION operators, the evaluation problem is NP-complete. Theorem (PAG09,SML10) For general patterns that include OPT operator, the evaluation problem is PSPACE-complete.
  • 113. Complexity of the evaluation problem Theorem (PAG09) For patterns using only the AND operator, the evaluation problem is in P Theorem (SML10) For patterns using AND and UNION operators, the evaluation problem is NP-complete. Theorem (PAG09,SML10) For general patterns that include OPT operator, the evaluation problem is PSPACE-complete. Good news: evaluation in P if the query is fixed (data complexity)
  • 114. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) if a variable occurs
  • 115. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) ↑ if a variable occurs inside B
  • 116. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) ↑ ↑ ↑ if a variable occurs inside B and anywhere outside the OPT,
  • 117. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) ↑ ⇑ ↑ ↑ if a variable occurs inside B and anywhere outside the OPT, then the variable must also occur inside A.
  • 118. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) ↑ ⇑ ↑ ↑ if a variable occurs inside B and anywhere outside the OPT, then the variable must also occur inside A. Example (?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john)
  • 119. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) ↑ ⇑ ↑ ↑ if a variable occurs inside B and anywhere outside the OPT, then the variable must also occur inside A. Example (?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john) ↑
  • 120. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) ↑ ⇑ ↑ ↑ if a variable occurs inside B and anywhere outside the OPT, then the variable must also occur inside A. Example (?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john) ↑ ↑
  • 121. Well–designed patterns Can we find a natural fragment with better complexity? Definition An AND-OPT pattern is well–designed iff for every OPT in the pattern ( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · ) ↑ ⇑ ↑ ↑ if a variable occurs inside B and anywhere outside the OPT, then the variable must also occur inside A. Example (?Y , name, paul) OPT (?X, email, ?Z) AND (?X, name, john) ↑ ↑
  • 122. A bit more on complexity... PSPACE NP P
  • 123. A bit more on complexity... PSPACE coNPNP P
  • 124. Evaluation of well-designed patterns is in coNP-complete Theorem (PAG09) For AND-OPT well–designed graph patterns the evaluation problem is coNP-complete
  • 125. Evaluation of well-designed patterns is in coNP-complete Theorem (PAG09) For AND-OPT well–designed graph patterns the evaluation problem is coNP-complete Well-designed patterns also allow to study static analysis: Theorem (LPPS12) Equivalence of well-designed SPARQL patterns is in NP
  • 126. SELECT (a.k.a. projection) Besides graph patterns, SPARQL 1.0 allow result forms the most simple is SELECT Definition A SELECT query is an expression (SELECT W P) where P is a graph pattern and W is a set of variables, or *
  • 127. SELECT (a.k.a. projection) Besides graph patterns, SPARQL 1.0 allow result forms the most simple is SELECT Definition A SELECT query is an expression (SELECT W P) where P is a graph pattern and W is a set of variables, or * The evaluation of a SELECT query against G is ◮ (SELECT W P) G = {µ|W | µ ∈ P G } where µ|W is the restriction of µ to domain W . ◮ (SELECT * P) G = P G
  • 128. Example (SELECT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (SELECT {?N, ?E} ((?X, name, ?N) AND (?X, email, ?E))) G
  • 129. Example (SELECT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (SELECT {?N, ?E} ((?X, name, ?N) AND (?X, email, ?E))) G SELECT{?N, ?E} ?X ?N ?E µ1 R1 john J@ed.ex µ2 R3 ringo R@ed.ex
  • 130. Example (SELECT) G : (R1, name, john) (R2, name, paul) (R3, name, ringo) (R1, email, J@ed.ex) (R3, email, R@ed.ex) (R3, webPage, www.ringo.com) (SELECT {?N, ?E} ((?X, name, ?N) AND (?X, email, ?E))) G SELECT{?N, ?E} ?X ?N ?E µ1 R1 john J@ed.ex µ2 R3 ringo R@ed.ex ?N ?E µ1|{?N,?E} john J@ed.ex µ2|{?N,?E} ringo R@ed.ex
  • 131. SPARQL 1.1 introduces several new features In SPARQL 1.1: ◮ (SELECT W P) can be used as any other graph pattern (ASK P) can be used as a constraint in FILTER ⇒ sub-queries ◮ Aggregations via ORDER-BY plus COUNT, SUM, etc. ◮ More important for us: Federation and Navigation
  • 132. Outline Basics of SPARQL Syntax and Semantics of SPARQL 1.0 What is new in SPARQL 1.1 Federation: SERVICE operator Syntax and Semantics Evaluation of SERVICE queries Navigation: Property Paths Navigating graphs with regular expressions The history of paths (in SPARQL 1.1 specification) Evaluation procedures and complexity
  • 133. SPARQL endpoints and the SERVICE operator ◮ SPARQL endpoints are services that accepts HTTP requests asking for SPARQL queries ◮ http://www.w3.org/wiki/SparqlEndpoints lists some ◮ SPARQL 1.1 allows to mix local and remote queries to endpoints via the SERVICE operator
  • 134. SPARQL endpoints and the SERVICE operator ◮ SPARQL endpoints are services that accepts HTTP requests asking for SPARQL queries ◮ http://www.w3.org/wiki/SparqlEndpoints lists some ◮ SPARQL 1.1 allows to mix local and remote queries to endpoints via the SERVICE operator ?SELECT ?Author WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . }
  • 135. SPARQL endpoints and the SERVICE operator ◮ SPARQL endpoints are services that accepts HTTP requests asking for SPARQL queries ◮ http://www.w3.org/wiki/SparqlEndpoints lists some ◮ SPARQL 1.1 allows to mix local and remote queries to endpoints via the SERVICE operator ?SELECT ?Author ?Place WHERE { ?Paper dc:creator ?Author . ?Paper dct:partOf ?Conf . ?Conf swrc:series conf:iswc . SERVICE dbpedia:service { ?Author dbpedia:birthPlace ?Place . } }
  • 136. Syntax of SERVICE Syntax If c ∈ U, ?X is a variable, and P is a graph pattern then
  • 137. Syntax of SERVICE Syntax If c ∈ U, ?X is a variable, and P is a graph pattern then ◮ (SERVICE c P) ◮ (SERVICE ?X P) are SERVICE graph patterns.
  • 138. Syntax of SERVICE Syntax If c ∈ U, ?X is a variable, and P is a graph pattern then ◮ (SERVICE c P) ◮ (SERVICE ?X P) are SERVICE graph patterns. ◮ SERVICE graph pattern are included recursively in the algebra of graph patterns ◮ Variables are included in the SERVICE syntax to allow dynamic choosing of endpoints
  • 139. Semantics of SERVICE We assume the existence of a partial function ep : U → RDF graphs intuitively, ep(c) is the (default) graph associated to the SPARQL endpoint defined by c
  • 140. Semantics of SERVICE We assume the existence of a partial function ep : U → RDF graphs intuitively, ep(c) is the (default) graph associated to the SPARQL endpoint defined by c Definition Given an RDF graph G, an element c ∈ U, and a graph pattern P (SERVICE c P) G =
  • 141. Semantics of SERVICE We assume the existence of a partial function ep : U → RDF graphs intuitively, ep(c) is the (default) graph associated to the SPARQL endpoint defined by c Definition Given an RDF graph G, an element c ∈ U, and a graph pattern P (SERVICE c P) G = P ep(c) if c ∈ dom(ep)
  • 142. Semantics of SERVICE We assume the existence of a partial function ep : U → RDF graphs intuitively, ep(c) is the (default) graph associated to the SPARQL endpoint defined by c Definition Given an RDF graph G, an element c ∈ U, and a graph pattern P (SERVICE c P) G = P ep(c) if c ∈ dom(ep) {µ∅} otherwise
  • 143. Semantics of SERVICE We assume the existence of a partial function ep : U → RDF graphs intuitively, ep(c) is the (default) graph associated to the SPARQL endpoint defined by c Definition Given an RDF graph G, an element c ∈ U, and a graph pattern P (SERVICE c P) G = P ep(c) if c ∈ dom(ep) {µ∅} otherwise but the interesting case is when the endpoint is a variable...
  • 144. Semantics of SERVICE Definition Given an RDF graph G, a variable ?X, and a graph pattern P (SERVICE ?X P) G =
  • 145. Semantics of SERVICE Definition Given an RDF graph G, a variable ?X, and a graph pattern P (SERVICE ?X P) G = P ep(c)
  • 146. Semantics of SERVICE Definition Given an RDF graph G, a variable ?X, and a graph pattern P (SERVICE ?X P) G = P ep(c) ⋊⋉ {{?X → c}}
  • 147. Semantics of SERVICE Definition Given an RDF graph G, a variable ?X, and a graph pattern P (SERVICE ?X P) G = c∈dom(ep) P ep(c) ⋊⋉ {{?X → c}}
  • 148. Semantics of SERVICE Definition Given an RDF graph G, a variable ?X, and a graph pattern P (SERVICE ?X P) G = c∈dom(ep) P ep(c) ⋊⋉ {{?X → c}}
  • 149. Semantics of SERVICE Definition Given an RDF graph G, a variable ?X, and a graph pattern P (SERVICE ?X P) G = c∈dom(ep) P ep(c) ⋊⋉ {{?X → c}} can we effectively evaluate a SERVICE query?
  • 150. SERVICE example Some queries/patterns can be safely evaluated:
  • 151. SERVICE example Some queries/patterns can be safely evaluated: ◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)))
  • 152. SERVICE example Some queries/patterns can be safely evaluated: ◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E))) ◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y ))
  • 153. SERVICE example Some queries/patterns can be safely evaluated: ◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E))) ◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y )) In both cases, the SERVICE variable is controlled by the data in the initial graph
  • 154. SERVICE example Some queries/patterns can be safely evaluated: ◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E))) ◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y )) In both cases, the SERVICE variable is controlled by the data in the initial graph There is natural way of evaluating the query:
  • 155. SERVICE example Some queries/patterns can be safely evaluated: ◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E))) ◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y )) In both cases, the SERVICE variable is controlled by the data in the initial graph There is natural way of evaluating the query: ◮ Evaluate first (?X, service address, ?Y )
  • 156. SERVICE example Some queries/patterns can be safely evaluated: ◮ ((?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E))) ◮ ((SERVICE ?Y (?N, email, ?E)) AND (?X, service address, ?Y )) In both cases, the SERVICE variable is controlled by the data in the initial graph There is natural way of evaluating the query: ◮ Evaluate first (?X, service address, ?Y ) ◮ only for the obtained mappings evaluate the SERVICE on endpoint µ(?Y )
  • 157. Unbounded SERVICE queries What about this pattern? (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E))
  • 158. Unbounded SERVICE queries What about this pattern? (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) :-(
  • 159. Unbounded SERVICE queries What about this pattern? (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) :-( Idea: force the SERVICE variable to be bound in every solution.
  • 160. Boundedness Definition A variable ?X is bound in graph pattern P if for every graph G and every µ ∈ P G it holds that: ◮ ?X ∈ dom(µ), and ◮ µ(?X) is a value in G.
  • 161. Boundedness Definition A variable ?X is bound in graph pattern P if for every graph G and every µ ∈ P G it holds that: ◮ ?X ∈ dom(µ), and ◮ µ(?X) is a value in G. We only need a procedure to ensure that every variable mentioned in SERVICE is bounded!
  • 162. Boundedness Definition A variable ?X is bound in graph pattern P if for every graph G and every µ ∈ P G it holds that: ◮ ?X ∈ dom(µ), and ◮ µ(?X) is a value in G. We only need a procedure to ensure that every variable mentioned in SERVICE is bounded! Oh wait...
  • 163. Boundedness Definition A variable ?X is bound in graph pattern P if for every graph G and every µ ∈ P G it holds that: ◮ ?X ∈ dom(µ), and ◮ µ(?X) is a value in G. We only need a procedure to ensure that every variable mentioned in SERVICE is bounded! Oh wait... Theorem (BAC11) The problem of checking if a variable is bound in a graph pattern is undecidable.
  • 164. Boundedness Definition A variable ?X is bound in graph pattern P if for every graph G and every µ ∈ P G it holds that: ◮ ?X ∈ dom(µ), and ◮ µ(?X) is a value in G. We only need a procedure to ensure that every variable mentioned in SERVICE is bounded! Oh wait... Theorem (BAC11) The problem of checking if a variable is bound in a graph pattern is undecidable. :-(
  • 165. Undecidability of boundedness Proof idea ◮ From [AG08]: it is undecidable to check if a SPARQL pattern P is satisfiable (if P G = ∅ for some G).
  • 166. Undecidability of boundedness Proof idea ◮ From [AG08]: it is undecidable to check if a SPARQL pattern P is satisfiable (if P G = ∅ for some G). ◮ Assume P does not mention ?X, and let Q = ((?X, ?Y , ?Z) UNION P):
  • 167. Undecidability of boundedness Proof idea ◮ From [AG08]: it is undecidable to check if a SPARQL pattern P is satisfiable (if P G = ∅ for some G). ◮ Assume P does not mention ?X, and let Q = ((?X, ?Y , ?Z) UNION P): ?X is bound in Q iff P is not satisfiable.
  • 168. Undecidability of boundedness Proof idea ◮ From [AG08]: it is undecidable to check if a SPARQL pattern P is satisfiable (if P G = ∅ for some G). ◮ Assume P does not mention ?X, and let Q = ((?X, ?Y , ?Z) UNION P): ?X is bound in Q iff P is not satisfiable. Undecidable: the SPARQL engine cannot check for boundedness...
  • 169. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then
  • 170. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)
  • 171. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then
  • 172. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1)
  • 173. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then
  • 174. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)
  • 175. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2) ◮ if P = (SELECT W P1), then
  • 176. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2) ◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)
  • 177. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2) ◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1) ◮ if P is a SERVICE pattern, then
  • 178. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2) ◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1) ◮ if P is a SERVICE pattern, then SB(P) = ∅
  • 179. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2) ◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1) ◮ if P is a SERVICE pattern, then SB(P) = ∅ Proposition (BAC11) If ?X ∈ SB(P) then ?X is bound in P.
  • 180. A sufficient condition for boundedness Definition [BAC11] The set of strongly bounded variables in a pattern P, denoted by SB(P) is defined recursively as follows. ◮ if P is a triple pattern t, then SB(P) = var(t) ◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2) ◮ if P = (P1 OPT P2), then SB(P) = SB(P1) ◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2) ◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1) ◮ if P is a SERVICE pattern, then SB(P) = ∅ Proposition (BAC11) If ?X ∈ SB(P) then ?X is bound in P. (SB(P) can be efficiently computed)
  • 181. (Strongly) boundedness is not enough Are we happy now? P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) .
  • 182. (Strongly) boundedness is not enough Are we happy now? P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) . ◮ ?Y is not bound in P1 (nor strongly bound)
  • 183. (Strongly) boundedness is not enough Are we happy now? P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) . ◮ ?Y is not bound in P1 (nor strongly bound) ◮ nevertheless there is a natural evaluation of this pattern
  • 184. (Strongly) boundedness is not enough Are we happy now? P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) . ◮ ?Y is not bound in P1 (nor strongly bound) ◮ nevertheless there is a natural evaluation of this pattern (?Y is bounded in the important part!)
  • 185. (Strongly) boundedness is not enough Are we happy now?
  • 186. (Strongly) boundedness is not enough Are we happy now? P2 = (?U1, related with, ?U2) AND SERVICE ?U1 (?N, email, ?E) OPT (SERVICE ?U2 (?N, phone, ?F)) .
  • 187. (Strongly) boundedness is not enough Are we happy now? P2 = (?U1, related with, ?U2) AND SERVICE ?U1 (?N, email, ?E) OPT (SERVICE ?U2 (?N, phone, ?F)) . ◮ ?U1 and ?U2 are strongly bounded, but
  • 188. (Strongly) boundedness is not enough Are we happy now? P2 = (?U1, related with, ?U2) AND SERVICE ?U1 (?N, email, ?E) OPT (SERVICE ?U2 (?N, phone, ?F)) . ◮ ?U1 and ?U2 are strongly bounded, but ◮ can we effectively evaluate this query? (?U2 is unbounded in the important part!)
  • 189. (Strongly) boundedness is not enough Are we happy now? P2 = (?U1, related with, ?U2) AND SERVICE ?U1 (?N, email, ?E) OPT (SERVICE ?U2 (?N, phone, ?F)) . ◮ ?U1 and ?U2 are strongly bounded, but ◮ can we effectively evaluate this query? (?U2 is unbounded in the important part!) we need to define what the important part is...
  • 190. Parse tree of a pattern We need first to formalize the tree of subexpressions of a pattern (?Y , a, ?Z) UNION (?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))
  • 191. Parse tree of a pattern We need first to formalize the tree of subexpressions of a pattern (?Y , a, ?Z) UNION (?X, b, c) AND (SERVICE ?X (?Y , a, ?Z)) u6 : (?Y , a, ?Z) u1 : ((?Y , a, ?Z) UNION ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z)))) u2 : (?Y , a, ?Z) u3 : ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))) u4 : (?X, b, c) u5 : (SERVICE ?X (?Y , a, ?Z))
  • 192. Parse tree of a pattern We need first to formalize the tree of subexpressions of a pattern (?Y , a, ?Z) UNION (?X, b, c) AND (SERVICE ?X (?Y , a, ?Z)) u6 : (?Y , a, ?Z) u1 : ((?Y , a, ?Z) UNION ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z)))) u2 : (?Y , a, ?Z) u3 : ((?X, b, c) AND (SERVICE ?X (?Y , a, ?Z))) u4 : (?X, b, c) u5 : (SERVICE ?X (?Y , a, ?Z)) We denote by T (P) the tree of subexpressions of P.
  • 193. Service-boundedness Notion of boundedness considering the important part Definition A pattern P is service-bound if for every node u in T (P) with label (SERVICE ?X P1) it holds that:
  • 194. Service-boundedness Notion of boundedness considering the important part Definition A pattern P is service-bound if for every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is bound in P2, and
  • 195. Service-boundedness Notion of boundedness considering the important part Definition A pattern P is service-bound if for every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is bound in P2, and 2. P1 is service-bound.
  • 196. Service-boundedness Notion of boundedness considering the important part Definition A pattern P is service-bound if for every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is bound in P2, and 2. P1 is service-bound. Unfortunately... (and not so surprisingly anymore)
  • 197. Service-boundedness Notion of boundedness considering the important part Definition A pattern P is service-bound if for every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is bound in P2, and 2. P1 is service-bound. Unfortunately... (and not so surprisingly anymore) Theorem (BAC11) Checking if a pattern is service-bound is undecidable.
  • 198. Service-boundedness Notion of boundedness considering the important part Definition A pattern P is service-bound if for every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X is bound in P2, and 2. P1 is service-bound. Unfortunately... (and not so surprisingly anymore) Theorem (BAC11) Checking if a pattern is service-bound is undecidable. Exercise: prove the theorem.
  • 199. Service-safeness We need a decidable sufficient condition. Idea: replace bound by SB(·) in the previous notion.
  • 200. Service-safeness We need a decidable sufficient condition. Idea: replace bound by SB(·) in the previous notion. Definition A pattern P is service-safe if of every node u in T (P) with label (SERVICE ?X P1) it holds that:
  • 201. Service-safeness We need a decidable sufficient condition. Idea: replace bound by SB(·) in the previous notion. Definition A pattern P is service-safe if of every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X ∈ SB(P2), and
  • 202. Service-safeness We need a decidable sufficient condition. Idea: replace bound by SB(·) in the previous notion. Definition A pattern P is service-safe if of every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X ∈ SB(P2), and 2. P1 is service-safe.
  • 203. Service-safeness We need a decidable sufficient condition. Idea: replace bound by SB(·) in the previous notion. Definition A pattern P is service-safe if of every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X ∈ SB(P2), and 2. P1 is service-safe. Proposition (BAC11) If P is service-safe the it is service-bound.
  • 204. Service-safeness We need a decidable sufficient condition. Idea: replace bound by SB(·) in the previous notion. Definition A pattern P is service-safe if of every node u in T (P) with label (SERVICE ?X P1) it holds that: 1. There exists an ancestor of u in T (P) with label P2 s.t. ?X ∈ SB(P2), and 2. P1 is service-safe. Proposition (BAC11) If P is service-safe the it is service-bound. We finally have our desired decidable condition.
  • 205. Applying service-safeness P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E))
  • 206. Applying service-safeness P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) is service-safe.
  • 207. Applying service-safeness P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) is service-safe. P2 = (?U1, related with, ?U2) AND SERVICE ?U1 (?N, email, ?E) OPT (SERVICE ?U2 (?N, phone, ?F))
  • 208. Applying service-safeness P1 = (?X, service description, ?Z) UNION (?X, service address, ?Y ) AND (SERVICE ?Y (?N, email, ?E)) is service-safe. P2 = (?U1, related with, ?U2) AND SERVICE ?U1 (?N, email, ?E) OPT (SERVICE ?U2 (?N, phone, ?F)) is not service-safe.
  • 209. Closing words on SERVICE SERVICE: several interesting research question ◮ Optimization (rewriting, reordering, containment?) ◮ Cost analysis: in practice we cannot query all that we may want ◮ Different endpoints provide different completeness certificates (regarding the data they return) ◮ Several implementations challenges
  • 210. Outline Basics of SPARQL Syntax and Semantics of SPARQL 1.0 What is new in SPARQL 1.1 Federation: SERVICE operator Syntax and Semantics Evaluation of SERVICE queries Navigation: Property Paths Navigating graphs with regular expressions The history of paths (in SPARQL 1.1 specification) Evaluation procedures and complexity
  • 211. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2)
  • 212. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2) query: is city B reachable from A by a sequence of connections?
  • 213. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2) query: is city B reachable from A by a sequence of connections? ◮ Known fact: SPARQL 1.0 cannot express this query! ◮ Follows easily from locality of FO-logic
  • 214. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2) query: is city B reachable from A by a sequence of connections? ◮ Known fact: SPARQL 1.0 cannot express this query! ◮ Follows easily from locality of FO-logic You (should) already know that Datalog can express this query.
  • 215. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2) query: is city B reachable from A by a sequence of connections? ◮ Known fact: SPARQL 1.0 cannot express this query! ◮ Follows easily from locality of FO-logic You (should) already know that Datalog can express this query. We can consider a new predicate reach and the program
  • 216. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2) query: is city B reachable from A by a sequence of connections? ◮ Known fact: SPARQL 1.0 cannot express this query! ◮ Follows easily from locality of FO-logic You (should) already know that Datalog can express this query. We can consider a new predicate reach and the program (?X, reach, ?Y ) ← (?X, connected, ?Y )
  • 217. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2) query: is city B reachable from A by a sequence of connections? ◮ Known fact: SPARQL 1.0 cannot express this query! ◮ Follows easily from locality of FO-logic You (should) already know that Datalog can express this query. We can consider a new predicate reach and the program (?X, reach, ?Y ) ← (?X, connected, ?Y ) (?X, reach, ?Y ) ← (?X, reach, ?Z), (?Z, connected, ?Y )
  • 218. SPARQL 1.0 has very limited navigational capabilities Assume a graph with cities and connections with RDF triples like: (C1, connected, C2) query: is city B reachable from A by a sequence of connections? ◮ Known fact: SPARQL 1.0 cannot express this query! ◮ Follows easily from locality of FO-logic You (should) already know that Datalog can express this query. We can consider a new predicate reach and the program (?X, reach, ?Y ) ← (?X, connected, ?Y ) (?X, reach, ?Y ) ← (?X, reach, ?Z), (?Z, connected, ?Y ) but do we want to integrate Datalog and SPARQL?
  • 219. SPARQL 1.1 provides an alternative way for navigating URI2 :email seba@puc.cl :name :email :phone :name :friendOf juan@utexas.edu URI1 Seba 446928888 Juan Claudio :name :name Maria URI0 :friendOf URI3 :friendOf
  • 220. SPARQL 1.1 provides an alternative way for navigating URI2 :email seba@puc.cl :name :email :phone :name :friendOf juan@utexas.edu URI1 Seba 446928888 Juan Claudio :name :name Maria URI0 :friendOf URI3 :friendOf SELECT ?X WHERE { ?X :friendOf ?Y . ?Y :name "Maria" . }
  • 221. SPARQL 1.1 provides an alternative way for navigating URI2 :email seba@puc.cl :name :email :phone :name :friendOf juan@utexas.edu URI1 Seba 446928888 Juan Claudio :name :name Maria URI0 :friendOf URI3 :friendOf SELECT ?X WHERE { ?X :friendOf ?Y . ?Y :name "Maria" . }
  • 222. SPARQL 1.1 provides an alternative way for navigating URI2 :email seba@puc.cl :name :email :phone :name :friendOf juan@utexas.edu URI1 Seba 446928888 Juan Claudio :name :name Maria URI0 :friendOf URI3 :friendOf SELECT ?X WHERE { ?X (:friendOf)* ?Y . ?Y :name "Maria" . }
  • 223. SPARQL 1.1 provides an alternative way for navigating URI2 :email seba@puc.cl :name :email :phone :name :friendOf juan@utexas.edu URI1 Seba 446928888 Juan Claudio :name :name Maria URI0 :friendOf URI3 :friendOf SELECT ?X WHERE { ?X (:friendOf)* ?Y . ?Y :name "Maria" . }
  • 224. SPARQL 1.1 provides an alternative way for navigating URI2 :email seba@puc.cl :name :email :phone :name :friendOf juan@utexas.edu URI1 Seba 446928888 Juan Claudio :name :name Maria URI0 :friendOf URI3 :friendOf SELECT ?X WHERE { ?X (:friendOf)* ?Y . ← SPARQL 1.1 property path ?Y :name "Maria" . }
  • 225. SPARQL 1.1 provides an alternative way for navigating URI2 :email seba@puc.cl :name :email :phone :name :friendOf juan@utexas.edu URI1 Seba 446928888 Juan Claudio :name :name Maria URI0 :friendOf URI3 :friendOf SELECT ?X WHERE { ?X (:friendOf)* ?Y . ← SPARQL 1.1 property path ?Y :name "Maria" . } Idea: navigate RDF graphs using regular expressions
  • 226. General navigation using regular expressions Regular expressions define sets of strings using ◮ concatenation: / ◮ disjunction: | ◮ Kleene star: * Example Consider strings composed of symbols a and b a/(b)*/a defines strings of the form abbb · · · bbba.
  • 227. General navigation using regular expressions Regular expressions define sets of strings using ◮ concatenation: / ◮ disjunction: | ◮ Kleene star: * Example Consider strings composed of symbols a and b a/(b)*/a defines strings of the form abbb · · · bbba. Idea: use regular expressions to define paths ◮ a path p satisfies a regular expression r if the string composed of the sequence of edges of p satisfies expression r
  • 228. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John
  • 229. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X }
  • 230. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection
  • 231. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection { ?X Paris }
  • 232. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection { ?X (:train|:plane)* Paris }
  • 233. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection { ?X (:train|:plane)*/:bus Paris }
  • 234. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection { ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }
  • 235. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection { ?X (:train|:plane)*/:bus/(:train|:plane)* Paris } Exercise: cities that reach Paris with an even number of connections
  • 236. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection { ?X (:train|:plane)*/:bus/(:train|:plane)* Paris } Exercise: cities that reach Paris with an even number of connections Mixing regular expressions and SPARQL operators gives interesting expressive power: Persons in my professional network that attended the same school
  • 237. Interesting navigational queries ◮ RDF graph with :father, :mother edges: ancestors of John { John (:father|:mother)* ?X } ◮ Connections between cities via :train, :bus, :plane Cities that reach Paris with exactly one :bus connection { ?X (:train|:plane)*/:bus/(:train|:plane)* Paris } Exercise: cities that reach Paris with an even number of connections Mixing regular expressions and SPARQL operators gives interesting expressive power: Persons in my professional network that attended the same school { ?X (:conn)* ?Y . ?X (:conn)* ?Z . ?Y :sameSchool ?Z }
  • 238. As always, we need a (formal) semantics ◮ Regular expressions in SPARQL queries seem reasonable ◮ We need to agree in the meaning of these new queries
  • 239. As always, we need a (formal) semantics ◮ Regular expressions in SPARQL queries seem reasonable ◮ We need to agree in the meaning of these new queries A bit of history on W3C standardization of property paths: ◮ Mid 2010: W3C defines an informal semantics for paths
  • 240. As always, we need a (formal) semantics ◮ Regular expressions in SPARQL queries seem reasonable ◮ We need to agree in the meaning of these new queries A bit of history on W3C standardization of property paths: ◮ Mid 2010: W3C defines an informal semantics for paths ◮ Late 2010: several discussions on possible drawbacks on the non-standard definition by the W3C
  • 241. As always, we need a (formal) semantics ◮ Regular expressions in SPARQL queries seem reasonable ◮ We need to agree in the meaning of these new queries A bit of history on W3C standardization of property paths: ◮ Mid 2010: W3C defines an informal semantics for paths ◮ Late 2010: several discussions on possible drawbacks on the non-standard definition by the W3C ◮ Early 2011: first formal semantics by the W3C
  • 242. As always, we need a (formal) semantics ◮ Regular expressions in SPARQL queries seem reasonable ◮ We need to agree in the meaning of these new queries A bit of history on W3C standardization of property paths: ◮ Mid 2010: W3C defines an informal semantics for paths ◮ Late 2010: several discussions on possible drawbacks on the non-standard definition by the W3C ◮ Early 2011: first formal semantics by the W3C ◮ Late 2011: empirical and theoretical study on SPARQL 1.1 property paths showing unfeasibility of evaluation
  • 243. As always, we need a (formal) semantics ◮ Regular expressions in SPARQL queries seem reasonable ◮ We need to agree in the meaning of these new queries A bit of history on W3C standardization of property paths: ◮ Mid 2010: W3C defines an informal semantics for paths ◮ Late 2010: several discussions on possible drawbacks on the non-standard definition by the W3C ◮ Early 2011: first formal semantics by the W3C ◮ Late 2011: empirical and theoretical study on SPARQL 1.1 property paths showing unfeasibility of evaluation ◮ Mid 2012: semantics change to overcome the raised issues
  • 244. As always, we need a (formal) semantics ◮ Regular expressions in SPARQL queries seem reasonable ◮ We need to agree in the meaning of these new queries A bit of history on W3C standardization of property paths: ◮ Mid 2010: W3C defines an informal semantics for paths ◮ Late 2010: several discussions on possible drawbacks on the non-standard definition by the W3C ◮ Early 2011: first formal semantics by the W3C ◮ Late 2011: empirical and theoretical study on SPARQL 1.1 property paths showing unfeasibility of evaluation ◮ Mid 2012: semantics change to overcome the raised issues The following experimental study is based on [ACP12].
  • 245. SPARQL 1.1 implementations had a poor performance Data: ◮ cliques (complete graphs) of different size ◮ from 2 nodes (87 bytes) to 13 nodes (970 bytes) :p :a0 :a1 :a3 :p :p :p :p :p :a2 RDF clique with 4 nodes (127 bytes)
  • 246. SPARQL 1.1 implementations had a poor performance 1 10 100 1000 2 4 6 8 10 12 14 16 ARQ + + + + + + + + + + + + RDFQ × × × × × × × × × × KGram ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ Sesame [ACP12] SELECT * WHERE { :a0 (:p)* :a1 }
  • 247. Poor performance with real Web data of small size Data: ◮ Social Network data given by foaf:knows links ◮ Crawled from Axel Polleres’ foaf document (3 steps) ◮ Different documents, deleting some nodes foaf:knows axel:me ivan:me bizer:chris richard:cygri ... · · · andreas:ah · · · · · ·
  • 248. Poor performance with real Web data of small size SELECT * WHERE { axel:me (foaf:knows)* ?x }
  • 249. Poor performance with real Web data of small size SELECT * WHERE { axel:me (foaf:knows)* ?x } Input ARQ RDFQ Kgram Sesame 9.2KB 5.13 75.70 313.37 – 10.9KB 8.20 325.83 – – 11.4KB 65.87 – – – 13.2KB 292.43 – – – 14.8KB – – – – 17.2KB – – – – 20.5KB – – – – 25.8KB – – – – [ACP12] (time in seconds, timeout = 1hr)
  • 250. Poor performance with real Web data of small size SELECT * WHERE { axel:me (foaf:knows)* ?x } Input ARQ RDFQ Kgram Sesame 9.2KB 5.13 75.70 313.37 – 10.9KB 8.20 325.83 – – 11.4KB 65.87 – – – 13.2KB 292.43 – – – 14.8KB – – – – 17.2KB – – – – 20.5KB – – – – 25.8KB – – – – [ACP12] (time in seconds, timeout = 1hr) Is this a problem of these particular implementations?
  • 251. This is a problem of the specification [ACP12] Any implementation that follows SPARQL 1.1 standard (as of January 2012) is doomed to show the same behavior
  • 252. This is a problem of the specification [ACP12] Any implementation that follows SPARQL 1.1 standard (as of January 2012) is doomed to show the same behavior The main sources of complexity is counting
  • 253. This is a problem of the specification [ACP12] Any implementation that follows SPARQL 1.1 standard (as of January 2012) is doomed to show the same behavior The main sources of complexity is counting Impact on W3C standard: ◮ Standard semantics of SPARQL 1.1 property paths was changed in July 2012 to overcome these issues
  • 254. SPARQL 1.1 property paths match regular expressions but also count :a :b :c :p :p :p :p :d SELECT ?X WHERE { :a (:p)* ?X }
  • 255. SPARQL 1.1 property paths match regular expressions but also count :a :b :c :p :p :p :p :d SELECT ?X WHERE { :a (:p)* ?X } ?X :a :b :c :d :d
  • 256. SPARQL 1.1 property paths match regular expressions but also count :a :b :c :p :p :p :p :d SELECT ?X WHERE { :a (:p)* ?X } ?X :a :b :c :d :d
  • 257. SPARQL 1.1 property paths match regular expressions but also count :p:a :b :c :p :p :p :p :d SELECT ?X WHERE { :a (:p)* ?X } ?X :a :b :c :d :d
  • 258. SPARQL 1.1 property paths match regular expressions but also count :p:a :b :c :p :p :p :p :d SELECT ?X WHERE { :a (:p)* ?X } ?X :a :b :c :d :d :c :d
  • 259. SPARQL 1.1 property paths match regular expressions but also count :p:a :b :c :p :p :p :p :d SELECT ?X WHERE { :a (:p)* ?X } ?X :a :b :c :d :d :c :d But what if we have cycles?
  • 260. SPARQL 1.1 document provides a special procedure to handle cycles (and make the count) Evaluation of path* “the algorithm extends the multiset of results by one application of path. If a node has been visited for path, it is not a candidate for another step. A node can be visited multiple times if different paths visit it.” SPARQL 1.1 Last Call (Jan 2012)
  • 261. SPARQL 1.1 document provides a special procedure to handle cycles (and make the count) Evaluation of path* “the algorithm extends the multiset of results by one application of path. If a node has been visited for path, it is not a candidate for another step. A node can be visited multiple times if different paths visit it.” SPARQL 1.1 Last Call (Jan 2012) ◮ W3C document provides a procedure (ArbitraryLengthPath) ◮ This procedure was formalized in [ACP12]
  • 262. Counting the number of solutions... Data: Clique of size n { :a0 (:p)* :a1 } every solution is a copy of the empty mapping µ∅ (| | in ARQ)
  • 263. Counting the number of solutions... Data: Clique of size n { :a0 (:p)* :a1 } n # Sol. 9 13,700 10 109,601 11 986,410 12 9,864,101 13 – every solution is a copy of the empty mapping µ∅ (| | in ARQ)
  • 264. Counting the number of solutions... Data: Clique of size n { :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } n # Sol. 9 13,700 10 109,601 11 986,410 12 9,864,101 13 – every solution is a copy of the empty mapping µ∅ (| | in ARQ)
  • 265. Counting the number of solutions... Data: Clique of size n { :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } n # Sol. 9 13,700 10 109,601 11 986,410 12 9,864,101 13 – n # Sol 2 1 3 6 4 305 5 418,576 6 – every solution is a copy of the empty mapping µ∅ (| | in ARQ)
  • 266. Counting the number of solutions... Data: Clique of size n { :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } { :a0 (((:p)*)*)* :a1 } n # Sol. 9 13,700 10 109,601 11 986,410 12 9,864,101 13 – n # Sol 2 1 3 6 4 305 5 418,576 6 – every solution is a copy of the empty mapping µ∅ (| | in ARQ)
  • 267. Counting the number of solutions... Data: Clique of size n { :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } { :a0 (((:p)*)*)* :a1 } n # Sol. 9 13,700 10 109,601 11 986,410 12 9,864,101 13 – n # Sol 2 1 3 6 4 305 5 418,576 6 – n # Sol. 2 1 3 42 4 – every solution is a copy of the empty mapping µ∅ (| | in ARQ)
  • 268. More on counting the number of solutions... Data: foaf links crawled from the Web { axel:me (foaf:knows)* ?x }
  • 269. More on counting the number of solutions... Data: foaf links crawled from the Web { axel:me (foaf:knows)* ?x } File # URIs # Sol. Output Size 9.2KB 38 29,817 2MB 10.9KB 43 122,631 8.4MB 11.4KB 47 1,739,331 120MB 13.2KB 52 8,511,943 587MB 14.8KB 54 – –
  • 270. More on counting the number of solutions... Data: foaf links crawled from the Web { axel:me (foaf:knows)* ?x } File # URIs # Sol. Output Size 9.2KB 38 29,817 2MB 10.9KB 43 122,631 8.4MB 11.4KB 47 1,739,331 120MB 13.2KB 52 8,511,943 587MB 14.8KB 54 – – What is really happening here?
  • 271. More on counting the number of solutions... Data: foaf links crawled from the Web { axel:me (foaf:knows)* ?x } File # URIs # Sol. Output Size 9.2KB 38 29,817 2MB 10.9KB 43 122,631 8.4MB 11.4KB 47 1,739,331 120MB 13.2KB 52 8,511,943 587MB 14.8KB 54 – – What is really happening here? Theory can help!
  • 272. A bit more on complexity classes... Complexity can be measured by using counting-complexity classes NP #P Sat: is a propositional CountSat: how many assignments formula satisfiable? satisfy a propositional formula?
  • 273. A bit more on complexity classes... Complexity can be measured by using counting-complexity classes NP #P Sat: is a propositional CountSat: how many assignments formula satisfiable? satisfy a propositional formula? Formally A function f (·) on strings is in #P if there exists a polynomial-time non-deterministic TM M such that f (x) = number of accepting computations of M with input x
  • 274. A bit more on complexity classes... Complexity can be measured by using counting-complexity classes NP #P Sat: is a propositional CountSat: how many assignments formula satisfiable? satisfy a propositional formula? Formally A function f (·) on strings is in #P if there exists a polynomial-time non-deterministic TM M such that f (x) = number of accepting computations of M with input x ◮ CountSat is #P-complete
  • 275. Counting problem for property paths CountW3C Input: RDF graph G Property path triple { :a path :b } Output: Count the number of solutions of { :a path :b } over G (according to the semantics proposed by W3C)
  • 276. The complexity of property paths is intractable Theorem (ACP12) CountW3C is outside #P
  • 277. The complexity of property paths is intractable Theorem (ACP12) CountW3C is outside #P CountW3C is hard to solve even if P = NP
  • 278. A doubly exponential lower bound for counting ◮ Let paths be a property path of the form (· · · ((:p)*)*)· · ·)* with s nested stars
  • 279. A doubly exponential lower bound for counting ◮ Let paths be a property path of the form (· · · ((:p)*)*)· · ·)* with s nested stars ◮ Let Kn be a clique with n nodes
  • 280. A doubly exponential lower bound for counting ◮ Let paths be a property path of the form (· · · ((:p)*)*)· · ·)* with s nested stars ◮ Let Kn be a clique with n nodes ◮ Let CountClique(s, n) be the number of solutions of { :a0 paths :a1 } over Kn
  • 281. A doubly exponential lower bound for counting ◮ Let paths be a property path of the form (· · · ((:p)*)*)· · ·)* with s nested stars ◮ Let Kn be a clique with n nodes ◮ Let CountClique(s, n) be the number of solutions of { :a0 paths :a1 } over Kn Lemma (ACP12) CountClique(s, n) ≥ (n − 2)!(n−1)s−1
  • 282. A doubly exponential lower bound for counting ◮ Let paths be a property path of the form (· · · ((:p)*)*)· · ·)* with s nested stars ◮ Let Kn be a clique with n nodes ◮ Let CountClique(s, n) be the number of solutions of { :a0 paths :a1 } over Kn Lemma (ACP12) CountClique(s, n) ≥ (n − 2)!(n−1)s−1 In [ACP12]: A recursive formula for calculating CountClique(s, n)
  • 283. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – 7 – 8 –
  • 284. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – 7 – 8 –
  • 285. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – 7 – 8 –
  • 286. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – 7 – 8 –
  • 287. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – 7 – 8 –
  • 288. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – ← 28 × 109 7 – 8 –
  • 289. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – ← 28 × 109 7 – ← 144 × 1015 8 –
  • 290. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – ← 28 × 109 7 – ← 144 × 1015 8 – ← 79 × 1024
  • 291. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – ← 28 × 109 7 – ← 144 × 1015 8 – ← 79 × 1024 79 Yottabytes for the answer over a file of 379 bytes
  • 292. We can explain the experimental results CountClique(s, n) allows to fill in the blanks { :a0 ((:p)*)* :a1 } n # Sol. 2 1 3 6 4 305 5 418,576 6 – ← 28 × 109 7 – ← 144 × 1015 8 – ← 79 × 1024 79 Yottabytes for the answer over a file of 379 bytes 1 Yottabyte > the estimated capacity of all digital storage in the world
  • 293. Data complexity of property path is still intractable Common assumption in Databases: ◮ queries are much smaller than data sources
  • 294. Data complexity of property path is still intractable Common assumption in Databases: ◮ queries are much smaller than data sources Data complexity ◮ measure the complexity considering the query fixed
  • 295. Data complexity of property path is still intractable Common assumption in Databases: ◮ queries are much smaller than data sources Data complexity ◮ measure the complexity considering the query fixed ◮ Data complexity of SQL, XPath, SPARQL 1.0, etc. are all polynomial
  • 296. Data complexity of property path is still intractable Common assumption in Databases: ◮ queries are much smaller than data sources Data complexity ◮ measure the complexity considering the query fixed ◮ Data complexity of SQL, XPath, SPARQL 1.0, etc. are all polynomial Theorem (ACP12) Data complexity of CountW3C is #P-complete
  • 297. Data complexity of property path is still intractable Common assumption in Databases: ◮ queries are much smaller than data sources Data complexity ◮ measure the complexity considering the query fixed ◮ Data complexity of SQL, XPath, SPARQL 1.0, etc. are all polynomial Theorem (ACP12) Data complexity of CountW3C is #P-complete Corollary SPARQL 1.1 query evaluation (as of January 2012) is intractable in Data Complexity
  • 298. Existential semantics: a possible alternative Possible solution Do not count Just check whether there exists a path satisfying the property path expression
  • 299. Existential semantics: a possible alternative Possible solution Do not count Just check whether there exists a path satisfying the property path expression Years of experiences (theory and practice) in: ◮ Graph Databases ◮ XML ◮ SPARQL 1.0 (PSPARQL, Gleen) + equivalent regular expressions giving equivalent results
  • 300. Existential semantics: decision problems Input: RDF graph G Property path triple { :a path :b } ExistsPath Question: Is there a path from :a to :b in G satisfying the regular expression path? ExistsW3C Question: Is the number of solutions of { :a path :b } over G greater than 0 (according to W3C semantics)?
  • 301. Evaluating existential paths is tractable Theorem (well-known result) ExistsPath can be solved in O(|G| × |path|) Can be proved by using automata theory:
  • 302. Evaluating existential paths is tractable Theorem (well-known result) ExistsPath can be solved in O(|G| × |path|) Can be proved by using automata theory: 1. consider G as an NFA with :a initial state and :b final state
  • 303. Evaluating existential paths is tractable Theorem (well-known result) ExistsPath can be solved in O(|G| × |path|) Can be proved by using automata theory: 1. consider G as an NFA with :a initial state and :b final state 2. construct from path an NFA Apath
  • 304. Evaluating existential paths is tractable Theorem (well-known result) ExistsPath can be solved in O(|G| × |path|) Can be proved by using automata theory: 1. consider G as an NFA with :a initial state and :b final state 2. construct from path an NFA Apath 3. construct the product automaton G × Apath:
  • 305. Evaluating existential paths is tractable Theorem (well-known result) ExistsPath can be solved in O(|G| × |path|) Can be proved by using automata theory: 1. consider G as an NFA with :a initial state and :b final state 2. construct from path an NFA Apath 3. construct the product automaton G × Apath: ◮ whenever (x, r, y) ∈ G and (p, r, q) is a transition in Apath add a transition ((x, p), r, (y, q)) to G × Apath
  • 306. Evaluating existential paths is tractable Theorem (well-known result) ExistsPath can be solved in O(|G| × |path|) Can be proved by using automata theory: 1. consider G as an NFA with :a initial state and :b final state 2. construct from path an NFA Apath 3. construct the product automaton G × Apath: ◮ whenever (x, r, y) ∈ G and (p, r, q) is a transition in Apath add a transition ((x, p), r, (y, q)) to G × Apath 4. check if we can go from (:a, q0) to (:b, qf ) in G × Apath
  • 307. Evaluating existential paths is tractable Theorem (well-known result) ExistsPath can be solved in O(|G| × |path|) Can be proved by using automata theory: 1. consider G as an NFA with :a initial state and :b final state 2. construct from path an NFA Apath 3. construct the product automaton G × Apath: ◮ whenever (x, r, y) ∈ G and (p, r, q) is a transition in Apath add a transition ((x, p), r, (y, q)) to G × Apath 4. check if we can go from (:a, q0) to (:b, qf ) in G × Apath with q0 initial state of Apath and qf some final state of Apath
  • 308. Relationship between ExistsPath and ExistsW3C Theorem (ACP12) ExistsPath and ExistsW3C are equivalent decision problems
  • 309. Relationship between ExistsPath and ExistsW3C Theorem (ACP12) ExistsPath and ExistsW3C are equivalent decision problems Corollary (ACP12) ExistsW3C can be solved in O(|G| × |path|)
  • 310. So there are possibilities for optimization SPARQL includes an operator to eliminate duplicates (DISTINCT)
  • 311. So there are possibilities for optimization SPARQL includes an operator to eliminate duplicates (DISTINCT) Corollary Property path queries with SELECT DISTINCT can be efficiently evaluated
  • 312. So there are possibilities for optimization SPARQL includes an operator to eliminate duplicates (DISTINCT) Corollary Property path queries with SELECT DISTINCT can be efficiently evaluated And we can also use DISTINCT over general queries Theorem SELECT DISTINCT SPARQL 1.1 queries are tractable in Data Complexity
  • 313. SPARQL 1.1 implementations do not take advantage of SELECT DISTINCT SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x } Input ARQ RDFQ Kgram Sesame Psparql Gleen
  • 314. SPARQL 1.1 implementations do not take advantage of SELECT DISTINCT SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x } Input ARQ RDFQ Kgram Sesame Psparql Gleen
  • 315. SPARQL 1.1 implementations do not take advantage of SELECT DISTINCT SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x } Input ARQ RDFQ Kgram Sesame Psparql Gleen 9.2KB 2.24 47.31 2.37 – 0.29 1.39 10.9KB 2.60 204.95 6.43 – 0.30 1.32 11.4KB 6.88 3222.47 80.73 – 0.30 1.34 13.2KB 24.42 – 394.61 – 0.31 1.38 14.8KB – – – – 0.33 1.38 17.2KB – – – – 0.35 1.42 20.5KB – – – – 0.44 1.50 25.8KB – – – – 0.45 1.52
  • 316. SPARQL 1.1 implementations do not take advantage of SELECT DISTINCT SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x } Input ARQ RDFQ Kgram Sesame Psparql Gleen 9.2KB 2.24 47.31 2.37 – 0.29 1.39 10.9KB 2.60 204.95 6.43 – 0.30 1.32 11.4KB 6.88 3222.47 80.73 – 0.30 1.34 13.2KB 24.42 – 394.61 – 0.31 1.38 14.8KB – – – – 0.33 1.38 17.2KB – – – – 0.35 1.42 20.5KB – – – – 0.44 1.50 25.8KB – – – – 0.45 1.52 Optimization possibilities can remain hidden in a complicated specification
  • 317. New semantics for property paths (July 2012) ◮ Paths constructed only from / and | should be counted ◮ As soon as * is used, duplicates are eliminated (from that part of the expression)
  • 318. New semantics for property paths (July 2012) ◮ Paths constructed only from / and | should be counted ◮ As soon as * is used, duplicates are eliminated (from that part of the expression) For example, consider path1, path2, and path3 not using * then when evaluating { :a path1/(path2)*/path3 :b } one should (intuitively):
  • 319. New semantics for property paths (July 2012) ◮ Paths constructed only from / and | should be counted ◮ As soon as * is used, duplicates are eliminated (from that part of the expression) For example, consider path1, path2, and path3 not using * then when evaluating { :a path1/(path2)*/path3 :b } one should (intuitively): ◮ consider all the paths from :a to some intermediate :c1 satisfying path1
  • 320. New semantics for property paths (July 2012) ◮ Paths constructed only from / and | should be counted ◮ As soon as * is used, duplicates are eliminated (from that part of the expression) For example, consider path1, path2, and path3 not using * then when evaluating { :a path1/(path2)*/path3 :b } one should (intuitively): ◮ consider all the paths from :a to some intermediate :c1 satisfying path1 ◮ check if there exists :c2 reachable from :c1 following (path2)*
  • 321. New semantics for property paths (July 2012) ◮ Paths constructed only from / and | should be counted ◮ As soon as * is used, duplicates are eliminated (from that part of the expression) For example, consider path1, path2, and path3 not using * then when evaluating { :a path1/(path2)*/path3 :b } one should (intuitively): ◮ consider all the paths from :a to some intermediate :c1 satisfying path1 ◮ check if there exists :c2 reachable from :c1 following (path2)* ◮ consider all the paths from :c2 to :b satisfying path3
  • 322. New semantics for property paths (July 2012) ◮ Paths constructed only from / and | should be counted ◮ As soon as * is used, duplicates are eliminated (from that part of the expression) For example, consider path1, path2, and path3 not using * then when evaluating { :a path1/(path2)*/path3 :b } one should (intuitively): ◮ consider all the paths from :a to some intermediate :c1 satisfying path1 ◮ check if there exists :c2 reachable from :c1 following (path2)* ◮ consider all the paths from :c2 to :b satisfying path3 ◮ make the count (to produce copies)
  • 323. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives?
  • 324. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives? ◮ to have different operators for counting (e.g. . and ||) so the user can decide.
  • 325. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives? ◮ to have different operators for counting (e.g. . and ||) so the user can decide. ◮ the honest approach: just make the count and output infininty in the presence of cycles
  • 326. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives? ◮ to have different operators for counting (e.g. . and ||) so the user can decide. ◮ the honest approach: just make the count and output infininty in the presence of cycles ◮ count the number of simple paths
  • 327. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives? ◮ to have different operators for counting (e.g. . and ||) so the user can decide. ◮ the honest approach: just make the count and output infininty in the presence of cycles ◮ count the number of simple paths ◮ does someone from the audience have another in mind?
  • 328. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives? ◮ to have different operators for counting (e.g. . and ||) so the user can decide. ◮ the honest approach: just make the count and output infininty in the presence of cycles ◮ count the number of simple paths ◮ does someone from the audience have another in mind? In all the above cases you have to decide what exactly you count:
  • 329. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives? ◮ to have different operators for counting (e.g. . and ||) so the user can decide. ◮ the honest approach: just make the count and output infininty in the presence of cycles ◮ count the number of simple paths ◮ does someone from the audience have another in mind? In all the above cases you have to decide what exactly you count: ◮ the number of paths satisfying the expression? ◮ the number of ways that the expr can be satisfied? (W3C)
  • 330. Is the new semantics the right one? W3C definitely wants to count paths. Are there more reasonable alternatives? ◮ to have different operators for counting (e.g. . and ||) so the user can decide. ◮ the honest approach: just make the count and output infininty in the presence of cycles ◮ count the number of simple paths ◮ does someone from the audience have another in mind? In all the above cases you have to decide what exactly you count: ◮ the number of paths satisfying the expression? ◮ the number of ways that the expr can be satisfied? (W3C) (they are not the same! consider for example (a|a) )
  • 331. Several work to do on navigational queries for graphs! Other lines of research with open questions: ◮ Nested regular expressions and nSPARQL [PAG09,BPR12] ◮ Queries that can output paths (e.g. ECRPQs [BLHW10]) ◮ More complexity results on counting paths [LM12] What is the right language (and the right semantics) for navigating RDF graphs?
  • 332. Outline Basics of SPARQL Syntax and Semantics of SPARQL 1.0 What is new in SPARQL 1.1 Federation: SERVICE operator Syntax and Semantics Evaluation of SERVICE queries Navigation: Property Paths Navigating graphs with regular expressions The history of paths (in SPARQL 1.1 specification) Evaluation procedures and complexity
  • 333. Concluding remarks ◮ Federation and Navigation are fundamental features in SPARQL 1.1 ◮ They need formalization and (serious) study
  • 334. Concluding remarks ◮ Federation and Navigation are fundamental features in SPARQL 1.1 ◮ They need formalization and (serious) study ◮ Do not runaway from Theory! it can really help (and has helped) to understand the implications of design decisions