Modeling and Querying Metadata in the Semantic Sensor Web: stRDF and stSPARQL
1. '
&
$
%
Modeling and Querying Metadata in the
Semantic Sensor Web: stRDF and stSPARQL
Manolis Koubarakis
Kostis Kyzirakos
Manos Karpathiotakis
Dept. of Informatics and Telecommunications
National and Kapodistrian University of Athens
Dagstuhl SSN 26/01/2010
2. '
&
$
%
Talk Outline
• Motivation
• The Data Model stRDF
• The Query Language stSPARQL
• Implementation
• Future Work
Dagstuhl SSN 26/01/2010
3. '
&
$
%
Motivation
• The vision of the Semantic Sensor Web: annotate sensor
data and services to enable discovery, integration,
interoperability etc. (Sheth et al. 2008, SemsorGrid4Env)
• Sensor annotations involve thematic, spatial and temporal
metadata. Examples:
– The sensor measures temperature. (thematic)
– The sensor is located in the location represented by point
(A, B). (spatial)
– The sensor measured −30
Celsius on 26/01/2010 at
03:00pm. (temporal)
• RDF can represent thematic metadata. How about spatial and
temporal metadata?
Dagstuhl SSN 26/01/2010
4. '
&
$
%
Previous Work
• Time in RDF (Gutierrez and colleagues, 2005).
• SPARQL-ST (Perry, 2008).
• SPAUK (Kollas, 2007)
Dagstuhl SSN 26/01/2010
5. '
&
$
%
Our Approach
• Use ideas from constraint databases (Kanellakis, Kuper and
Revesz, 1991). Slogan: What’s in a tuple? Constraints.
• Extend RDF to a constraint database model. Slogan: What’s
in a triple? Constraints.
• Extend SPARQL to a constraint query language.
• Follow exactly the approach of CSQL (Kuper et al., 1998).
• Use linear constraints to represent spatial and temporal
metadata.
Dagstuhl SSN 26/01/2010
6. '
&
$
%
Spatial Metadata
• We start with a FO language L = {≤, +} ∪ Q over the structure
Q = Q, ≤, +, (q)q∈Q .
• Atomic formulae: linear equations and inequalities of the form
(
p
i=1
aixi)Θa0
where Θ is one of =, ≤ or <.
• Semi-linear point sets: sets that can be defined by quantifier-free
formulas of L.
Dagstuhl SSN 26/01/2010
8. '
&
$
%
The sRDF data model
• Let I, B and L be the sets of IRIs, blank nodes and literals.
• Let Ck be the set of quantifier-free formulae of L with k free
variables (k = 1, 2, . . .).
• Let C be the infinite union C1 ∪ C2 ∪ · · ·.
• An sRDF triple is an element of the set
(I ∪ B) × I × (I ∪ B ∪ L ∪ C).
• An sRDF graph is a set of sRDF triples.
• sRDF can be realized as an extension of RDF with a new kind
of typed literals: quantifier-free formulae with linear
constraints. The datatype of these literals is
strdf:SemiLinearPointSet.
Dagstuhl SSN 26/01/2010
9. '
&
$
%
Example of sRDF graph
ex:s1 rdf:type, ex:Sensor .
ex:s1 ex:has_location "x=40 and y=15"^^strdf:SemiLinearPointSet .
Dagstuhl SSN 26/01/2010
10. '
&
$
%
Introducing Time
• Time structure: the set of rational numbers Q (i.e., time is
assumed to be linear, dense and unbounded).
• Temporal constraints are expressed by quantifier-free formulas
of the language L defined earlier, but their syntax is limited to
elements of the set C1.
• Atomic temporal constraints: formulas of L of the
following form: x ∼ c, where x is a variable, c is a rational
number and ∼ is <, ≤, ≥, >, = or =.
• Temporal constraints: Boolean combinations of atomic
temporal constraints using a single variable.
Dagstuhl SSN 26/01/2010
11. '
&
$
%
Example
1 5 10 12 15
t = 1 ∨ (t ≥ 5 ∧ t ≤ 10) ∨ (t ≥ 12 ∧ t ≤ 15)
Dagstuhl SSN 26/01/2010
12. '
&
$
%
The stRDF data model
stRDF extends sRDF with the ability to represent the valid time
of a triple:
• An stRDF quad (a, b, c, τ) is an sRDF triple (a, b, c) with a
fourth component τ which is a temporal constraint.
• An stRDF graph is a set of sRDF triples and stRDF quads.
Dagstuhl SSN 26/01/2010
13. '
&
$
%
Example of stRDF graph
ex:obs1Result om:uom ex:Celcius .
ex:obs1Result om:value "27"
"(10 <= t <= 11)"^^strdf:SemiLinearPointSet .
Dagstuhl SSN 26/01/2010
14. '
&
$
%
The Query Language stSPARQL
• Syntactic extension of SPARQL
• Formal semantics
• Closure
• Computability
• Low data complexity
• Efficient implementation
Dagstuhl SSN 26/01/2010
18. '
&
$
%
Queries - Dataset I
Spatial selection with OPTIONAL. Find the URIs of the sensors that
optionally has a grounding that has a location which is inside the rectangle
R(0, 0, 100, 100)?
select ?S ?GEO
where { ?S rdf:type ssn:Sensor .
optional {
?G rdf:type ssn:SensorGrounding .
?L rdf:type ssn:Location .
?S ssn:supports ?G .
?G ssn:haslocation ?L .
?L strdf:hasGeometry ?GEO .
filter(?GEO inside "0<=x<=100 and 0<=y<=100") } }
Dagstuhl SSN 26/01/2010
21. '
&
$
%
Example - Dataset II (cont’d)
Metadata about geographical areas:
ex:area1 rdf:type ex:RuralArea .
ex:area1 ex:hasName "Brovallen" .
ex:area1 strdf:hasGeometry
"(-x+1.363636y<-5.272576 and y<79 and -y<-13 and
x-0.090909y<131.818133) or (y<13 and x<133 and
-x-1.5y<-128.5)"^^strdf:SemiLinearPointSet .
Dagstuhl SSN 26/01/2010
22. '
&
$
%
Queries - Dataset II
Spatial and temporal selection. Find the values of all observations that
were valid at time 11 and the rural area they refer to.
select ?V ?RA
where { ?OBS rdf:type om:Observation .
?LOC rdf:type om:Location .
?R rdf:type om:ResultData .
?RA rdf:type ex:RuralArea .
?OBS om:observationLocation ?LOC .
?LOC strdf:hasGeometry ?OBSLOC .
?OBS om:result ?R .
?R om:value ?V ?T .
?RA strdf:hasGeometry ?RAGEO .
filter(?T contains "t = 11" && ?RAGEO contains ?OBSLOC) }
Dagstuhl SSN 26/01/2010
25. '
&
$
%
Example - Dataset III (cont’d)
ex:location2 strdf:hasTrajectory
"(x=10t and y=5t and 0<=t<=5) or
(y=5t and x=50 and 25<=y<=50)"^^strdf:SemiLinearPointSet.
Dagstuhl SSN 26/01/2010
26. '
&
$
%
Example - Dataset III (cont’d)
Metadata about geographical areas:
ex:area1 rdf:type ex:RuralArea .
ex:area1 ex:hasName "Brovallen" .
ex:area1 strdf:hasGeometry
"(-x+1.363636y<-5.272576 and y<79 and -y<-13 and
x-0.090909y<131.818133) or (y<13 and x<133 and
-x-1.5y<-128.5)"^^strdf:SemiLinearPointSet .
Dagstuhl SSN 26/01/2010
27. '
&
$
%
Queries - Dataset III
Intersection of an area with a trajectory. Which areas of Brovallen
were sensed by a moving sensor and when?
select (?TR[1,2] INTER ?GEO) as ?SENSEDAREA ?GEO[3]
where { ?SN rdf:type ssn:Sensor .
?Y rdf:type ssn:Location .
?X rdf:type ssn:SensorGrounding .
?RA rdf:type ex:RuralArea .
?SN ssn:supports ?X .
?X ssn:hasLocation ?Y .
?Y strdf:hasTrajectory ?TR .
?RA ex:hasName "Brovallen" .
?RA strdf:hasGeometry ?GEO .
filter(?TR[1,2] overlap ?GEO) }
Dagstuhl SSN 26/01/2010
28. '
&
$
%
Answer
?SENSEDAREA ?T1
"((y=0.5x and 0<=x<=50) or
(x=50 and 25<=y<=50)) and
((y<79 and -y<-13 and
-x+1.363636y<-5.272576 and
x-0.090909y<131.818133) or
(y<13 and x<133 and
-x-1.5y<-128.5))"
^^strdf:SemiLinearPointSet
"0 <= t <= 10"
^^strdf:SemiLinearPointSet.
Dagstuhl SSN 26/01/2010
29. '
&
$
%
One More Query
Projection and spatial function application. Find the URIs of the
sensors that are north of Brovallen.
select ?SN
where { ?SN rdf:type ssn:Sensor .
?X rdf:type ssn:SensorGrounding .
?RA rdf:type ex:RuralArea .
?SN ssn:supports ?X .
?X ssn:hasLocation ?Y .
?Y strdf:hasGeometry ?SN_LOC .
?RA ex:hasName "Brovallen".
?RA strdf:hasGeometry ?GEO .
filter(MAX(?GEO[2])<MIN(?SN_LOC[2])) }
Dagstuhl SSN 26/01/2010
31. '
&
$
%
What Else is There in stSPARQL Syntax?
• k-ary spatial terms
“(x ≥ 1 ∧ x = y) ∨ y = 7”
?GEO ∩ “(x ≥ 1 ∧ x = y) ∨ y = 7”
BD(?GEO ∩ “(x ≥ 1 ∧ x = y) ∨ y = 7”)
“(x ≥ 1 ∧ x ≤ 10 ∧ y ≥ 0 ∧ x = z)”[1, 2] ∩ “(z ≥ 0 ∧ z ≤ 10)”
• Metric spatial terms
AREA(“(x ≥ 1 ∧ x ≤ 10 ∧ y ≥ 0 ∧ x = y)”)
MIN(“(x ≥ 1 ∧ x ≤ 10 ∧ y ≥ 0 ∧ x = y)”[1])
• Spatial and temporal selection predicates
Dagstuhl SSN 26/01/2010
32. '
&
$
%
stSPARQL Semantics
• Extension of the SPARQL semantics of (Perez et al., 2006).
– Extend the concept of mapping.
– The semantics of AND, OPT, UNION remain the same.
– We need to define carefully the evaluation of spatial terms
and the semantics of spatial and temporal filters.
Dagstuhl SSN 26/01/2010
37. '
&
$
%
Future Work
• Complete the theory (complexity results).
• Complete the implementation (see demo for what works now).
• Integrate with the rest of SemsorGrid4Env architecture
components.
• Incomplete information (use the blank nodes for a bit more
than what they are currently used).
• OWL 2 and constraints?
Dagstuhl SSN 26/01/2010