1. Toward Semantic Sensor Data Archives on the Web
Jean-Paul Calbimonte – Karl Aberer
LSIR EPFL
MEPDAW, ESWC
Heraklion, Greece. June 2016
@jpcik
2. Sensor Data on the Web
http://mesowest.utah.edu/
http://earthquake.usgs.gov/earthquakes/feed/v1.0/
http://swiss-experiment.ch
• Monitoring
• Alerts
• Notifications
• Hourly/daily updates
• Myriad of Formats
• Ad-hoc access points
• Informal description
• Convention-semantics
• Uneven use of standards
• Manual exploration
3. Sensor Archives: Challenges
Discoverability:
• Subject of sensing identified and searchable.
• Explicit semantics on the sensor metadata
• Common understanding of the objects of sensing
• Agreed models e.g. ontologies
Storage:
• Persistence not always required.
• Sensor data is (sometimes) consumed live
• Aggregations stored permanently.
• Different archival options available
• Reduce volume as much as possible, using compressed formats
• Querying and transactional requirements often less critical
• Silos of sensor data in the form of compressed files.
• Replication or backup
4. Sensor Archives: Challenges
Reusability:
• Reusing the data for other purposes
• Compare data from other locations
• Use for calibration purposes
• Finding correlations.
• Historical and batch analysis
• Benchmarking
• Training datasets for mining algorithms.
• Feed numerical models
Accessibility:
• Data access through APIs
• Consumption from people/software applications.
• De-referenceable URIs
• Simple but effective retrieval of sensor data.
• SPARQL -> selecting relevant parts of the data
• Complex queries not always required
• Simple time-interval selections and filters are often enough
Interoperability & Standardization:
• RDF/SPARQL: building blocks for publishing data
• Specific ontologies and vocabularies, such as the SSN ontology
• Represent both sensor metadata and observations
5. Sensor Data & Linked Data
Zip Files
Number of Triples
Example: Nevada dataset
- 7.86 GB in N-Triples format
- 248 MB zipped
An example: Linked Sensor Data
http://wiki.knoesis.org/index.php/LinkedSensorData
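The gap between 7.86 GB of N-Triples and 248 MB zipped (roughly a 32-fold reduction) reflects how repetitive observation triples are. The following is a minimal sketch of that effect, assuming synthetic readings under a hypothetical `http://example.org/ssw` namespace; `observation_triples` and `compression_ratio` are illustrative helpers, and the ratio on real data will differ.

```python
import zlib

BASE = "http://example.org/ssw"  # hypothetical namespace, not the real dataset
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"


def observation_triples(station: str, seq: int, value: float) -> str:
    """Return three N-Triples lines describing one synthetic reading."""
    obs = f"<{BASE}/MeasureData_Precipitation_{station}_{seq}>"
    return (
        f"{obs} <{RDF_TYPE}> <{BASE}/ont#MeasureData> .\n"
        f'{obs} <{BASE}/ont#floatValue> "{value}"^^<{XSD_FLOAT}> .\n'
        f"{obs} <{BASE}/ont#uom> <{BASE}/ont#centimeters> .\n"
    )


def compression_ratio(text: str) -> float:
    """Raw size divided by zlib-compressed size (level 9)."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw, 9))


# 1000 synthetic readings: long URIs repeat, only the id and value change.
dataset = "".join(
    observation_triples("4UT01", seq, 30.0 + seq % 3) for seq in range(1000)
)
ratio = compression_ratio(dataset)
print(f"{len(dataset)} bytes raw, compression ratio {ratio:.1f}x")
```

Most of each line is a long URI shared with every other observation, which is why generic compression does so well on this kind of archive.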
6. Sensor Data & Linked Data
<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#MeasureData> .
<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00>
<http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#floatValue>
"30.0"^^<http://www.w3.org/2001/XMLSchema#float> .
<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00>
<http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#uom>
<http://knoesis.wright.edu/ssw/ont/weather.owl#centimeters> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://knoesis.wright.edu/ssw/ont/weather.owl#PrecipitationObservation> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00>
<http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#observedProperty>
<http://knoesis.wright.edu/ssw/ont/weather.owl#_Precipitation> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00>
<http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#procedure>
<http://knoesis.wright.edu/ssw/System_4UT01> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00>
<http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#samplingTime>
<http://knoesis.wright.edu/ssw/Instant_2003_3_31_5_10_00> .
<http://knoesis.wright.edu/ssw/Instant_2003_3_31_5_10_00>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://www.w3.org/2006/time#Instant> .
<http://knoesis.wright.edu/ssw/Instant_2003_3_31_5_10_00>
<http://www.w3.org/2006/time#inXSDDateTime>
"2003-03-31T05:10:00-07:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
What do we get in these datasets?
Nice triples
Do we care about all the rest?
What is measured?
Measurement
Unit
Sensor
When is it measured
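The five items listed above are what an application usually needs from the many triples of one observation. A minimal sketch of pulling them out, assuming a regex reader that handles only the simple N-Triples shapes seen in the sample (it is not a full parser); `essential_fields`, `local_name` and the record keys are hypothetical names.

```python
import re

SSW = "http://knoesis.wright.edu/ssw/"
OBS = SSW + "ont/sensor-observation.owl#"
MEASURE = SSW + "MeasureData_Precipitation_4UT01_2003_3_31_5_10_00"
OBSERVATION = SSW + "Observation_Precipitation_4UT01_2003_3_31_5_10_00"

SAMPLE = f"""\
<{MEASURE}> <{OBS}floatValue> "30.0"^^<http://www.w3.org/2001/XMLSchema#float> .
<{MEASURE}> <{OBS}uom> <{SSW}ont/weather.owl#centimeters> .
<{OBSERVATION}> <{OBS}observedProperty> <{SSW}ont/weather.owl#_Precipitation> .
<{OBSERVATION}> <{OBS}procedure> <{SSW}System_4UT01> .
<{OBSERVATION}> <{OBS}samplingTime> <{SSW}Instant_2003_3_31_5_10_00> .
"""

# Subject, predicate, then either an IRI object or a quoted literal.
TRIPLE = re.compile(r'<([^>]+)>\s+<([^>]+)>\s+(?:<([^>]+)>|"([^"]*)"\S*)\s*\.')

# Predicate local name -> key in the compact record.
WANTED = {
    "observedProperty": "property",
    "floatValue": "value",
    "uom": "unit",
    "procedure": "sensor",
    "samplingTime": "time",
}


def local_name(iri: str) -> str:
    """Return the part of an IRI after the last '#' or '/'."""
    return re.split(r"[#/]", iri)[-1]


def essential_fields(ntriples: str) -> dict:
    """Collect what/value/unit/sensor/when from one observation's triples."""
    record = {}
    for line in ntriples.splitlines():
        match = TRIPLE.match(line.strip())
        if not match:
            continue
        _, predicate, obj_iri, obj_literal = match.groups()
        key = WANTED.get(local_name(predicate))
        if key:
            record[key] = local_name(obj_iri) if obj_iri else obj_literal
    return record


record = essential_fields(SAMPLE)
print(record)
```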
7. Semantic Sensor Data Archives
How to address these challenges?
Discoverability
Reusability
Accessibility
Interoperability & Standardization
Storage
How to use existing Semantic Web technologies appropriately?
Need for new standards and techniques?
8. Localization: GNSS fused with odometry
GPRS
• packet parser
• system logging
• database server
• GPS interpolation
• advanced filtering
• fault detection
• system health monitor
• automatic reporting
10 buses in Lausanne
CO, NO2, O3, CO2,
UFP, temperature, humidity
OpenSense2 @ Lausanne
9. Reference station
Crowd sensing
Public
transportation
Raw Data
Acquisition
Air Pollutants
Time Series
Temporal
Spatial
Aggregations
Pollution Maps Pollution Models
Air Quality recommendations
Health Studies
Air Quality
Products &
Applications
From Sensing to Actionable Data
Running example for discussing a Semantic Sensor Data Archive
10. An Architecture for a Sensor Archive
Disclaimer: Work in Progress
• RDF for Sensor and Catalog metadata
• Native format for Sensor observations (time series)
• CSV archive for sensor observations
• RDF-unpack of CSV archived data
• Mappings for Native format-to-RDF live transformation
Data characteristics
11. Sensor data characteristics
Sensor data regularity
• Raw sensor data typically collected as time series
• Very regular structure.
• Patterns can be exploited
E.g. mobile NO2 sensor readings
29-02-2016T16:41:24,47,369,46.52104,6.63579
29-02-2016T16:41:34,47,358,46.52344,6.63595
29-02-2016T16:41:44,47,354,46.52632,6.63634
29-02-2016T16:41:54,47,355,46.52684,6.63729
...
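The regular structure of these rows makes them easy to load into typed records. A minimal sketch, assuming the column order is timestamp, sensor id, NO2 value, latitude, longitude and that timestamps are day-first; the `Reading` type and `parse_readings` helper are hypothetical names, not part of the archive.

```python
import csv
import io
from datetime import datetime
from typing import List, NamedTuple


class Reading(NamedTuple):
    time: datetime
    sensor: str
    no2: float
    lat: float  # assumed: fourth column is latitude
    lon: float  # assumed: fifth column is longitude


RAW = """\
29-02-2016T16:41:24,47,369,46.52104,6.63579
29-02-2016T16:41:34,47,358,46.52344,6.63595
29-02-2016T16:41:44,47,354,46.52632,6.63634
29-02-2016T16:41:54,47,355,46.52684,6.63729
"""


def parse_readings(text: str) -> List[Reading]:
    """Parse CSV rows into Reading records (day-first timestamps)."""
    readings = []
    for ts, sensor, value, lat, lon in csv.reader(io.StringIO(text)):
        when = datetime.strptime(ts, "%d-%m-%YT%H:%M:%S")
        readings.append(Reading(when, sensor, float(value), float(lat), float(lon)))
    return readings


readings = parse_readings(RAW)

# The regularity shows up as a single sampling interval between rows.
intervals = {
    (b.time - a.time).total_seconds() for a, b in zip(readings, readings[1:])
}
print(intervals)
```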
Sensor data order
• Order of sensor data is crucial
• Time is the key attribute for establishing an order among the data items.
• Important for indexing
• Enables efficient time-based selection, filtering and windowing
Columns: Timestamp, Sensor, Observed Value, Coordinates
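Because readings arrive in time order, a time-interval selection can use binary search instead of a full scan. A minimal sketch, assuming the series is already sorted by timestamp and held in memory; `select_window` is a hypothetical helper and the values are taken from the NO2 rows above.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# (timestamp, value) pairs, already sorted by time as in the archive.
series = [
    (datetime(2016, 2, 29, 16, 41, 24), 369.0),
    (datetime(2016, 2, 29, 16, 41, 34), 358.0),
    (datetime(2016, 2, 29, 16, 41, 44), 354.0),
    (datetime(2016, 2, 29, 16, 41, 54), 355.0),
]
times = [t for t, _ in series]  # the sorted key doubles as the index


def select_window(start: datetime, end: datetime):
    """Return readings with start <= timestamp <= end in O(log n) lookups."""
    lo = bisect_left(times, start)
    hi = bisect_right(times, end)
    return series[lo:hi]


window = select_window(
    datetime(2016, 2, 29, 16, 41, 30),
    datetime(2016, 2, 29, 16, 41, 50),
)
print([value for _, value in window])
```

The same idea carries over to on-disk archives: if files are partitioned and sorted by time, a query only needs to touch the partitions overlapping the requested interval.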
17. Sensor Observations
:no2obs1 a :NO2Observation ;
  ssn:observedProperty :NO2 ;
  ssn:featureOfInterest aq:AirMedium ;
  ssn:observedBy :NO2SensorBox ;
  ssn:observationResult :no2obs1result ;
  ssn:observationResultTime :instant_20160331232000 .
:no2obs1result a :NO2ObservationValue ;
  qu:numericalValue "345.00"^^xsd:float ;
  qu:unit :ppm .
:instant_20160331232000 a time:Instant ;
  time:inXSDDateTime "2016-03-31T23:20:00"^^xsd:dateTime .
Type of Measurement
Sensor
Observed Value
Unit
Generated only on demand through mappings
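Generating the RDF only on demand can be pictured as rendering each archived row into this observation shape at request time. A minimal sketch, assuming rows of (sensor id, timestamp, NO2 value) and that the prefixes are declared elsewhere; `row_to_turtle`, `observations` and the subject naming scheme are hypothetical, and this stands in for, rather than reproduces, the mapping engine.

```python
from datetime import datetime

# Turtle shape of one observation; prefixes are assumed declared elsewhere.
TEMPLATE = """\
:no2obs_{key} a :NO2Observation ;
  ssn:observedProperty :NO2 ;
  ssn:featureOfInterest aq:AirMedium ;
  ssn:observedBy :NO2SensorBox ;
  ssn:observationResult :no2obs_{key}_result ;
  ssn:observationResultTime :instant_{stamp} .
:no2obs_{key}_result a :NO2ObservationValue ;
  qu:numericalValue "{value:.2f}"^^xsd:float ;
  qu:unit :ppm .
:instant_{stamp} a time:Instant ;
  time:inXSDDateTime "{iso}"^^xsd:dateTime .
"""


def row_to_turtle(sensor: str, when: datetime, value: float) -> str:
    """Render one archived row as Turtle (hypothetical naming scheme)."""
    stamp = when.strftime("%Y%m%d%H%M%S")
    return TEMPLATE.format(
        key=f"{sensor}_{stamp}", stamp=stamp, value=value, iso=when.isoformat()
    )


def observations(rows):
    """Lazily yield Turtle so nothing is materialised until requested."""
    for sensor, when, value in rows:
        yield row_to_turtle(sensor, when, value)


rows = [("47", datetime(2016, 3, 31, 23, 20, 0), 345.0)]
turtle = next(observations(rows))
print(turtle)
```

Using a generator keeps the archive in its compact native form; the verbose triples exist only for as long as a client is reading them.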
18. R2RML Mappings
:ObsValueMap
  rr:subjectMap [
    rr:template "http://opensense.epfl.ch/data/ObsResult_NO2_{sensor}_{time}" ];
  rr:predicateObjectMap [
    rr:predicate qu:numericalValue;
    rr:objectMap [ rr:column "no2"; rr:datatype xsd:float ] ];
  rr:predicateObjectMap [
    rr:predicate obs:uom;
    rr:objectMap [ rr:parentTriplesMap :UnitMap ] ].
:ObservationMap
  rr:subjectMap [
    rr:template "http://opensense.epfl.ch/data/Obs_NO2_{sensor}_{time}" ];
  rr:predicateObjectMap [
    rr:predicate ssn:observedProperty;
    rr:objectMap [ rr:constant opensense:NO2 ] ].
URI of subject
URI of predicate
Object: column name
Column names in a template
Can be used for mapping both databases and CSVs
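To make the role of column names in a template concrete, the sketch below expands an `rr:template` string against one row. It is a simplified illustration, not an R2RML processor: `expand_template` is a hypothetical helper, the row values are invented for the example, and percent-encoding via `urllib.parse.quote` is used as an approximation of the IRI-safe encoding that R2RML prescribes for template values.

```python
import re
from urllib.parse import quote


def expand_template(template: str, row: dict) -> str:
    """Replace each {column} with the percent-encoded value from the row."""

    def substitute(match):
        column = match.group(1)
        # safe="" also encodes '/' and ':' so values cannot alter URI structure.
        return quote(str(row[column]), safe="")

    return re.sub(r"\{([^{}]+)\}", substitute, template)


# Hypothetical row; keys play the role of database or CSV column names.
row = {"sensor": "47", "time": "2016-02-29T16:41:24", "no2": "369"}

subject = expand_template(
    "http://opensense.epfl.ch/data/Obs_NO2_{sensor}_{time}", row
)
print(subject)
```

Since the function only needs a column-name-to-value lookup, the same expansion works whether the row comes from a SQL result set or a parsed CSV line.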