How to make data more usable on the Internet of Things
1. 1
Payam Barnaghi
Centre for Communication Systems Research (CCSR)
Faculty of Engineering and Physical Sciences
University of Surrey
March 2013
7. 7
Wireless Sensor Networks (WSN)
[Figure: sink node, gateway, core network (e.g. Internet), gateway, end-user, computer services]
- The networks typically run on low-power devices
- Each node consists of one or more sensors, possibly of different types (or actuators)
8. 8
Key characteristics of IoT devices
−Often inexpensive sensors (actuators) equipped with a radio
transceiver for various applications, typically low data rate ~
10-250 kbps.
−Deployed in large numbers
−The sensors should coordinate to perform the desired task.
−The acquired information (periodic or event-based) is
reported back to the information processing centre (or
sometimes in-network processing is required)
−Solutions are application-dependent.
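The difference between periodic and event-based reporting can be sketched as follows. This is a minimal illustration and not part of the original slides; the function name `event_based_reports` and the threshold value are assumptions chosen for the example.

```python
def event_based_reports(readings, threshold):
    """Return the (index, value) pairs a node would transmit.

    A reading is reported only when it differs from the last reported
    value by more than `threshold` (a hypothetical parameter). The first
    reading is always reported. Periodic reporting would send every reading.
    """
    reports = []
    last_sent = None
    for index, value in enumerate(readings):
        if last_sent is None or abs(value - last_sent) > threshold:
            reports.append((index, value))
            last_sent = value
    return reports


readings = [20.0, 20.1, 20.2, 22.5, 22.6, 19.0]
sent = event_based_reports(readings, threshold=1.0)
print(f"periodic: {len(readings)} messages, event-based: {len(sent)} messages")
```

With a low data rate and a limited energy budget, sending fewer messages is usually the point of choosing event-based reporting.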
9. 9
Beyond conventional sensors
− Human as a sensor (citizen sensors)
− e.g. tweeting real world data and/or events
− Virtual (software) sensors
− e.g. Software agents/services generating/representing
data
[Figure: "Road block, A3" is reported; response: "Suggest a different route"]
14. 14
Making Sense of Data
In the next few years, sensor networks will produce
10-20 times the amount of data generated by social
media. (source: GigaOmni Media)
15. 15
Things, Data, and lots of it
image courtesy: Smarter Data - I.03_C by Gwen Vanhee
16. 16
Big Data and IoT
− "Big data" is a term applied to data sets whose size is beyond the
ability of commonly used software tools to capture, manage, and
process the data within a tolerable elapsed time. Big data sizes
are a constantly moving target, as of 2012 ranging from a few
dozen terabytes to many petabytes of data in a single data set.
(Wikipedia)
− Every day, we create 2.5 quintillion bytes of data — so much that
90% of the data in the world today has been created in the last
two years alone. (source IBM)
17. 17
The seduction of data
− Turn 12 terabytes of Tweets created each day into sentiment
analysis related to different events/occurrences or relate them to
products and services.
− Convert (billions of) smart meter readings to better predict and
balance power consumption.
− Analyze thousands of traffic, pollution, weather, congestion, public
transport and event sensory data to provide better traffic
management.
− Monitor patients, elderly care and much more…
Adapted from: What is Big Data?, IBM
19. 19
“Raw data is both an oxymoron and
bad data”
Geoff Bowker, 2005
Source: Kate Crawford, "Algorithmic Illusions: Hidden Biases of Big Data", Strata 2013.
20. 20
IoT Data in the Cloud
Image courtesy: http://images.mathrubhumi.com
http://www.anacostiaws.org/userfiles/image/Blog-Photos/river2.jpg
22. 22
Change in communication paradigm
[Figure: a sample data communication in conventional networks (data sender to data receiver, "Some bits 01100011100") compared with a sample data communication in WSN (sink node, gateway, core network e.g. Internet, end-user: "Fire!")]
23. 23
Required mechanisms
− Collaboration and in-network processing
− In some applications a single sensor node is not able to handle
the given task or provide the requested information.
− Instead of sending the information from various sources to an
external network/node, the information can be processed in
the network itself.
− e.g. data aggregation and summarisation, then propagating the
processed data with reduced size (hence improving energy
efficiency by reducing the amount of data to be transmitted).
− Data-centric
− Conventional networks often focus on sending data between
two specific nodes, each equipped with an address.
− Here what is important is the data (the observations and
measurements), not the node that provides it.
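In-network aggregation can be sketched as follows: an intermediate node summarises the readings it has collected and forwards one small record instead of every value. This is an illustrative sketch only; the node names, readings and the `aggregate` helper are assumptions, not taken from the slides.

```python
from statistics import mean


def aggregate(readings):
    """Summarise a list of numeric readings into one small record."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


# Hypothetical readings collected from three neighbouring sensor nodes.
node_readings = {
    "node-1": [21.0, 21.5],
    "node-2": [22.0, 22.5],
    "node-3": [21.0, 24.0],
}

# The aggregating node flattens what it received and forwards one summary.
all_values = [v for values in node_readings.values() for v in values]
summary = aggregate(all_values)
print(f"forwarding 1 record instead of {len(all_values)} values: {summary}")
```

The summary does not say which node produced which value, which matches the data-centric view: the consumer cares about the observation, not the address of the node.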
24. “People want answers, not numbers”
(Steven Glaser, UC Berkeley)
[Figure: sink node, gateway, core network (e.g. Internet)]
What is the temperature at home? Freezing!
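Turning a number into an answer can be as simple as mapping a reading onto a label. The sketch below is only an illustration of that idea; the function name and the threshold values are assumptions, and a real system would take them from domain knowledge.

```python
def describe_temperature(celsius):
    """Map a temperature reading in degrees Celsius to a human-readable label.

    The thresholds are illustrative assumptions, not values from the slides.
    """
    if celsius <= 0:
        return "Freezing!"
    if celsius < 15:
        return "Cold"
    if celsius < 25:
        return "Comfortable"
    return "Hot"


reading = -2.0  # hypothetical value reported by a home temperature sensor
print(f"What is the temperature at home? {describe_temperature(reading)}")
```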
25. 25
IoT Data alone is not enough
− Domain knowledge
− Machine-interpretable metadata
− Delivery, sharing and representation services
− Query, discovery, aggregation services
− Publish, subscribe, notification, and access
interfaces/services
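A publish/subscribe interface of the kind listed above can be sketched in a few lines. This is a minimal in-memory illustration; the `Broker` class, the topic names and the message fields are all assumptions, and a deployment would normally rely on an existing messaging protocol or service.

```python
class Broker:
    """A tiny in-memory publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        """Register a callable to be notified of messages on `topic`."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        """Deliver `message` to every subscriber; return how many were notified."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
received = []
broker.subscribe("home/temperature", received.append)

notified = broker.publish("home/temperature", {"value": -2.0, "unit": "Cel"})
ignored = broker.publish("home/light", {"value": 300, "unit": "lux"})
print(received, notified, ignored)
```

The publisher does not need to know who consumes the data, which keeps sensors and applications loosely coupled.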
27. 27
IoT Data Challenges
− Discovery: finding appropriate devices and data sources
− Access: availability of and (open) access to IoT resources and
data
− Search: querying for data
− Integration: dealing with heterogeneous devices, networks
and data
− Interpretation: translating data to knowledge usable by
people and applications
− Scalability: dealing with a large number of devices, a
myriad of data and the computational complexity of
interpreting the data.
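Discovery and search both depend on resources carrying metadata that can be queried. The following sketch shows the idea with a plain list of descriptions; the registry contents, field names and the `discover` function are assumptions made for illustration.

```python
# Hypothetical registry of IoT resources and their descriptions.
resources = [
    {"id": "sensor-1", "type": "temperature", "location": "BA0121", "unit": "Cel"},
    {"id": "sensor-2", "type": "light", "location": "BA0121", "unit": "lux"},
    {"id": "sensor-3", "type": "temperature", "location": "corridor", "unit": "Cel"},
]


def discover(registry, **criteria):
    """Return the ids of resources whose metadata matches every criterion."""
    return [
        resource["id"]
        for resource in registry
        if all(resource.get(key) == value for key, value in criteria.items())
    ]


print(discover(resources, type="temperature", location="BA0121"))
print(discover(resources, location="BA0121"))
```

A linear scan like this does not scale to a large number of devices, which is why indexing and distributed discovery appear among the challenges.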
28. 28
Interpretation of data
− A primary goal of interconnecting devices and
collecting/processing data from them is to create
situation awareness and enable applications,
machines, and human users to better understand
their surrounding environments.
− The understanding of a situation, or context,
potentially enables services and applications to
make intelligent decisions and to respond to the
dynamics of their environments.
29. 29
Observation and measurement data
Source: W3C Semantic Sensor Networks, SSN Ontology presentation, Laurent Lefort et al.
30. 30
How to say what a sensor is and
what it measures?
[Figure: sink node, gateway]
31. 31
Data/Service description frameworks
− There are standards such as the Sensor Web Enablement
(SWE) set developed by the Open Geospatial Consortium
that are being widely adopted in industry, government and
academia.
− While such frameworks provide some interoperability,
semantic technologies are increasingly seen as a key enabler
for the integration of IoT data and broader Web information
systems.
33. 33
W3C SSN Ontology
Ontology Link: http://www.w3.org/2005/Incubator/ssn/ssnx/ssn
M. Compton et al, "The SSN Ontology of the W3C Semantic Sensor Network Incubator Group", Journal of Web Semantics, 2012.
34. 34
W3C SSN Ontology
[Figure: SSN-XG ontology scope. Labels: makes observations of this type; where it is; what it measures; units; SSN-XG ontologies; SSN-XG annotations]
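Saying what a sensor is, where it is and what it measures can be sketched as a handful of subject-predicate-object triples. The example below uses plain Python tuples rather than an RDF library. The `ssn:Sensor`, `ssn:observes` and `ssn:onPlatform` terms and the namespace string are written from memory of the SSN-XG ontology and should be checked against the ontology link above; everything under `http://example.org/` (including `hasUnit`) is a hypothetical name invented for this illustration.

```python
SSN = "http://purl.oclc.org/NET/ssnx/ssn#"   # assumed SSN-XG namespace
EX = "http://example.org/iot/"               # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

sensor = EX + "sensor-1"

triples = [
    (sensor, RDF_TYPE, SSN + "Sensor"),                # what it is
    (sensor, SSN + "observes", EX + "AirTemperature"), # what it measures
    (sensor, SSN + "onPlatform", EX + "room-BA0121"),  # where it is
    (sensor, EX + "hasUnit", EX + "DegreeCelsius"),    # units (hypothetical term)
]


def objects(graph, subject, predicate):
    """Return every object for the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


print(objects(triples, sensor, SSN + "observes"))
```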
35. 35
Semantics and IoT data
− Creating ontologies and defining data models is not enough
− tools to create and annotate data
− data handling components
− Complex models and ontologies look good, but
− design lightweight versions for constrained environments
− think of practical issues
− make it as compatible as possible and/or link it to the other
existing ontologies
− Domain knowledge and instances
− Common terms and vocabularies
− Location, unit of measurement, type, theme, …
− Link it to other resources
− Linked-data
− URIs and naming
36. 36
Semantics and sensor data
Source: W. Wang, P. Barnaghi, "Semantic Annotation and Reasoning for Sensor Data", In proceedings of the 4th European Conference on Smart
Sensing and Context (EuroSSC2009), 2009.
37. 37
Semantics and Linked-data
− The principles for designing linked data are
defined as:
− using URIs as names for things;
− using HTTP URIs to enable people to look up those
names;
− providing useful RDF information related to URIs that are
looked up by machines or people;
− including RDF statements that link to other URIs to
enable discovery of other related concepts in the Web of
Data.
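The last principle, linking to other URIs so that related concepts can be discovered, can be sketched as a walk over links. The sketch below replaces real HTTP look-ups with an in-memory dictionary; all `http://example.org/` URIs and the `locatedIn` property are hypothetical, and `owl:sameAs` is used only as a familiar example of a linking property.

```python
# In-memory stand-in for "looking up" a URI: each URI maps to the
# (predicate, object) statements that a look-up would return.
web_of_data = {
    "http://example.org/sensor/1": [
        ("http://example.org/vocab/locatedIn", "http://example.org/place/Guildford"),
    ],
    "http://example.org/place/Guildford": [
        ("http://www.w3.org/2002/07/owl#sameAs", "http://example.org/other/Guildford"),
    ],
    "http://example.org/other/Guildford": [],
}


def follow_links(start, graph):
    """Breadth-first walk over URI links, returning URIs in discovery order."""
    seen = []
    queue = [start]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for _predicate, obj in graph.get(uri, []):
            if obj.startswith("http://"):
                queue.append(obj)
    return seen


discovered = follow_links("http://example.org/sensor/1", web_of_data)
print(discovered)
```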
39. 39
Linked Open Data
Collectively, the 203 data sets consist of over 25 billion RDF triples,
which are interlinked by around 395 million RDF links (September
2010).
41. 41
Myth and reality
− #1: If we create an ontology, our data is interoperable.
− Reality: there are/could be a number of ontologies for a domain
− Ontology mapping
− Reference ontologies
− Standardisation efforts
− #2: Semantic data will make my data machine-understandable
and my system will be intelligent.
− Reality: it is still metadata; machines don’t understand it but can interpret it. It
still needs intelligent processing and reasoning mechanisms to process and
interpret the data.
− #3: It’s hype! Ontologies and semantic data are too much
overhead; we deal with tiny devices in IoT.
− Reality: ontologies are a way to share and agree on a common vocabulary and
knowledge; at the same time they are machine-interpretable and represented in
interoperable and re-usable forms.
− You don’t necessarily need to add semantic metadata at the source; it could be
added to the data at a later stage (e.g. in a gateway).
− Legacy applications can ignore it or be extended to work with it.
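Adding metadata at a later stage can be sketched as a gateway that wraps a bare value from a constrained device in a description it already holds. The field names, the metadata values and both functions are assumptions made for this illustration.

```python
def annotate_at_gateway(raw_value, sensor_metadata, timestamp):
    """Attach gateway-held metadata to a bare value sent by a tiny device."""
    record = dict(sensor_metadata)  # copy, so the stored description is untouched
    record["value"] = raw_value
    record["timestamp"] = timestamp
    return record


def legacy_consumer(record):
    """A legacy application that only reads the value and ignores the rest."""
    return record["value"]


# Hypothetical description kept at the gateway, not on the device.
metadata = {"sensor": "sensor-1", "observes": "AirTemperature", "unit": "Cel"}

annotated = annotate_at_gateway(21.5, metadata, "2013-03-01T10:00:00Z")
print(annotated)
print(legacy_consumer(annotated))
```

The device transmits only the number, so the overhead of the description is paid at the gateway rather than on the constrained link.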
43. 43
Symbolic Aggregate Approximation (SAX)
Variable string length and vocabulary size.
Length: 10, VocSize: 10: “gijigdbabd”
Length: 10, VocSize: 4: “cdddcbaaab”
Green curve: consists of 100 samples; blue curve: SAX
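The basic SAX procedure can be sketched as three steps: normalise the series, average it over equal-length segments, and map each segment mean to a letter using breakpoints that split a standard normal distribution into equally probable regions. The sketch below is a textbook-style illustration and is not the code used to produce the curves on this slide; it assumes the series length is a multiple of the word length, and the sine wave input is made up.

```python
import math
from statistics import NormalDist, mean, pstdev


def sax(series, word_length, vocab_size):
    """Convert a numeric series into a SAX word of `word_length` letters."""
    if len(series) % word_length != 0:
        raise ValueError("series length must be a multiple of word_length")

    # Step 1: z-normalise (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    normalised = [0.0 if sigma == 0 else (x - mu) / sigma for x in series]

    # Step 2: piecewise aggregate approximation, one mean per segment.
    segment = len(series) // word_length
    paa = [mean(normalised[i * segment:(i + 1) * segment])
           for i in range(word_length)]

    # Step 3: breakpoints giving equally probable regions under N(0, 1).
    breakpoints = [NormalDist().inv_cdf(k / vocab_size)
                   for k in range(1, vocab_size)]

    # Each segment mean becomes the letter of the region it falls in.
    letters = []
    for value in paa:
        region = sum(1 for b in breakpoints if value > b)
        letters.append(chr(ord("a") + region))
    return "".join(letters)


# 100 samples, as on the slide, but from a synthetic sine wave.
samples = [math.sin(2 * math.pi * i / 100) for i in range(100)]
print(sax(samples, word_length=10, vocab_size=10))
print(sax(samples, word_length=10, vocab_size=4))
```

A smaller vocabulary gives a coarser word of the same length, which is the trade-off the two examples on the slide illustrate.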
44. 44
SAX representation
SAX Pattern (blue) with word length of 20 and a vocabulary of 10 symbols
over the original sensor time-series data (green)
P. Barnaghi, F. Ganz, C. Henson, A. Sheth, "Computing Perception from Sensor Data",
in Proc. of the IEEE Sensors 2012, Oct. 2012.
fggfffhfffffgjhghfff
jfhiggfffhfffffgjhgi
fggfffhfffffgjhghfff
45. 45
Data Processing Framework
[Figure: data processing framework, from bottom to top.
Raw sensor data streams (PIR sensor, light sensor, temperature sensor), annotated with the SSN Ontology.
SAX patterns, e.g. fggfffhfffffgjhghfff, dddfffffffffffddd, cccddddccccdddccc, aaaacccaaaaaaaaccccdddcdcdcdcddasddd.
Observations: thematic data (low-level abstractions) such as Attendance, Phone, Hot Temperature, Cold Temperature, Bright, Day-time, Night-time; temporal data (extracted from SSN descriptions); spatial data (extracted from SSN descriptions), e.g. Office room BA0121.
Perception computation: Parsimonious Covering Theory and domain knowledge.
High-level perceptions, e.g. On going meeting, Window has been left open, ….]
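The perception computation step can be sketched as a covering problem: given the observed low-level abstractions, find the smallest sets of candidate perceptions whose known effects account for all of them. This is a simplified, minimum-cardinality reading of parsimonious covering and not the authors' implementation. The associations in `knowledge` are hypothetical, and "Heating is on" is an invented perception added only to give the search something to reject.

```python
from itertools import combinations

# Hypothetical domain knowledge: perception -> observations it can explain.
knowledge = {
    "On going meeting": {"Attendance", "Bright", "Phone"},
    "Window has been left open": {"Cold Temperature"},
    "Heating is on": {"Hot Temperature"},
}


def parsimonious_covers(observations, domain):
    """Return all smallest sets of perceptions that explain every observation."""
    if not observations:
        return [set()]
    candidates = sorted(domain)
    for size in range(1, len(candidates) + 1):
        covers = []
        for combo in combinations(candidates, size):
            explained = set().union(*(domain[name] for name in combo))
            if observations <= explained:
                covers.append(set(combo))
        if covers:
            return covers  # stop at the smallest size that works
    return []  # the observations cannot be explained


print(parsimonious_covers({"Attendance", "Phone"}, knowledge))
print(parsimonious_covers({"Attendance", "Cold Temperature"}, knowledge))
```

Trying every combination is exponential in the number of candidate perceptions, so this form is only suitable for small knowledge bases.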
46. 46
SensorSAX
F. Ganz, P. Barnaghi, F. Carrez, “Information Abstraction for Heterogeneous Real World Internet
Data”, Feb. 2013.
47. 47
Evaluation results of abstraction
creation
F. Ganz, P. Barnaghi, F. Carrez, “Information Abstraction for Heterogeneous Real World Internet Data”, Feb. 2013.
48. 48
Data size reduction
F. Ganz, P. Barnaghi, F. Carrez, “Information Abstraction for Heterogeneous Real World Internet Data”, Feb. 2013.
49. 49
Enabling the Internet of Things
- Diverse range of applications
- Interacting with a large number
of devices of various types
- Multiple heterogeneous
networks
- Deluge of data
- Processing and interpretation of
the IoT data
50. 50
Challenges and opportunities
− Providing infrastructure
− Publishing, sharing, and access solutions on a global scale
− Indexing and discovery (data and resources)
− Aggregation and fusion
− Trust, privacy and security
− Data mining and creating actionable knowledge
− Integration into services and applications in e-health, the public
sector, retail, manufacturing and personalized apps.
− Mobile apps, location-based services, monitoring, control, etc.
− New business models
51. 51
Events
Semantic Interop event, European Wireless Conference,
Guildford, April 2013.
http://www.probe-it.eu/?p=1206
Tutorial at WIMS'13: Data Processing and Semantics for
Advanced Internet of Things (IoT) Applications:
modeling, annotation, integration, and perception, P.
Anantharam, P. Barnaghi, A. Sheth,
http://aida.ii.uam.es/wims13/keynotes.php
Dagstuhl seminar on Cyber-Physical-Social Computing, Sept.
30- Oct. 04, 2013, Organizers: Payam Barnaghi, Ramesh
Jain, Amit Sheth, Steffen Staab, Markus Strohmaier.
53. 53
Payam Barnaghi
Centre for Communication Systems Research
Faculty of Engineering and Physical Sciences
University of Surrey
p.barnaghi@surrey.ac.uk