Sensor Data Management @ EPFL


          Karl Aberer
Overview

  Sensor Data Management
  –    Global Sensor Networks
  –    Swiss Experiment
  –    Sensor Metadata Management
  –    Time Series compression and retrieval
  –    Sensor data analysis and quality
  –    Economics-based resource allocation in distributed clouds
  –    Cloud-based time series management system
  Web Data Management
  –  Large-scale Semantic Data Integration
  –  Web Stream Data Analysis (Twitter)
Global Sensor Networks (GSN)

Integrates different sensor networks
– Different abstractions, hard to share
– Isolated networks, hard to republish

GSN server:
– Goal: Publishing streams generated by sensor networks
– Storage, archive
– Access to sensor network hardware
– Easy setup, easy to change

GSN: Reference Implementation
– Integrity Service
– Access Control
– GSN/Web/Web-Services
– Notification Manager
– Query Processor
– Query Repository
– Storage Manager
– Virtual Sensor Manager
– Input Stream Manager
– Stream Quality Manager
– Life Cycle Manager
– Pool Of Sensing Devices

Virtual Sensor:
– Processing, filtering, aggregation
– Functional/non-functional properties
– Described in an XML file
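
The virtual sensor idea (processing, filtering and aggregation over an input stream) can be illustrated with a minimal Python sketch. This is not GSN code and does not use the GSN XML descriptor format; the function name `virtual_sensor`, the plausibility range and the window size are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean


def virtual_sensor(readings, window=3, lo=-40.0, hi=60.0):
    """Yield a sliding-window mean over readings that pass a range filter.

    Hypothetical stand-in for a virtual sensor: 'lo'/'hi' and 'window'
    are illustrative parameters, not GSN settings.
    """
    buf = deque(maxlen=window)
    for value in readings:
        if not (lo <= value <= hi):  # filtering: drop implausible samples
            continue
        buf.append(value)
        if len(buf) == window:       # aggregation: emit once the window is full
            yield mean(buf)


# A raw stream with one obviously faulty reading (999.0).
raw = [20.0, 21.0, 999.0, 22.0, 23.0]
smoothed = list(virtual_sensor(raw))
```

Because the sketch is a generator, one such stage could feed another, which is roughly how a derived stream would be republished from an existing one.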
Current GSN deployments
Swiss Experiment Infrastructure
GSN System Overview
Sensor Metadata Management

       Effective Metadata Management in Federated Sensor Networks
       Jeung H., Sarni S., Paparrizos I., Sathe S., Aberer K., Dawes N., Papaioannou T., Lehning M.,
       to appear in SUTC 2010.

   GSN: sensor data stream, SMW: metadata
   distributed join, automated metadata generation
   advanced metadata search
   snapshots of SMW for SwissEx
Time Series Compression and Retrieval

  A model M describes the dependency between two sets of variables X and Y
  Models may capture data correlations, derive unknown values, quantify and
    correct measurement errors
    –  They are particularly useful for data compression, data completion and data cleaning


  Our work is on
    –  Deriving lower bounds on the achievable compression ratio for a time series
    –  Defining a suitable model-based storage and indexing scheme for fast retrieval
    –  Defining innovative models for data cleaning and data quality estimation
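
As a rough illustration of model-based compression, the sketch below approximates a time series with a piecewise-constant model under a maximum error bound. It is a generic greedy technique, not necessarily the scheme used in the cited publications; the function names and the `eps` parameter are assumptions.

```python
def compress_piecewise_constant(values, eps):
    """Greedy segmentation into (start_index, length, level) triples.

    Every original value in a segment lies within eps of the stored level
    (the midrange of the segment). Illustrative only.
    """
    if not values:
        return []
    segments = []
    start = 0
    lo = hi = values[0]
    for i in range(1, len(values)):
        new_lo, new_hi = min(lo, values[i]), max(hi, values[i])
        if new_hi - new_lo > 2 * eps:
            # Adding values[i] would break the error bound: close the segment.
            segments.append((start, i - start, (lo + hi) / 2))
            start, lo, hi = i, values[i], values[i]
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(values) - start, (lo + hi) / 2))
    return segments


def decompress(segments):
    """Rebuild the approximated series from the stored segments."""
    out = []
    for _, length, level in segments:
        out.extend([level] * length)
    return out


series = [1.0, 1.1, 0.9, 5.0, 5.2, 5.1, 5.0]
segs = compress_piecewise_constant(series, eps=0.2)
approx = decompress(segs)
```

Here seven samples are stored as two segments, and the reconstruction error of each sample stays within `eps`. Richer models (linear, polynomial) trade more parameters per segment for longer segments.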


  Publications: ICDE’10, MDM’11, VLDB’11 (under preparation)
Parameter Compression
Data Compression
  Towards Multi-Model Approximation of Time-Series
              Thanasis Papaioannou, Mehdi Riahi, Karl Aberer [MDM 2011] (under review)
Probabilistic Data Generation
Sensor Context Extraction
  SeMiTri: A Framework for Semantic Annotation of Heterogeneous Trajectories
                         Z. Yan, D. Chakraborty, C. Parent, S. Spaccapietra, K. Aberer [EDBT 2011]

  Objective: A middleware for automatically annotating trajectories of different types
  of moving objects (cars, people)

  Semantic trajectory:    home   (bus)   office   (metro)   market   (walking)   home

  Semantic Annotation Middleware
    –  Spatial Join: region
    –  Map Matching: road network
    –  Hidden Markov Model (HMM): point of interest

  GPS episodes:    e1  e2  e3  e4  e5  e6  e7
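
To make the HMM step concrete, here is a minimal Viterbi decoder that assigns a point-of-interest label to each stop episode. It is a textbook sketch, not SeMiTri's implementation: the state names, the observation vocabulary (the kind of landmark nearest to each stop) and every probability below are invented for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            p, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (p * emit_p[s][o], arg)
        best.append(layer)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for layer in reversed(best[1:]):
        state = layer[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical model: hidden states are POI labels, observations are the
# kind of landmark nearest to each stop. All numbers are made up.
states = ["home", "office", "market"]
start_p = {"home": 0.6, "office": 0.2, "market": 0.2}
trans_p = {a: {b: (0.4 if a == b else 0.3) for b in states} for a in states}
emit_p = {
    "home":   {"residential": 0.8, "business": 0.1, "shop": 0.1},
    "office": {"residential": 0.1, "business": 0.8, "shop": 0.1},
    "market": {"residential": 0.1, "business": 0.2, "shop": 0.7},
}
stops = ["residential", "business", "shop", "residential"]
labels = viterbi(stops, states, start_p, trans_p, emit_p)
```

With these made-up parameters the decoded labels follow the home, office, market, home pattern of the slide's example trajectory.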
Trusted Privacy-preserving Sensing
Economic Cloud Resource Management

  Objective: high availability and low response-time in a cost-effective way in data clouds
    –  Hardware (correlated) failures, highly irregular query rates, NP multi-constrained
       global optimization problem!
  Solution: decentralized virtual economy (‘Skute’)
    –  Partition data using consistent hashing
    –  A virtual node is responsible for a key range
    –  Virtual ring organizes virtual nodes per availability level and per application
    –  Virtual nodes act as economic agents and independently migrate, replicate
       or delete themselves
    –  Skute offers differentiated availability guarantees, as well as automated and
       balanced cloud resources elasticity
  Publications: ACDC’09, ICDE’09, SoCC’10, Cloud’10, CCGrid’11
  Springer book on “Economic Cloud Resource Management”, under preparation
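
The partitioning step (consistent hashing, with each virtual node responsible for a key range) can be sketched as follows. This is a generic hash ring, not Skute's code; the class name `HashRing`, the node names and the choice of SHA-256 are assumptions, and the economic migration/replication logic is not modelled.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (SHA-256 is an arbitrary choice)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Each virtual node owns the key range ending at its ring position."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._points = [h for h, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First node clockwise from the key's position, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical virtual node names and keys.
ring_a = HashRing(["vn-0", "vn-1", "vn-2"])
ring_b = HashRing(["vn-0", "vn-1", "vn-2", "vn-3"])
keys = [f"sensor-{i}" for i in range(200)]
moved = [k for k in keys if ring_a.node_for(k) != ring_b.node_for(k)]
```

Adding `vn-3` only reassigns keys that now fall into its range; all other keys keep their owner, which is what makes independent migration or replication of a single virtual node cheap.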
TimeCloud

  A Cloud System for Massive Time Series Management
    –  Web-based time series management in the cloud
             •  Cloud storage, various time-series visualizations, group-based data sharing, …
             •  Potentially linked to third-party software, e.g. SensorMap, SwissEx Wiki
    –  Storage-and-computing platform for massive time series processing
             •  Built on Hadoop/HBase/GSN with the capability of handling data streams
             •  Very efficient model-based parallel time-series data processing

  third parties          data streams

    –  Time-series compression
    –  Efficient data processing based on model-based views
    –  Distributed time-series processing
Overview

  Sensor Data Management
  –    Global Sensor Networks
  –    Swiss Experiment
  –    Sensor Metadata Management
  –    Time Series compression and retrieval
  –    Sensor data analysis and quality
  –    Economics-based resource allocation in distributed clouds
  –    Cloud-based time series management system
  Web Data Management
  –  Large-scale Semantic Data Integration
  –  Web Stream Data Analysis (Twitter)
“The Wisdom of the Network”

Problem
• Schema heterogeneity is an inherent problem for enterprise cooperation networks
• Both manual and automated mapping are error-prone
• Interoperability challenges evolve constantly

Emergent semantics
• Establishing semantic interoperability as a self-organizing process within a
  community or social network
• Mappings are established in a localized, incremental manner

• Create mappings in a pay-as-you-go fashion
• Exploit the knowledge available in the network:
    • Available mappings in the network
    • Content features
    • Social structure of the network
    • User feedback
    • Economic incentives
• Apply probabilistic reasoning techniques to improve mapping quality
Web Data Stream Analysis

  Classifying Twitter messages
    We would like to classify tweets containing a given keyword (e.g. “apple”)
     according to whether or not they relate to a given company
    Won the WePS 2010 tweet classification task
  Thank you for your attention!

  For more information please visit

                      http://lsir.epfl.ch/

More Related Content

What's hot

Phase shift keying(PSK)
Phase shift keying(PSK)Phase shift keying(PSK)
Phase shift keying(PSK)MOHAN MOHAN
 
Wsn unit-1-ppt
Wsn unit-1-pptWsn unit-1-ppt
Wsn unit-1-pptSwathi Ch
 
IOT PROTOCOLS.pptx
IOT PROTOCOLS.pptxIOT PROTOCOLS.pptx
IOT PROTOCOLS.pptxDRREC
 
LED and LASER source in optical communication
LED and LASER source in optical communicationLED and LASER source in optical communication
LED and LASER source in optical communicationbhupender rawat
 
Mimo in Wireless Communication
Mimo in Wireless CommunicationMimo in Wireless Communication
Mimo in Wireless Communicationkailash karki
 
Artificial Intelligence in Computer Networks
Artificial Intelligence in Computer NetworksArtificial Intelligence in Computer Networks
Artificial Intelligence in Computer NetworksAbdullah Khosa
 
Adhoc wireless networks and its issues
Adhoc wireless networks and its issuesAdhoc wireless networks and its issues
Adhoc wireless networks and its issuesMenaga Selvaraj
 
Wireless Networks Introduction
Wireless Networks IntroductionWireless Networks Introduction
Wireless Networks Introductionramalakshmi54
 
Optical Burst Switching
Optical Burst SwitchingOptical Burst Switching
Optical Burst SwitchingJYoTHiSH o.s
 
Interference and system capacity
Interference and system capacityInterference and system capacity
Interference and system capacityAJAL A J
 
Chapter_1.pptx
Chapter_1.pptxChapter_1.pptx
Chapter_1.pptxAadiSoni3
 
Different Types of Backhaul
Different Types of BackhaulDifferent Types of Backhaul
Different Types of Backhaul3G4G
 
Wireless electricity
Wireless electricityWireless electricity
Wireless electricitygopal sai
 

What's hot (20)

Phase shift keying(PSK)
Phase shift keying(PSK)Phase shift keying(PSK)
Phase shift keying(PSK)
 
WSN IN IOT
WSN IN IOTWSN IN IOT
WSN IN IOT
 
Wsn unit-1-ppt
Wsn unit-1-pptWsn unit-1-ppt
Wsn unit-1-ppt
 
IOT PROTOCOLS.pptx
IOT PROTOCOLS.pptxIOT PROTOCOLS.pptx
IOT PROTOCOLS.pptx
 
LED and LASER source in optical communication
LED and LASER source in optical communicationLED and LASER source in optical communication
LED and LASER source in optical communication
 
Mimo in Wireless Communication
Mimo in Wireless CommunicationMimo in Wireless Communication
Mimo in Wireless Communication
 
Global state routing
Global state routingGlobal state routing
Global state routing
 
Artificial Intelligence in Computer Networks
Artificial Intelligence in Computer NetworksArtificial Intelligence in Computer Networks
Artificial Intelligence in Computer Networks
 
Adhoc wireless networks and its issues
Adhoc wireless networks and its issuesAdhoc wireless networks and its issues
Adhoc wireless networks and its issues
 
Wireless Networks Introduction
Wireless Networks IntroductionWireless Networks Introduction
Wireless Networks Introduction
 
Optical Burst Switching
Optical Burst SwitchingOptical Burst Switching
Optical Burst Switching
 
Cellular communication
Cellular communicationCellular communication
Cellular communication
 
Interference and system capacity
Interference and system capacityInterference and system capacity
Interference and system capacity
 
IoT Security
IoT SecurityIoT Security
IoT Security
 
Chapter_1.pptx
Chapter_1.pptxChapter_1.pptx
Chapter_1.pptx
 
Beamforming antennas (1)
Beamforming antennas (1)Beamforming antennas (1)
Beamforming antennas (1)
 
Different Types of Backhaul
Different Types of BackhaulDifferent Types of Backhaul
Different Types of Backhaul
 
TMS320C5x
TMS320C5xTMS320C5x
TMS320C5x
 
Cordless Technology
Cordless TechnologyCordless Technology
Cordless Technology
 
Wireless electricity
Wireless electricityWireless electricity
Wireless electricity
 

Similar to Sensor Data Management

Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computingJithin Parakka
 
KAIST 전산학과 iDBLab 소개 20130319-발표용
KAIST 전산학과 iDBLab 소개 20130319-발표용KAIST 전산학과 iDBLab 소개 20130319-발표용
KAIST 전산학과 iDBLab 소개 20130319-발표용Taehun Kim, Ph.D
 
Distributed Shared Memory on Ericsson Labs
Distributed Shared Memory on Ericsson LabsDistributed Shared Memory on Ericsson Labs
Distributed Shared Memory on Ericsson LabsEricsson Labs
 
Application architecture for cloud
Application architecture for cloudApplication architecture for cloud
Application architecture for cloudMarco Parenzan
 
oneM2M - Management, Abstraction and Semantics
oneM2M - Management, Abstraction and SemanticsoneM2M - Management, Abstraction and Semantics
oneM2M - Management, Abstraction and SemanticsoneM2M
 
Scalable Computing Labs (SCL).
Scalable Computing Labs (SCL).Scalable Computing Labs (SCL).
Scalable Computing Labs (SCL).Mindtree Ltd.
 
Relate: Architecture, Systems and Tools for Relative Positioning
Relate: Architecture, Systems and Tools for Relative PositioningRelate: Architecture, Systems and Tools for Relative Positioning
Relate: Architecture, Systems and Tools for Relative PositioningTill Riedel
 
10 - Architetture Software - More architectural styles
10 - Architetture Software - More architectural styles10 - Architetture Software - More architectural styles
10 - Architetture Software - More architectural stylesMajong DevJfu
 
Cloud Computing : Security and Forensics
Cloud Computing : Security and ForensicsCloud Computing : Security and Forensics
Cloud Computing : Security and ForensicsGovind Maheswaran
 
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...SL Corporation
 
Introduction to Gruter and Gruter's BigData Platform
Introduction to Gruter and Gruter's BigData PlatformIntroduction to Gruter and Gruter's BigData Platform
Introduction to Gruter and Gruter's BigData PlatformGruter
 
OSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal SternOSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal SternOpenStorageSummit
 
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...netvis
 
CouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big DataCouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big DataDebajani Mohanty
 
EvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics PlatformEvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics PlatformSergei Dolukhanov
 
EvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics PlatformEvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics PlatformSergei Dolukhanov
 
Cassandra framework a service oriented distributed multimedia
Cassandra framework  a service oriented distributed multimediaCassandra framework  a service oriented distributed multimedia
Cassandra framework a service oriented distributed multimediaJoão Gabriel Lima
 

Similar to Sensor Data Management (20)

Big data and cloud
Big data and cloudBig data and cloud
Big data and cloud
 
Networked 3-D Virtual Collaboration in Science and Education: Towards 'Web 3....
Networked 3-D Virtual Collaboration in Science and Education: Towards 'Web 3....Networked 3-D Virtual Collaboration in Science and Education: Towards 'Web 3....
Networked 3-D Virtual Collaboration in Science and Education: Towards 'Web 3....
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computing
 
KAIST 전산학과 iDBLab 소개 20130319-발표용
KAIST 전산학과 iDBLab 소개 20130319-발표용KAIST 전산학과 iDBLab 소개 20130319-발표용
KAIST 전산학과 iDBLab 소개 20130319-발표용
 
Distributed Shared Memory on Ericsson Labs
Distributed Shared Memory on Ericsson LabsDistributed Shared Memory on Ericsson Labs
Distributed Shared Memory on Ericsson Labs
 
Application architecture for cloud
Application architecture for cloudApplication architecture for cloud
Application architecture for cloud
 
oneM2M - Management, Abstraction and Semantics
oneM2M - Management, Abstraction and SemanticsoneM2M - Management, Abstraction and Semantics
oneM2M - Management, Abstraction and Semantics
 
Scalable Computing Labs (SCL).
Scalable Computing Labs (SCL).Scalable Computing Labs (SCL).
Scalable Computing Labs (SCL).
 
Azure and cloud design patterns
Azure and cloud design patternsAzure and cloud design patterns
Azure and cloud design patterns
 
Relate: Architecture, Systems and Tools for Relative Positioning
Relate: Architecture, Systems and Tools for Relative PositioningRelate: Architecture, Systems and Tools for Relative Positioning
Relate: Architecture, Systems and Tools for Relative Positioning
 
10 - Architetture Software - More architectural styles
10 - Architetture Software - More architectural styles10 - Architetture Software - More architectural styles
10 - Architetture Software - More architectural styles
 
Cloud Computing : Security and Forensics
Cloud Computing : Security and ForensicsCloud Computing : Security and Forensics
Cloud Computing : Security and Forensics
 
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
Overcoming the Top Four Challenges to Real-Time Performance in Large-Scale, D...
 
Introduction to Gruter and Gruter's BigData Platform
Introduction to Gruter and Gruter's BigData PlatformIntroduction to Gruter and Gruter's BigData Platform
Introduction to Gruter and Gruter's BigData Platform
 
OSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal SternOSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal Stern
 
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
The sFlow Standard: Scalable, Unified Monitoring of Networks, Systems and App...
 
CouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big DataCouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big Data
 
EvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics PlatformEvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics Platform
 
EvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics PlatformEvoApp - Bermuda Real-Time Analytics Platform
EvoApp - Bermuda Real-Time Analytics Platform
 
Cassandra framework a service oriented distributed multimedia
Cassandra framework  a service oriented distributed multimediaCassandra framework  a service oriented distributed multimedia
Cassandra framework a service oriented distributed multimedia
 

More from PlanetData Network of Excellence

A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoPlanetData Network of Excellence
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksPlanetData Network of Excellence
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingPlanetData Network of Excellence
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamPlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...PlanetData Network of Excellence
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchPlanetData Network of Excellence
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSPlanetData Network of Excellence
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReducePlanetData Network of Excellence
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...PlanetData Network of Excellence
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsPlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...PlanetData Network of Excellence
 

More from PlanetData Network of Excellence (20)

Dl2014 slides
Dl2014 slidesDl2014 slides
Dl2014 slides
 
A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory Sensing
 
Privacy-Preserving Schema Reuse
Privacy-Preserving Schema ReusePrivacy-Preserving Schema Reuse
Privacy-Preserving Schema Reuse
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
 
CLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data ArchitectureCLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data Architecture
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
 

Recently uploaded

Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Boost Fertility New Invention Ups Success Rates.pdf


Sensor Data Management

  • 1. Sensor Data Management @ EPFL Karl Aberer
  • 2. Overview
       Sensor Data Management
       –  Global Sensor Networks
       –  Swiss Experiment
       –  Sensor Metadata Management
       –  Time Series compression and retrieval
       –  Sensor data analysis and quality
       –  Economics-based resource allocation in distributed clouds
       –  Cloud-based time series management system
       Web Data Management
       –  Large-scale Semantic Data Integration
       –  Web Stream Data Analysis (Twitter)
  • 3. Global Sensor Networks (GSN)
       Integrates different sensor networks
       –  Different abstractions, hard to share
       –  Isolated networks, hard to republish
       GSN server:
       –  Goal: publishing streams generated by sensor networks
       –  Storage, archive
       –  Access to sensor network hardware
       –  Easy setup, easy to change
       Virtual Sensor:
       –  Processing, filtering, aggregation
       –  Functional/non-functional properties
       –  Described in an XML file
       [Figure: GSN reference implementation. Components: Integrity Service, Access Control, GSN/Web/Web-Services, Notification Manager, Query Processor, Query Repository, Storage Manager, Virtual Sensor Manager, Input Stream Manager, Stream Quality Manager, Life Cycle Manager, Pool of Sensing Devices]
  • 4. Current GSN deployments
  • 5. Swiss Experiment Infrastructure
  • 6. Sensor Metadata Management
       Effective Metadata Management in Federated Sensor Networks
  • 7. Time Series Compression and Retrieval
       A model M describes the dependency between two sets of variables X and Y
       Models may capture data correlations, derive unknown values, and quantify and correct measurement errors
       –  They are particularly useful for data compression, data completion and data cleaning
       Our work is on
       –  Deriving lower bounds on the achievable compression ratio for a time series
       –  Defining a suitable model-based storage and indexing scheme for fast retrieval
       –  Defining innovative models for data cleaning and data quality estimation
       Publications: ICDE’10, MDM’11, VLDB’11 (under preparation)
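To make the idea of model-based compression on this slide concrete, here is a minimal sketch in Python. It is not the method from the cited publications: it assumes the simplest possible model, a constant value per segment, and a user-chosen error bound `eps`. The function names `compress` and `decompress` are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple

# One segment = (number of samples it covers, constant value that stands in for them).
Segment = Tuple[int, float]


def compress(values: List[float], eps: float) -> List[Segment]:
    """Greedy piecewise-constant approximation (illustrative assumption).

    A segment keeps growing while all of its samples fit inside a band of
    width 2 * eps; the stored value is the midpoint of that band, so every
    sample is reproduced with an absolute error of at most eps.
    """
    segments: List[Segment] = []
    if not values:
        return segments
    lo = hi = values[0]
    count = 1
    for v in values[1:]:
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo <= 2 * eps:
            # The sample still fits the current constant model.
            lo, hi, count = new_lo, new_hi, count + 1
        else:
            # Close the current segment and start a new one at this sample.
            segments.append((count, (lo + hi) / 2))
            lo = hi = v
            count = 1
    segments.append((count, (lo + hi) / 2))
    return segments


def decompress(segments: List[Segment]) -> List[float]:
    """Rebuild an approximate series from the stored segments."""
    out: List[float] = []
    for count, value in segments:
        out.extend([value] * count)
    return out


if __name__ == "__main__":
    readings = [20.0, 20.1, 20.2, 20.1, 25.0, 25.1, 24.9]
    segs = compress(readings, eps=0.2)
    print(len(readings), "samples stored as", len(segs), "segments")
```

Richer models (linear, polynomial, or several model families competing per segment) follow the same pattern: store the model parameters instead of the raw samples, and bound the reconstruction error.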
  • 9. Data Compression
       Towards Multi-Model Approximation of Time-Series
       Thanasis Papaioannou, Mehdi Riahi, Karl Aberer [MDM 2011] (under review)
  • 11. Sensor Context Extraction
       SeMiTri: A Framework for Semantic Annotation of Heterogeneous Trajectories
       Z. Yan, D. Chakraborty, C. Parent, S. Spaccapietra, K. Aberer [EDBT 2011]
       Objective: A middleware for automatically annotating trajectories of different types of moving objects (cars, people)
       [Figure: GPS episodes (e1 to e7) pass through the Semantic Annotation Middleware, which applies a spatial join (region), map matching (road network) and a Hidden Markov Model (point of interest), and yields a semantic trajectory (home, office, market, home; bus, metro, walking)]
  • 13. Economic Cloud Resource Management
       Objective: high availability and low response-time in a cost-effective way in data clouds
       –  Hardware (correlated) failures, highly irregular query rates, NP multi-constrained global optimization problem!
       Solution: decentralized virtual economy (‘Skute’)
       –  Partition data using consistent hashing
       –  A virtual node is responsible for a key range
       –  Virtual ring organizes virtual nodes per availability level and per application
       –  Virtual nodes act as economic agents and independently migrate, replicate or delete themselves
       –  Skute offers differentiated availability guarantees, as well as automated and balanced cloud resources elasticity
       Publications: ACDC’09, ICDE’09, SoCC’10, Cloud’10, CCGrid’11
       Springer book on “Economic Cloud Resource Management”, under preparation
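The partitioning step named on this slide, consistent hashing with virtual nodes that each own a key range, can be sketched as follows. This is a generic illustration in Python and not Skute's implementation: the class `HashRing`, the function `ring_position`, the server names and the choice of SHA-256 are all assumptions made for the example, and the economic behaviour of the virtual nodes (migration, replication, deletion, availability levels) is not modelled.

```python
import bisect
import hashlib
from typing import Dict, List


def ring_position(label: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring; each virtual node owns the key range that ends at its position."""

    def __init__(self, vnodes_per_server: int = 8) -> None:
        self.vnodes_per_server = vnodes_per_server
        self._positions: List[int] = []   # sorted virtual-node positions
        self._owner: Dict[int, str] = {}  # position -> physical server

    def add_server(self, server: str) -> None:
        # Position collisions are ignored here; with 64-bit positions they are very unlikely.
        for i in range(self.vnodes_per_server):
            pos = ring_position(f"{server}#vnode{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = server

    def remove_server(self, server: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key: str) -> str:
        """Return the server whose virtual node is the first at or after the key's position."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first virtual node when the key hashes past the last one.
        return self._owner[self._positions[idx % len(self._positions)]]


if __name__ == "__main__":
    ring = HashRing()
    for name in ("server-a", "server-b", "server-c"):
        ring.add_server(name)
    print(ring.lookup("sensor-42"))
```

The property that makes this scheme attractive for elastic clouds is that removing or adding a server only remaps the keys in the ranges of that server's virtual nodes; all other keys keep their owner.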
  • 14. TimeCloud
       A Cloud System for Massive Time Series Management
       –  Web-based time series management in the cloud
          •  Storage cloud, various time-series visualizations, group-based data sharing, …
          •  Potentially linked to third-party software, e.g. SensorMap, SwissEx Wiki
       –  Storage-and-computing platform for massive time series processing
          •  Built on Hadoop/HBase/GSN with the capability of handling data streams
          •  Very efficient model-based parallel time-series data processing
       [Figure labels: third-party data streams; time-series compression; efficient data processing based on model-based views; distributed time-series processing]
  • 15. Overview
       Sensor Data Management
       –  Global Sensor Networks
       –  Swiss Experiment
       –  Sensor Metadata Management
       –  Time Series compression and retrieval
       –  Sensor data analysis and quality
       –  Economics-based resource allocation in distributed clouds
       –  Cloud-based time series management system
       Web Data Management
       –  Large-scale Semantic Data Integration
       –  Web Stream Data Analysis (Twitter)
  • 16. “The Wisdom of the Network”
       Problem
       •  Schema heterogeneity is an inherent problem for enterprise cooperation networks
       •  Both manual and automated mapping are error-prone
       •  Interoperability challenges evolve constantly
       Emergent semantics
       •  Establishing semantic interoperability as a self-organizing process within a community or social network
       •  Mappings are established in a localized, incremental manner
       Approach
       •  Create mappings in a pay-as-you-go fashion
       •  Exploit the knowledge available in the network:
          •  Available mappings in the network
          •  Content features
          •  Social structure of the network
          •  User feedback
          •  Economic incentives
       •  Apply probabilistic reasoning techniques to improve mapping quality
  • 17. Web Data Stream Analysis
       Classifying Twitter messages
       We would like to classify tweets containing a given keyword (e.g. “apple”) according to whether or not they are related to a given company
       Won the WePS 2010 tweet classification task
  • 18.   Thank you for your attention!   For more information please visit http://lsir.epfl.ch/