Model-Based Similarity
Measure in TimeCloud

   Thanh-Nguyen Ngo
    Hoyoung Jeung
      Karl Aberer

     LSIR – IC – EPFL


     February 2012
Outline




         Motivation
         Model-Based Time-Series
         Model-Based Similarity Measure
         kNN Processing
         Experiments
         Conclusion
Motivation




      The demand for storing and processing massive time-series in
      the cloud is growing rapidly
      Measuring similarity is a fundamental operation in a wide
      range of applications that process temporally ordered data
      Finding similar time-series in a large volume of data
      remains a difficult problem
Model-Based Time-Series




   Definition (Time-Series)
   A time-series t of length n is a temporally ordered sequence
   t = [t_1, ..., t_n], where each point in time i is mapped to a
   d-dimensional attribute vector t_i = (t_{i1}, ..., t_{id}) of values
   t_{ij} with j ∈ {1, ..., d}. A time-series is called univariate for
   d = 1 and multivariate for d > 1.
Model-Based Time-Series



   Definition (Common Points)
   Two points of two time-series are called common if they occur at
   the same time.

   Definition (Common Interval)
   The common interval of two segments or two time-series is the
   greatest interval [a, b] such that times a and b belong to both
   segments or time-series. The two segments limited by the common
   interval are called common segments.
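The two definitions above can be sketched in a few lines. This is a minimal illustration, not the TimeCloud implementation: it assumes each series is a list of (timestamp, value) pairs sorted by timestamp, and it reads "belong to both" as "lie within the time span of both". The names `common_interval` and `common_segment` are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (timestamp, value); an assumed layout


def common_interval(s: List[Point], t: List[Point]) -> Optional[Tuple[float, float]]:
    """Return the greatest [a, b] covered by both series, or None if they do not overlap."""
    if not s or not t:
        return None
    a = max(s[0][0], t[0][0])    # later of the two start times
    b = min(s[-1][0], t[-1][0])  # earlier of the two end times
    return (a, b) if a <= b else None


def common_segment(series: List[Point], interval: Tuple[float, float]) -> List[Point]:
    """Keep only the points whose timestamp falls inside the interval."""
    a, b = interval
    return [(ts, v) for ts, v in series if a <= ts <= b]


s = [(0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0)]
t = [(2, 5.0), (3, 6.0), (4, 7.0)]
interval = common_interval(s, t)
print(interval)                     # (2, 3)
print(common_segment(s, interval))  # [(2, 3.0), (3, 4.0)]
print(common_segment(t, interval))  # [(2, 5.0), (3, 6.0)]
```

In this example the two common segments also consist of common points only, since both series are sampled at times 2 and 3.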
Model-Based Similarity Measure




   Definition (Euclidean Distance)
   The Euclidean distance between two time-series is the
   Euclidean distance of their common segments s = [s_1, ..., s_n] and
   t = [t_1, ..., t_n] of length n, and it is defined as:

      Eucl(s, t) = \sqrt{ \sum_{i=1}^{n} (s_i - t_i)^2 }
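As a small sketch of this definition, the function below computes the distance for two univariate common segments of equal length. The name `eucl` mirrors the slide's notation; restricting the input to plain lists of floats (the case d = 1) is an assumption made to keep the example short.

```python
import math
from typing import Sequence


def eucl(s: Sequence[float], t: Sequence[float]) -> float:
    """Euclidean distance between two equally long univariate segments."""
    if len(s) != len(t):
        raise ValueError("common segments must have the same length")
    return math.sqrt(sum((si - ti) ** 2 for si, ti in zip(s, t)))


print(eucl([0.0, 0.0], [3.0, 4.0]))  # 5.0
```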
Model-Based Similarity Measure




   Definition (Maximum Error Bound of Time-Series)
   Given a time-series t = [t_1, ..., t_n] and its representation
   t' = [t'_1, ..., t'_n] in its model, the maximum error bound of t over
   its model is a value meb(t) such that:

      |t_i - t'_i| \le meb(t),   \forall i = 1..n
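To make the definition concrete, the sketch below computes the smallest value that satisfies the inequality and checks whether a given value is a valid bound. Any larger value is also a valid meb(t) under the definition. The function names are hypothetical, and the series are assumed to be univariate lists of floats.

```python
from typing import Sequence


def tightest_meb(t: Sequence[float], t_model: Sequence[float]) -> float:
    """Smallest bound satisfying |t_i - t'_i| <= bound for every i."""
    if len(t) != len(t_model):
        raise ValueError("series and representation must have the same length")
    return max(abs(ti - mi) for ti, mi in zip(t, t_model))


def satisfies_meb(t: Sequence[float], t_model: Sequence[float], bound: float) -> bool:
    """Check whether `bound` is a valid maximum error bound for this pair."""
    return all(abs(ti - mi) <= bound for ti, mi in zip(t, t_model))


t = [1.0, 2.0, 3.0]
t_model = [1.5, 2.0, 2.75]  # made-up model output, for illustration only
print(tightest_meb(t, t_model))        # 0.5
print(satisfies_meb(t, t_model, 0.4))  # False
```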
Model-Based Similarity Measure




   Theorem
   Given two time-series s, t and their representations s', t' in their
   models, assume the common segments of s and t have n time-series
   points. Then,

      |Eucl(s, t) - Eucl(s', t')| \le \sqrt{n} (meb(s) + meb(t))
kNN Processing - The Filter Stage




   Theorem
   Let t'_i and q' be representations of t_i and q in their models
   respectively. Let d_i be the distance between t'_i and q' with the
   maximum error e_i. Let a_i = d_i - e_i and b_i = d_i + e_i. Without
   loss of generality, assume b_1 ≤ ... ≤ b_n. The candidate set
   S = {t_i | a_i ≤ b_k} contains the k nearest time-series of q and is
   minimal.
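The filter stage can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the model-based distances d_i and their maximum errors e_i have already been computed (for example, with e_i taken from the error-bound theorem above) and returns the indices of the series that remain candidates. The function name `filter_candidates` is hypothetical.

```python
from typing import List, Sequence


def filter_candidates(d: Sequence[float], e: Sequence[float], k: int) -> List[int]:
    """Return indices i with a_i <= b_k, where b_k is the k-th smallest upper bound."""
    if len(d) != len(e):
        raise ValueError("d and e must have the same length")
    if not 1 <= k <= len(d):
        raise ValueError("k must be between 1 and the number of series")
    lower = [di - ei for di, ei in zip(d, e)]  # a_i = d_i - e_i
    upper = [di + ei for di, ei in zip(d, e)]  # b_i = d_i + e_i
    b_k = sorted(upper)[k - 1]                 # k-th smallest upper bound
    return [i for i, a_i in enumerate(lower) if a_i <= b_k]


# Four series, k = 2: the fourth is provably farther than two others.
print(filter_candidates([1.0, 2.0, 3.0, 10.0], [0.5, 0.5, 0.5, 0.5], k=2))  # [0, 1, 2]
```

Here b_k = 2.5, so the third series (a_3 = 2.5) cannot be ruled out yet, while the fourth (a_4 = 9.5) is pruned without touching its raw data.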
kNN Processing - The Refinement Stage




  Theorem
  Let t'_i and q' be representations of t_i and q in their models
  respectively. Let d_i be the distance between t'_i and q' with the
  maximum error e_i. Let a_i = d_i - e_i and b_i = d_i + e_i. Without
  loss of generality, assume a_1 ≤ ... ≤ a_m. The set
  R = {t_i | b_i ≤ a_{m-k+1}} is a subset of the result set.
Experiments




      2.4GHz Intel Core2 Quad CPU
      Java implementation, Ubuntu 10.10
      Default parameters
          length of time series: 512
          number of nearest neighbors: 10
          error ratio: 3%
          number of time series: 1,000
Model-Based View Construction
Effect of Maximum Error Ratios
Effect of Number of Nearest Neighbors
Effect of Number of Time Series
Conclusion



      Process kNN queries based on model-based similarity
      measures
      Establish a set of theoretical foundations for approximated
      time-series data processing
      Build query processing mechanisms on the filter-and-refine
      approach
      Run more than three times faster than straightforward
      processing
      Facilitate scalability of the computation using the TimeCloud
      system
Questions?

DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

Model based similarity measure in time cloud

  • 1. Model-Based Similarity Measure in TimeCloud. Thanh-Nguyen Ngo, Hoyoung Jeung, Karl Aberer. LSIR – IC – EPFL, February 2012.
  • 2. Outline: Motivation; Model-Based Time-Series; Model-Based Similarity Measure; kNN Processing; Experiments; Conclusion.
  • 3. Motivation: The demand for storing and processing massive time-series in the cloud is growing rapidly. Measuring similarity is a fundamental operation in a wide range of applications that process temporally ordered data. Finding similar time-series over a large volume of data remains a difficult problem.
  • 4. Model-Based Time-Series. Definition (Time-Series): A time-series $t$ of length $n$ is a temporally ordered sequence $t = [t_1, \dots, t_n]$ where point in time $i$ is mapped to a $d$-dimensional attribute vector $t_i = (t_{i1}, \dots, t_{id})$ of values $t_{ij}$ with $j \in \{1, \dots, d\}$. A time-series is called univariate for $d = 1$ and multivariate for $d > 1$.
  • 5. Model-Based Time-Series. Definition (Common Points): Two points of two time-series are called common if they occur at the same time. Definition (Common Interval): The common interval of two segments or two time-series is the greatest interval $[a, b]$ such that times $a$ and $b$ belong to both segments or time-series. The two segments limited by the common interval are called common segments.
  • 6. Model-Based Similarity Measure. Definition (Euclidean Distance): The Euclidean distance between two time-series is the Euclidean distance of their common segments $s = [s_1, \dots, s_n]$ and $t = [t_1, \dots, t_n]$ of length $n$, defined as $\mathrm{Eucl}(s, t) = \sqrt{\sum_{i=1}^{n} (s_i - t_i)^2}$.
  • 7. Model-Based Similarity Measure. Definition (Maximum Error Bound of Time-Series): Given a time-series $t = [t_1, \dots, t_n]$ and its representation $t' = [t'_1, \dots, t'_n]$ in its model, the maximum error bound of $t$ over its model is a value $\mathrm{meb}(t)$ such that $|t_i - t'_i| \le \mathrm{meb}(t)$ for all $i = 1, \dots, n$.
  • 8. Model-Based Similarity Measure. Theorem: Given two time-series $s$, $t$ and their representations $s'$, $t'$ in their models, assume the common segments of $s$ and $t$ have $n$ time-series points. Then $|\mathrm{Eucl}(s, t) - \mathrm{Eucl}(s', t')| \le \sqrt{n}\,(\mathrm{meb}(s) + \mathrm{meb}(t))$.
  • 9. kNN Processing - The Filter Stage. Theorem: Let $t'_i$ and $q'$ be the representations of $t_i$ and $q$ in their models, respectively. Let $d_i$ be the distance between $t'_i$ and $q'$ with maximum error $e_i$. Let $a_i = d_i - e_i$ and $b_i = d_i + e_i$. Without loss of generality, assume $b_1 \le \dots \le b_n$. The candidate set $S = \{t_i \mid a_i \le b_k\}$ contains the $k$ nearest time-series of $q$ and is minimal.
  • 10. kNN Processing - The Refinement Stage. Theorem: Let $t'_i$ and $q'$ be the representations of $t_i$ and $q$ in their models, respectively. Let $d_i$ be the distance between $t'_i$ and $q'$ with maximum error $e_i$. Let $a_i = d_i - e_i$ and $b_i = d_i + e_i$. Without loss of generality, assume $a_1 \le \dots \le a_m$. The set $R = \{t_i \mid b_i \le a_{m-k+1}\}$ is a subset of the result set.
  • 11. Experiments. Setup: 2.4 GHz Intel Core2 Quad CPU; Java implementation; Ubuntu 10.10. Default parameters: length of time series: 512; number of nearest neighbors: 10; error ratio: 3%; number of time series: 1,000.
  • 13. Effect of Maximum Error Ratios
  • 14. Effect of Number of Nearest Neighbors
  • 15. Effect of Number of Time Series
  • 16. Conclusion: Process kNN queries based on model-based similarity measures. Establish a set of theoretical foundations for approximated time-series data processing. Build query processing mechanisms on the filter-and-refine approach. Run more than three times faster than straightforward processing. Facilitate scalability of the computation using the TimeCloud system.
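
The definitions and theorems on slides 6 to 10 can be tried out on toy data with the following minimal Python sketch. Several things here are assumptions rather than the authors' implementation (which the slides describe only as being in Java): the function names, the made-up series, the "model" being simple rounding to one decimal, and taking the maximum error as e_i = sqrt(n) * (meb(t_i) + meb(q)) from the slide 8 bound. The last function does not reproduce the index used on slide 10 literally; it applies a conservative counting test (a candidate is certain when fewer than k other candidates have a lower bound strictly below its upper bound), which is this sketch's own formulation.

```python
import math


def eucl(s, t):
    """Euclidean distance between two equal-length common segments."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def meb(series, approx):
    """Tightest maximum error bound: largest pointwise deviation from the model."""
    return max(abs(a - b) for a, b in zip(series, approx))


def distance_interval(q_approx, t_approx, meb_q, meb_t):
    """Return (a_i, b_i) = (d_i - e_i, d_i + e_i).

    Assumption: e_i = sqrt(n) * (meb_q + meb_t), following the slide 8 bound.
    """
    d = eucl(q_approx, t_approx)
    e = math.sqrt(len(q_approx)) * (meb_q + meb_t)
    return d - e, d + e


def filter_stage(intervals, k):
    """Indices i with a_i <= b_k, where b_k is the k-th smallest upper bound."""
    if not 1 <= k <= len(intervals):
        raise ValueError("k must be between 1 and the number of series")
    b_k = sorted(b for _, b in intervals)[k - 1]
    return [i for i, (a, _) in enumerate(intervals) if a <= b_k]


def certain_results(intervals, candidates, k):
    """Candidates guaranteed to be among the k nearest (counting test, see lead-in)."""
    certain = []
    for i in candidates:
        b_i = intervals[i][1]
        rivals = sum(1 for j in candidates if j != i and intervals[j][0] < b_i)
        if rivals <= k - 1:
            certain.append(i)
    return certain


def approximate(xs):
    """Hypothetical stand-in for a model: round every value to one decimal."""
    return [round(x, 1) for x in xs]


# Toy data, made up for illustration only.
query = [0.11, 0.52, 0.93, 0.48]
series = [
    [0.12, 0.49, 0.91, 0.51],  # close to the query
    [0.90, 0.10, 0.20, 0.94],  # far from the query
    [0.30, 0.60, 0.80, 0.40],  # moderately close
]

q_approx = approximate(query)
meb_q = meb(query, q_approx)
intervals = []
for t in series:
    t_approx = approximate(t)
    intervals.append(distance_interval(q_approx, t_approx, meb_q, meb(t, t_approx)))

k = 2
candidates = filter_stage(intervals, k)           # need exact refinement at most
certain = certain_results(intervals, candidates, k)  # need no exact distance at all
print("candidates:", candidates, "certain:", certain)
```

Only candidates that are not already certain would need their exact distance to the query computed, which is where a filter-and-refine scheme saves work over computing every exact distance.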