SlideShare ist ein Scribd-Unternehmen logo
1 von 10
Yves Caseau - Machine Learning for Self Tracking – February 2019 1/10
Machine Learning Heuristics for Short TimeMachine Learning Heuristics for Short Time
Series Forecasting with Quantified Self DataSeries Forecasting with Quantified Self Data
Yves Caseau
National Academy of Technologies
Yves Caseau - Machine Learning for Self Tracking – February 2019 2/10
Self-Tracking and Knomee Mobile AppSelf-Tracking and Knomee Mobile App
 Knomee is a self-tracking mobile app for iOS (one of many
thousands)
 Knomee motto: « self-tracking with sense »
 Data science applied to self tracking
 Self-tracking apps generate time series
 One or many (up to 4) data points collected over a period of
time
 Data is either self-declared (the user picks a value in a preset
range) or automatically imported from a a connected device
(iPhone’s sensors, Apple watch or any HealthKit compatible
device like a a Withings scale)
 Data files are accessible on:
https://github.com/ycaseau/KnomeeQuest/tree/master/data
 20 samples
 Ranging from 40 to 220 measures (x 4)
Yves Caseau - Machine Learning for Self Tracking – February 2019 3/10
Quests : Causal Diagrams are proposed by the userQuests : Causal Diagrams are proposed by the user
 Self-tracking is organized around causal diagrams
 A quest is made of a target tracker and up to three
factor trackers
 The user makes the hypothesis that the factors may
contribute to the target
 Using Judea Peal’s notation we look for: usal
 P(X | do(Y)) : impact of doing Y on X
 Detect causality through active experiments
 Correlation is not enough
 A quest is an hypothesis, not all quests are meaningful
 Factor causality is tricky (e.g. coffee as a symptom)
 How to tell if the effort on factors is « worth it » ?
Impact on the target
 Key property of self-tracking data:
some input is purely random
{quest:ENERGY, icloud:true,
energy:{
type:2, more:true,
min:1, max:6, target:4,
labels:[crisis, sleepy, lapses,
normal, energetic, hyper],},
sleep:{
type:7, more:true,
min:4, max:9, target:7,},
steps:{
type:4, more:true,
min:0, max:19000,
target:7000,},
weight:{
type:5, more:false,
min:75, max:82, target:78,},
}
Yves Caseau - Machine Learning for Self Tracking – February 2019 4/10
Short Time-Series ForecastingShort Time-Series Forecasting
 Our goal in this talk : how to forecast values from self-tracking data ?
 Forecasting gives a possible clue about the value of the causal hypothesis
(Granger causality)
 We search for a robust method that does not break with random noise
 Measuring success: iterative training protocol
 For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i - 1]
– Apply forecast to time[i]
– Measure average distance to real value TS[i]
– Compare to « average » performance
 Realistic simulation of what happens in the app
 Why it is hard:
 short samples (small data)
 mixed random inputs
Yves Caseau - Machine Learning for Self Tracking – February 2019 5/10
Classical Methods yield poor resultsClassical Methods yield poor results
 Three classical ML algorithms, trained to
minimize distance, using implicit time
features and factors
 Linear Regression
 K-means Clustering (10 – 15 groups)
 ARMA (AutoRegressive Moving Average)
 Forecasting results are dispapointing
 The difficulty is not a surprise, we are
looking to extract a small amount of
information, only when present
 Improving a few % over average is the best
we can expect
 Overfitting very easily offsets the forecasting
gain
Linar
Regression
K-means ARMA
forecasting 18.34% 19.5% 18.9%
average 17.5% 17.5% 17.5%
Distance
(squares)
0.655 0.81 0.525
Random noise
Linked to factors
Linked to non-
collected factors Random noise
“good quest” “poor quest”
variation
Yves Caseau - Machine Learning for Self Tracking – February 2019 6/10
A Term-Algebra of Heuristics CombinationsA Term-Algebra of Heuristics Combinations
 Heuristic toolbox
 MovingAverage – MA(k,discount)
 Trend (time linear regression)
 Weekly and Hourly patterns
 Factor regression with explicit delay
 CumSum (cumulative sum of differences to average)
 Threshold regression with delay
 Combined through a linear algebra
 Each term is a weighted combination of a few heuristics
 Some other heuristics provide improvement with some quests but are left aside for lack
of robustness
 Cycle analysis (detecting “biorhythms”)
 Split (constant until date X, then T)
useful when something changed.
 And(t1,t2) : Boolean conjunction of two factors
Mi x[ 0. 97] (
T[ 2. 25- 2. 02/ - 1. 00] ,
wAvg[ " t ar get " ] ( 10, 1. 00) )
+ Cor [ 0. 04] ( " t r ack2" +16)  
Yves Caseau - Machine Learning for Self Tracking – February 2019 7/10
Distances and RegularizationDistances and Regularization
Time-series operations are weighted
 The weight of each measure is proportional to the
distance to its next neighbor
 Spaced measures are more important than repeated
ones
« Triangular distance »
 The distance between two time series is the area
between the two curves
Regularization to avoid overfitting
 Principle: add a penalty to the distance that reduces
the overall standard deviation
 best formula for this data set
wDist(a,t) + max(0.0, stdev(a) – 0.02)
Yves Caseau - Machine Learning for Self Tracking – February 2019 8/10
Randomized Incremental AlgorithmsRandomized Incremental Algorithms
 Main algorithm is “Randomized Optimization” (RandOpt)
 Create n random algebra terms
 Combination of glutton heuristics (create the best possible term)
 And randomization (coefficients / which sub-term to pick)
 Depth is controlled with a global parameter
 Optimized though local optimization
 Each parameter of the algebra sub-terms (i.e, coefficient, delays, etc.) are optimized one by one
 Hill-climbing local meta heuristics
 Three successive rounds
 This is used in an “incremental mode:
 For each new measure
 Reuse previous best term, and improve through local optimization
 Run ”RandOpt” (100 iterations)
 Keep best term
 What has not worked out so far
 Evolutionary (genetic algorithm with cross-over)
 Mutation (large neighborgood local optimization)
Yves Caseau - Machine Learning for Self Tracking – February 2019 9/10
Computational resultsComputational results
 Average forecast is 16.88% (control = average is 17.5%)
 Average square distance is 1.03 (worse than LR,ARMA or k-means) because of regularization
 Strong measures against overfitting (regularization, depth, # local opt loops + techniques)
Yves Caseau - Machine Learning for Self Tracking – February 2019 10/10
ConclusionConclusion
Forecasting for self-tracking data is hard
We presented a reinforcement generative
machine learning that performs better than
most classical techniques
This is due to the complex nature of the data
 On (classical) sales time series, ARMA does better than the proposed approach
(close to LR)
 Open question : how to detect the “intrinsic quality” of the quest and change the
forecasting method / regularization parameters accordingly ?
 You can download the data and try your own approaches 
Forecasting is used to two purposes in our mobile app:
 User experience : forecasting makes data entry faster + gives a sense of playfulness
 Granger Causality : when the forecasting score is ”good”, this gives a sense of
plausibility to the causal diagram hypothesis (represented by the “quest”)

Weitere ähnliche Inhalte

Ähnlich wie ML TITLE

Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...ijistjournal
 
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...ijistjournal
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsGIScRG
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsAM Publications
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMSentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMIRJET Journal
 
Search Engines
Search EnginesSearch Engines
Search Enginesbutest
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET Journal
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819HODCSE21
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learningbutest
 
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET Journal
 
IRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET Journal
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningcsandit
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...cscpconf
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGcsandit
 
The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...IJERA Editor
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationIJECEIAES
 

Ähnlich wie ML TITLE (20)

Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
 
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational Experiments
 
presentationIDC - 14MAY2015
presentationIDC - 14MAY2015presentationIDC - 14MAY2015
presentationIDC - 14MAY2015
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning Algorithms
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMSentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVM
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learning
 
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
 
IRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine Learning
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion mining
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
 
The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classication
 

Mehr von Yves Caseau

DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022Yves Caseau
 
Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Yves Caseau
 
Information Systems for Digital Transformation
Information Systems for Digital TransformationInformation Systems for Digital Transformation
Information Systems for Digital TransformationYves Caseau
 
Lean from the guts
Lean from the gutsLean from the guts
Lean from the gutsYves Caseau
 
Taking advantageofai july2018
Taking advantageofai july2018Taking advantageofai july2018
Taking advantageofai july2018Yves Caseau
 
Software Pitch 2018
Software Pitch 2018Software Pitch 2018
Software Pitch 2018Yves Caseau
 
Intelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAIntelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAYves Caseau
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureYves Caseau
 
XEBICON Public November 2015
XEBICON Public November 2015XEBICON Public November 2015
XEBICON Public November 2015Yves Caseau
 
Smart selfnovember2013
Smart selfnovember2013Smart selfnovember2013
Smart selfnovember2013Yves Caseau
 
Management socialnetworksfeb2012
Management socialnetworksfeb2012Management socialnetworksfeb2012
Management socialnetworksfeb2012Yves Caseau
 
Google socialnetworksmarch08
Google socialnetworksmarch08Google socialnetworksmarch08
Google socialnetworksmarch08Yves Caseau
 
Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Yves Caseau
 
Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Yves Caseau
 
Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Yves Caseau
 
The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014Yves Caseau
 
Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Yves Caseau
 

Mehr von Yves Caseau (20)

CCEM2023.pptx
CCEM2023.pptxCCEM2023.pptx
CCEM2023.pptx
 
DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022
 
Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3
 
Information Systems for Digital Transformation
Information Systems for Digital TransformationInformation Systems for Digital Transformation
Information Systems for Digital Transformation
 
Lean from the guts
Lean from the gutsLean from the guts
Lean from the guts
 
Taking advantageofai july2018
Taking advantageofai july2018Taking advantageofai july2018
Taking advantageofai july2018
 
Software Pitch 2018
Software Pitch 2018Software Pitch 2018
Software Pitch 2018
 
Intelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAIntelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIA
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT Architecture
 
XEBICON Public November 2015
XEBICON Public November 2015XEBICON Public November 2015
XEBICON Public November 2015
 
Smart selfnovember2013
Smart selfnovember2013Smart selfnovember2013
Smart selfnovember2013
 
Management socialnetworksfeb2012
Management socialnetworksfeb2012Management socialnetworksfeb2012
Management socialnetworksfeb2012
 
Google socialnetworksmarch08
Google socialnetworksmarch08Google socialnetworksmarch08
Google socialnetworksmarch08
 
Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance
 
Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013
 
Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015
 
GTES UTC 2014
GTES  UTC 2014GTES  UTC 2014
GTES UTC 2014
 
The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014
 
Disic mars2014
Disic mars2014Disic mars2014
Disic mars2014
 
Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014
 

Kürzlich hochgeladen

DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 

Kürzlich hochgeladen (20)

DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 

ML TITLE

  • 1. Yves Caseau - Machine Learning for Self Tracking – February 2019 1/10 Machine Learning Heuristics for Short TimeMachine Learning Heuristics for Short Time Series Forecasting with Quantified Self DataSeries Forecasting with Quantified Self Data Yves Caseau National Academy of Technologies
  • 2. Yves Caseau - Machine Learning for Self Tracking – February 2019 2/10 Self-Tracking and Knomee Mobile AppSelf-Tracking and Knomee Mobile App  Knomee is a self-tracking mobile app for iOS (one of many thousands)  Knomee motto: « self-tracking with sense »  Data science applied to self tracking  Self-tracking apps generate time series  One or many (up to 4) data points collected over a period of time  Data is either self-declared (the user picks a value in a preset range) or automatically imported from a a connected device (iPhone’s sensors, Apple watch or any HealthKit compatible device like a a Withings scale)  Data files are accessible on: https://github.com/ycaseau/KnomeeQuest/tree/master/data  20 samples  Ranging from 40 to 220 measures (x 4)
  • 3. Yves Caseau - Machine Learning for Self Tracking – February 2019 3/10 Quests : Causal Diagrams are proposed by the userQuests : Causal Diagrams are proposed by the user  Self-tracking is organized around causal diagrams  A quest is made of a target tracker and up to three factor trackers  The user makes the hypothesis that the factors may contribute to the target  Using Judea Peal’s notation we look for: usal  P(X | do(Y)) : impact of doing Y on X  Detect causality through active experiments  Correlation is not enough  A quest is an hypothesis, not all quests are meaningful  Factor causality is tricky (e.g. coffee as a symptom)  How to tell if the effort on factors is « worth it » ? Impact on the target  Key property of self-tracking data: some input is purely random {quest:ENERGY, icloud:true, energy:{ type:2, more:true, min:1, max:6, target:4, labels:[crisis, sleepy, lapses, normal, energetic, hyper],}, sleep:{ type:7, more:true, min:4, max:9, target:7,}, steps:{ type:4, more:true, min:0, max:19000, target:7000,}, weight:{ type:5, more:false, min:75, max:82, target:78,}, }
  • 4. Yves Caseau - Machine Learning for Self Tracking – February 2019 4/10 Short Time-Series ForecastingShort Time-Series Forecasting  Our goal in this talk : how to forecast values from self-tracking data ?  Forecasting gives a possible clue about the value of the causal hypothesis (Granger causality)  We search for a robust method that does not break with random noise  Measuring success: iterative training protocol  For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i - 1] – Apply forecast to time[i] – Measure average distance to real value TS[i] – Compare to « average » performance  Realistic simulation of what happens in the app  Why it is hard:  short samples (small data)  mixed random inputs
  • 5. Yves Caseau - Machine Learning for Self Tracking – February 2019 5/10 Classical Methods yield poor resultsClassical Methods yield poor results  Three classical ML algorithms, trained to minimize distance, using implicit time features and factors  Linear Regression  K-means Clustering (10 – 15 groups)  ARMA (AutoRegressive Moving Average)  Forecasting results are dispapointing  The difficulty is not a surprise, we are looking to extract a small amount of information, only when present  Improving a few % over average is the best we can expect  Overfitting very easily offsets the forecasting gain Linar Regression K-means ARMA forecasting 18.34% 19.5% 18.9% average 17.5% 17.5% 17.5% Distance (squares) 0.655 0.81 0.525 Random noise Linked to factors Linked to non- collected factors Random noise “good quest” “poor quest” variation
  • 6. Yves Caseau - Machine Learning for Self Tracking – February 2019 6/10 A Term-Algebra of Heuristics CombinationsA Term-Algebra of Heuristics Combinations  Heuristic toolbox  MovingAverage – MA(k,discount)  Trend (time linear regression)  Weekly and Hourly patterns  Factor regression with explicit delay  CumSum (cumulative sum of differences to average)  Threshold regression with delay  Combined through a linear algebra  Each term is a weighted combination of a few heuristics  Some other heuristics provide improvement with some quests but are left aside for lack of robustness  Cycle analysis (detecting “biorhythms”)  Split (constant until date X, then T) useful when something changed.  And(t1,t2) : Boolean conjunction of two factors Mi x[ 0. 97] ( T[ 2. 25- 2. 02/ - 1. 00] , wAvg[ " t ar get " ] ( 10, 1. 00) ) + Cor [ 0. 04] ( " t r ack2" +16)  
  • 7. Yves Caseau - Machine Learning for Self Tracking – February 2019 7/10 Distances and RegularizationDistances and Regularization Time-series operations are weighted  The weight of each measure is proportional to the distance to its next neighbor  Spaced measures are more important than repeated ones « Triangular distance »  The distance between two time series is the area between the two curves Regularization to avoid overfitting  Principle: add a penalty to the distance that reduces the overall standard deviation  best formula for this data set wDist(a,t) + max(0.0, stdev(a) – 0.02)
  • 8. Yves Caseau - Machine Learning for Self Tracking – February 2019 8/10 Randomized Incremental AlgorithmsRandomized Incremental Algorithms  Main algorithm is “Randomized Optimization” (RandOpt)  Create n random algebra terms  Combination of glutton heuristics (create the best possible term)  And randomization (coefficients / which sub-term to pick)  Depth is controlled with a global parameter  Optimized though local optimization  Each parameter of the algebra sub-terms (i.e, coefficient, delays, etc.) are optimized one by one  Hill-climbing local meta heuristics  Three successive rounds  This is used in an “incremental mode:  For each new measure  Reuse previous best term, and improve through local optimization  Run ”RandOpt” (100 iterations)  Keep best term  What has not worked out so far  Evolutionary (genetic algorithm with cross-over)  Mutation (large neighborgood local optimization)
  • 9. Yves Caseau - Machine Learning for Self Tracking – February 2019 9/10 Computational resultsComputational results  Average forecast is 16.88% (control = average is 17.5%)  Average square distance is 1.03 (worse than LR,ARMA or k-means) because of regularization  Strong measures against overfitting (regularization, depth, # local opt loops + techniques)
  • 10. Yves Caseau - Machine Learning for Self Tracking – February 2019 10/10 ConclusionConclusion Forecasting for self-tracking data is hard We presented a reinforcement generative machine learning that performs better than most classical techniques This is due to the complex nature of the data  On (classical) sales time series, ARMA does better than the proposed approach (close to LR)  Open question : how to detect the “intrinsic quality” of the quest and change the forecasting method / regularization parameters accordingly ?  You can download the data and try your own approaches  Forecasting is used to two purposes in our mobile app:  User experience : forecasting makes data entry faster + gives a sense of playfulness  Granger Causality : when the forecasting score is ”good”, this gives a sense of plausibility to the causal diagram hypothesis (represented by the “quest”)

Hinweis der Redaktion

  1. CRITICAL : print the version with Notes !