Yves Caseau - Machine Learning for Self Tracking – February 2019 1/10
Machine Learning Heuristics for Short Time Series Forecasting with Quantified Self Data
Yves Caseau
National Academy of Technologies
Self-Tracking and Knomee Mobile App
• Knomee is a self-tracking mobile app for iOS (one of many thousands)
  – Knomee motto: "self-tracking with sense"
  – Data science applied to self-tracking
• Self-tracking apps generate time series
  – One or more (up to 4) data points collected over a period of time
  – Data is either self-declared (the user picks a value in a preset range) or automatically imported from a connected device (iPhone sensors, Apple Watch, or any HealthKit-compatible device such as a Withings scale)
• Data files are available at: https://github.com/ycaseau/KnomeeQuest/tree/master/data
  – 20 samples
  – Ranging from 40 to 220 measures (x 4)
Quests: Causal Diagrams are proposed by the user
• Self-tracking is organized around causal diagrams
  – A quest is made of a target tracker and up to three factor trackers
  – The user makes the hypothesis that the factors may contribute to the target
• Using Judea Pearl's notation, we look for:
  – P(X | do(Y)): impact of doing Y on X
  – Detect causality through active experiments
  – Correlation is not enough
• A quest is a hypothesis; not all quests are meaningful
  – Factor causality is tricky (e.g. coffee as a symptom rather than a cause)
  – How to tell if the effort on factors is "worth it"? Impact on the target
• Key property of self-tracking data: some input is purely random
Example quest (ENERGY):
{quest:ENERGY, icloud:true,
energy:{
type:2, more:true,
min:1, max:6, target:4,
labels:[crisis, sleepy, lapses,
normal, energetic, hyper],},
sleep:{
type:7, more:true,
min:4, max:9, target:7,},
steps:{
type:4, more:true,
min:0, max:19000,
target:7000,},
weight:{
type:5, more:false,
min:75, max:82, target:78,},
}
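A minimal Python sketch of how the quest definition above could be held in memory. The `Tracker` class, the `normalize` helper, and the `ENERGY_QUEST` name are hypothetical and not part of Knomee. Two assumptions are made: `more` is read as "higher values are better", and the first tracker (`energy`) is taken as the target with the other three as factors. The values are copied from the example.

```python
from dataclasses import dataclass, field


@dataclass
class Tracker:
    """One tracker of a quest; fields mirror the example definition."""
    name: str
    type: int          # tracker type code; its meaning is not given in the slides
    more: bool         # assumption: True when higher values are better
    min_value: float   # "min" in the example
    max_value: float   # "max" in the example
    target: float
    labels: list = field(default_factory=list)

    def normalize(self, value: float) -> float:
        """Map a raw value to [0, 1] using the preset range (hypothetical helper)."""
        clipped = max(self.min_value, min(self.max_value, value))
        return (clipped - self.min_value) / (self.max_value - self.min_value)


# Assumption: the first tracker is the target, the others are the factors.
ENERGY_QUEST = {
    "quest": "ENERGY",
    "target": Tracker("energy", 2, True, 1, 6, 4,
                      ["crisis", "sleepy", "lapses", "normal", "energetic", "hyper"]),
    "factors": [
        Tracker("sleep", 7, True, 4, 9, 7),
        Tracker("steps", 4, True, 0, 19000, 7000),
        Tracker("weight", 5, False, 75, 82, 78),
    ],
}
```

Normalizing every tracker to the same [0, 1] scale is one way to make errors comparable across trackers with very different ranges (energy levels versus step counts).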
Short Time-Series Forecasting
• Our goal in this talk: how to forecast values from self-tracking data?
  – Forecasting gives a possible clue about the value of the causal hypothesis (Granger causality)
  – We search for a robust method that does not break with random noise
• Measuring success: iterative training protocol
  – For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i - 1])
    – Apply forecast to time[i]
    – Measure average distance to real value TS[i]
    – Compare to "average" performance
  – Realistic simulation of what happens in the app
• Why it is hard:
  – Short samples (small data)
  – Mixed random inputs
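The iterative training protocol of the forecasting slide can be sketched as a walk-forward loop. This is an illustration, not the author's code: the function names are hypothetical, indices are 0-based, values are assumed to be normalized to [0, 1] (so that an error of 0.175 reads as 17.5%), and the "average" control is assumed to be the mean of all past values.

```python
from statistics import mean


def walk_forward_error(ts, forecast):
    """Average absolute error over the last third of the series.

    `forecast(history)` only sees the values before index i and must
    return a prediction for index i, as the app would at entry time.
    """
    n = len(ts)
    start = (2 * n) // 3
    errors = [abs(forecast(ts[:i]) - ts[i]) for i in range(start, n)]
    return mean(errors)


def average_forecast(history):
    """Control forecast (assumption): the mean of everything seen so far."""
    return mean(history)


def last_value_forecast(history):
    """Naive alternative: repeat the latest measure."""
    return history[-1]


series = [0.2, 0.4, 0.2, 0.4, 0.2, 0.4]
control = walk_forward_error(series, average_forecast)
naive = walk_forward_error(series, last_value_forecast)
```

Any candidate method is then judged by comparing its `walk_forward_error` with the value obtained for `average_forecast` on the same series.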
Classical Methods yield poor results
• Three classical ML algorithms, trained to minimize distance, using implicit time features and factors
  – Linear Regression
  – K-means Clustering (10 to 15 groups)
  – ARMA (AutoRegressive Moving Average)
• Forecasting results are disappointing
  – The difficulty is not a surprise: we are looking to extract a small amount of information, only when present
  – Improving a few % over average is the best we can expect
  – Overfitting very easily offsets the forecasting gain

                     Linear Regression   K-means   ARMA
Forecasting          18.34%              19.5%     18.9%
Average              17.5%               17.5%     17.5%
Distance (squares)   0.655               0.81      0.525

[Figure: variation of the target for a "good quest" versus a "poor quest", split into random noise, variation linked to factors, and variation linked to non-collected factors]
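To make the first baseline concrete, here is a minimal sketch of linear regression reduced to a single factor, using the closed-form least-squares solution. It is a simplification for illustration: the slides mention time features and several factors, and the function names below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # A constant factor carries no information: fall back to the average.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_forecast(factor_history, target_history, factor_now):
    """Forecast the target from the current factor value."""
    slope, intercept = fit_line(factor_history, target_history)
    return slope * factor_now + intercept
```

When the factor is mostly noise, the fitted slope chases that noise, which is one way the overfitting mentioned above can cancel the forecasting gain on short series.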
A Term-Algebra of Heuristics Combinations
• Heuristic toolbox
  – MovingAverage: MA(k, discount)
  – Trend (time linear regression)
  – Weekly and Hourly patterns
  – Factor regression with explicit delay
  – CumSum (cumulative sum of differences to average)
  – Threshold regression with delay
• Combined through a linear algebra
  – Each term is a weighted combination of a few heuristics
• Some other heuristics provide improvement with some quests but are left aside for lack of robustness
  – Cycle analysis (detecting "biorhythms")
  – Split (constant until date X, then T): useful when something changed
  – And(t1, t2): Boolean conjunction of two factors
Example term:
Mix[0.97](T[2.25-2.02/-1.00], wAvg["target"](10, 1.00)) + Cor[0.04]("track2"+16)
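A minimal sketch of how two heuristics from the toolbox could be combined into a weighted term. The names (`moving_average`, `trend`, `Term`) are hypothetical. Two assumptions are made: `discount` multiplies the weight of each older value, and the measure index stands in for time (real self-tracking timestamps are irregular).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Heuristic = Callable[[Sequence[float]], float]  # history -> forecast


def moving_average(k: int, discount: float) -> Heuristic:
    """MA(k, discount): mean of the last k values, weighted by discount**age."""
    def forecast(history):
        recent = list(history[-k:])[::-1]  # most recent value first
        weights = [discount ** age for age in range(len(recent))]
        return sum(w * v for w, v in zip(weights, recent)) / sum(weights)
    return forecast


def trend() -> Heuristic:
    """Trend: least-squares line over the measure index, extended one step."""
    def forecast(history):
        n = len(history)
        if n < 2:
            return history[-1]
        mean_i, mean_v = (n - 1) / 2, sum(history) / n
        sxx = sum((i - mean_i) ** 2 for i in range(n))
        slope = sum((i - mean_i) * (v - mean_v) for i, v in enumerate(history)) / sxx
        return mean_v + slope * (n - mean_i)
    return forecast


@dataclass
class Term:
    """A weighted combination of a few heuristics."""
    weights: List[float]
    parts: List[Heuristic]

    def forecast(self, history):
        return sum(w * h(history) for w, h in zip(self.weights, self.parts))


term = Term([0.5, 0.5], [moving_average(3, 1.0), trend()])
```

Because every heuristic shares the same `history -> forecast` signature, new heuristics (factor regression with delay, CumSum, and so on) can be added to a term without changing `Term` itself.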
Distances and Regularization
• Time-series operations are weighted
  – The weight of each measure is proportional to the distance to its next neighbor
  – Spaced measures are more important than repeated ones
• "Triangular distance"
  – The distance between two time series is the area between the two curves
• Regularization to avoid overfitting
  – Principle: add a penalty to the distance that reduces the overall standard deviation
  – Best formula for this data set:
wDist(a,t) + max(0.0, stdev(a) - 0.02)
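The weighting and the regularized formula can be sketched as follows. This is an interpretation under stated assumptions, not the author's implementation: `a` is taken to be the forecast series and `t` the target series, the last measure reuses the previous time gap as its weight, the area between the curves is approximated by a weighted mean absolute difference, and `stdev` is computed as the population standard deviation. All function names are hypothetical.

```python
from statistics import pstdev


def gap_weights(times):
    """Weights proportional to the time gap to the next measure.

    Assumption: the last measure has no next neighbor, so it reuses
    the previous gap. Requires at least two measures.
    """
    gaps = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    gaps.append(gaps[-1])
    total = sum(gaps)
    return [g / total for g in gaps]


def w_dist(a, t, times):
    """Weighted mean absolute difference between two series.

    A rectangle-rule estimate of the area between the curves,
    divided by the total duration.
    """
    return sum(w * abs(x - y) for w, x, y in zip(gap_weights(times), a, t))


def regularized_dist(a, t, times, slack=0.02):
    """wDist(a,t) + max(0.0, stdev(a) - 0.02), with the slack as a parameter."""
    return w_dist(a, t, times) + max(0.0, pstdev(a) - slack)
```

The penalty only applies once the forecast varies by more than the slack, so a nearly flat forecast is free while a forecast that swings widely must earn its variation through a lower distance.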
Randomized Incremental Algorithms
• Main algorithm is "Randomized Optimization" (RandOpt)
  – Create n random algebra terms
    – Combination of greedy heuristics (create the best possible term)
    – And randomization (coefficients / which sub-term to pick)
    – Depth is controlled with a global parameter
  – Optimized through local optimization
    – Each parameter of the algebra sub-terms (i.e. coefficients, delays, etc.) is optimized one by one
    – Hill-climbing local meta-heuristics
    – Three successive rounds
• This is used in an "incremental mode":
  – For each new measure
    – Reuse previous best term, and improve it through local optimization
    – Run "RandOpt" (100 iterations)
    – Keep best term
• What has not worked out so far
  – Evolutionary (genetic algorithm with cross-over)
  – Mutation (large neighborhood local optimization)
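A heavily simplified sketch of the RandOpt loop described above. Terms are reduced here to flat coefficient vectors, so term depth and sub-term selection are not modeled; "100 iterations" is read as 100 random starting terms; halving the step between rounds is a design choice of this sketch. `hill_climb`, `rand_opt`, and `seed_term` are hypothetical names, and `score` stands for any distance to minimize (for example a regularized distance).

```python
import random


def hill_climb(weights, score, step=0.1, rounds=3):
    """Optimize coefficients one by one; keep a move only if the score drops."""
    best, best_score = list(weights), score(weights)
    for _ in range(rounds):  # three successive rounds
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_score = score(trial)
                if trial_score < best_score:
                    best, best_score = trial, trial_score
        step /= 2  # assumption: finer moves in later rounds
    return best, best_score


def rand_opt(n_weights, score, iterations=100, seed_term=None, rng=None):
    """Random terms plus local optimization; returns (best_term, best_score).

    In incremental mode, `seed_term` is the best term kept from the
    previous measure, so it competes with the new random terms.
    """
    rng = rng or random.Random(0)
    candidates = [seed_term] if seed_term is not None else []
    candidates += [[rng.uniform(-1, 1) for _ in range(n_weights)]
                   for _ in range(iterations)]
    return min((hill_climb(c, score) for c in candidates), key=lambda r: r[1])
```

Reusing the previous best term as a seed means the incremental result can never score worse on the new data than the locally improved old term, which keeps forecasts stable from one measure to the next.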
Computational results
• Average forecast is 16.88% (control = average is 17.5%)
• Average square distance is 1.03 (worse than LR, ARMA or k-means) because of regularization
• Strong measures against overfitting (regularization, depth, # local opt loops + techniques)
Conclusion
• Forecasting for self-tracking data is hard
• We presented a reinforcement generative machine learning method that performs better than most classical techniques
• This is due to the complex nature of the data
  – On (classical) sales time series, ARMA does better than the proposed approach (close to LR)
  – Open question: how to detect the "intrinsic quality" of the quest and change the forecasting method / regularization parameters accordingly?
  – You can download the data and try your own approaches
• Forecasting is used for two purposes in our mobile app:
  – User experience: forecasting makes data entry faster and gives a sense of playfulness
  – Granger causality: when the forecasting score is "good", this gives a sense of plausibility to the causal diagram hypothesis (represented by the "quest")
Weitere ähnliche Inhalte

Ähnlich wie Machine Learning for Self-Tracking

Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...ijistjournal
 
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...ijistjournal
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsGIScRG
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsAM Publications
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMSentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMIRJET Journal
 
Search Engines
Search EnginesSearch Engines
Search Enginesbutest
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET Journal
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819HODCSE21
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learningbutest
 
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET Journal
 
IRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET Journal
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningcsandit
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGcsandit
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...cscpconf
 
The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...IJERA Editor
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationIJECEIAES
 

Ähnlich wie Machine Learning for Self-Tracking (20)

Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
 
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational Experiments
 
presentationIDC - 14MAY2015
presentationIDC - 14MAY2015presentationIDC - 14MAY2015
presentationIDC - 14MAY2015
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning Algorithms
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMSentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVM
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learning
 
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
 
IRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine Learning
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion mining
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
 
The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classication
 

Mehr von Yves Caseau

DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022Yves Caseau
 
Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Yves Caseau
 
Information Systems for Digital Transformation
Information Systems for Digital TransformationInformation Systems for Digital Transformation
Information Systems for Digital TransformationYves Caseau
 
Lean from the guts
Lean from the gutsLean from the guts
Lean from the gutsYves Caseau
 
Taking advantageofai july2018
Taking advantageofai july2018Taking advantageofai july2018
Taking advantageofai july2018Yves Caseau
 
Software Pitch 2018
Software Pitch 2018Software Pitch 2018
Software Pitch 2018Yves Caseau
 
Intelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAIntelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAYves Caseau
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureYves Caseau
 
XEBICON Public November 2015
XEBICON Public November 2015XEBICON Public November 2015
XEBICON Public November 2015Yves Caseau
 
Smart selfnovember2013
Smart selfnovember2013Smart selfnovember2013
Smart selfnovember2013Yves Caseau
 
Management socialnetworksfeb2012
Management socialnetworksfeb2012Management socialnetworksfeb2012
Management socialnetworksfeb2012Yves Caseau
 
Google socialnetworksmarch08
Google socialnetworksmarch08Google socialnetworksmarch08
Google socialnetworksmarch08Yves Caseau
 
Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Yves Caseau
 
Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Yves Caseau
 
Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Yves Caseau
 
The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014Yves Caseau
 
Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Yves Caseau
 

Mehr von Yves Caseau (20)

CCEM2023.pptx
CCEM2023.pptxCCEM2023.pptx
CCEM2023.pptx
 
DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022
 
Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3
 
Information Systems for Digital Transformation
Information Systems for Digital TransformationInformation Systems for Digital Transformation
Information Systems for Digital Transformation
 
Lean from the guts
Lean from the gutsLean from the guts
Lean from the guts
 
Taking advantageofai july2018
Taking advantageofai july2018Taking advantageofai july2018
Taking advantageofai july2018
 
Software Pitch 2018
Software Pitch 2018Software Pitch 2018
Software Pitch 2018
 
Intelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAIntelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIA
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT Architecture
 
XEBICON Public November 2015
XEBICON Public November 2015XEBICON Public November 2015
XEBICON Public November 2015
 
Smart selfnovember2013
Smart selfnovember2013Smart selfnovember2013
Smart selfnovember2013
 
Management socialnetworksfeb2012
Management socialnetworksfeb2012Management socialnetworksfeb2012
Management socialnetworksfeb2012
 
Google socialnetworksmarch08
Google socialnetworksmarch08Google socialnetworksmarch08
Google socialnetworksmarch08
 
Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance
 
Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013
 
Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015
 
GTES UTC 2014
GTES  UTC 2014GTES  UTC 2014
GTES UTC 2014
 
The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014
 
Disic mars2014
Disic mars2014Disic mars2014
Disic mars2014
 
Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014
 

Kürzlich hochgeladen

FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfFORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfSuchita Rawat
 
Heads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfHeads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfbyp19971001
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxKyawThanTint
 
Fun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfFun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfhoangquan21999
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfPharmatech-rx
 
NuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdfNuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdfpablovgd
 
GBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interactionGBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interactionAreesha Ahmad
 
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptxPOST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptxArpitaMishra69
 
MSC IV_Forensic medicine - Mechanical injuries.pdf
MSC IV_Forensic medicine - Mechanical injuries.pdfMSC IV_Forensic medicine - Mechanical injuries.pdf
MSC IV_Forensic medicine - Mechanical injuries.pdfSuchita Rawat
 
Costs to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of UgandaCosts to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of UgandaTimothyOkuna
 
RACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxRACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxArunLakshmiMeenakshi
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!University of Hertfordshire
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfStart Project
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...kevin8smith
 
A Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert EinsteinA Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert Einsteinxgamestudios8
 
PARENTAL CARE IN FISHES.pptx for 5th sem
PARENTAL CARE IN FISHES.pptx for 5th semPARENTAL CARE IN FISHES.pptx for 5th sem
PARENTAL CARE IN FISHES.pptx for 5th semborkhotudu123
 
Heat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysHeat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysBrahmesh Reddy B R
 
Introduction and significance of Symbiotic algae
Introduction and significance of  Symbiotic algaeIntroduction and significance of  Symbiotic algae
Introduction and significance of Symbiotic algaekushbuR
 
In-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxIn-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxMAGOTI ERNEST
 
GBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismGBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismAreesha Ahmad
 

Kürzlich hochgeladen (20)

FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdfFORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
FORENSIC CHEMISTRY ARSON INVESTIGATION.pdf
 
Heads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfHeads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdf
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 
Fun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfFun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdf
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdf
 
NuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdfNuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdf
 
GBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interactionGBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interaction
 
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptxPOST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
 
MSC IV_Forensic medicine - Mechanical injuries.pdf
MSC IV_Forensic medicine - Mechanical injuries.pdfMSC IV_Forensic medicine - Mechanical injuries.pdf
MSC IV_Forensic medicine - Mechanical injuries.pdf
 
Costs to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of UgandaCosts to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of Uganda
 
RACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxRACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptx
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdf
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
 
A Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert EinsteinA Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert Einstein
 
PARENTAL CARE IN FISHES.pptx for 5th sem
PARENTAL CARE IN FISHES.pptx for 5th semPARENTAL CARE IN FISHES.pptx for 5th sem
PARENTAL CARE IN FISHES.pptx for 5th sem
 
Heat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysHeat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree days
 
Introduction and significance of Symbiotic algae
Introduction and significance of  Symbiotic algaeIntroduction and significance of  Symbiotic algae
Introduction and significance of Symbiotic algae
 
In-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptxIn-pond Race way systems for Aquaculture (IPRS).pptx
In-pond Race way systems for Aquaculture (IPRS).pptx
 
GBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismGBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) Metabolism
 

Machine Learning for Self-Tracking

  • 1. Yves Caseau - Machine Learning for Self Tracking – February 2019 1/10 Machine Learning Heuristics for Short TimeMachine Learning Heuristics for Short Time Series Forecasting with Quantified Self DataSeries Forecasting with Quantified Self Data Yves Caseau National Academy of Technologies
  • 2. Yves Caseau - Machine Learning for Self Tracking – February 2019 2/10 Self-Tracking and Knomee Mobile AppSelf-Tracking and Knomee Mobile App  Knomee is a self-tracking mobile app for iOS (one of many thousands)  Knomee motto: « self-tracking with sense »  Data science applied to self tracking  Self-tracking apps generate time series  One or many (up to 4) data points collected over a period of time  Data is either self-declared (the user picks a value in a preset range) or automatically imported from a a connected device (iPhone’s sensors, Apple watch or any HealthKit compatible device like a a Withings scale)  Data files are accessible on: https://github.com/ycaseau/KnomeeQuest/tree/master/data  20 samples  Ranging from 40 to 220 measures (x 4)
  • 3. Quests: Causal Diagrams are proposed by the user
      - Self-tracking is organized around causal diagrams
          - A quest is made of a target tracker and up to three factor trackers
          - The user makes the hypothesis that the factors may contribute to the target
      - Using Judea Pearl's notation, we look for:
          - P(X | do(Y)): impact of doing Y on X
          - Detect causality through active experiments
          - Correlation is not enough
      - A quest is a hypothesis; not all quests are meaningful
          - Factor causality is tricky (e.g. coffee as a symptom)
          - How to tell if the effort on factors is « worth it »? Impact on the target
      - Key property of self-tracking data: some input is purely random

      {quest:ENERGY, icloud:true,
       energy:{ type:2, more:true, min:1, max:6, target:4,
                labels:[crisis, sleepy, lapses, normal, energetic, hyper],},
       sleep:{ type:7, more:true, min:4, max:9, target:7,},
       steps:{ type:4, more:true, min:0, max:19000, target:7000,},
       weight:{ type:5, more:false, min:75, max:82, target:78,},
      }
  • 4. Short Time-Series Forecasting
      - Our goal in this talk: how to forecast values from self-tracking data?
          - Forecasting gives a possible clue about the value of the causal hypothesis (Granger causality)
          - We search for a robust method that does not break with random noise
      - Measuring success: iterative training protocol
          - For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i - 1])
              - Apply forecast to time[i]
              - Measure average distance to real value TS[i]
              - Compare to « average » performance
          - Realistic simulation of what happens in the app
      - Why it is hard:
          - short samples (small data)
          - mixed random inputs
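
The iterative protocol on this slide can be sketched as a walk-forward evaluation. This is a minimal illustration, not the author's code: the function names, the 0-based indexing, and the use of absolute error as the "distance" are assumptions.

```python
from statistics import mean

def walk_forward_score(ts, forecaster):
    """Forecast ts[i] from ts[:i] for the last third of the series
    and return the mean absolute error (assumed distance measure)."""
    n = len(ts)
    start = (2 * n) // 3  # corresponds to i = 2N/3 on the slide
    errors = [abs(forecaster(ts[:i]) - ts[i]) for i in range(start, n)]
    return mean(errors)

def average_forecaster(history):
    """Baseline: always predict the average of what has been seen so far."""
    return mean(history)

def last_value_forecaster(history):
    """Naive alternative: repeat the most recent value."""
    return history[-1]

series = [1, 2, 3, 4, 5, 6, 7, 8, 9]
baseline = walk_forward_score(series, average_forecaster)
naive = walk_forward_score(series, last_value_forecaster)
```

A candidate method is then judged by comparing its score to `baseline`, which plays the role of the « average » performance.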
  • 5. Classical Methods yield poor results
      - Three classical ML algorithms, trained to minimize distance, using implicit time features and factors
          - Linear Regression
          - K-means Clustering (10 to 15 groups)
          - ARMA (AutoRegressive Moving Average)
      - Forecasting results are disappointing
          - The difficulty is not a surprise: we are looking to extract a small amount of information, only when present
          - Improving by a few percent over the average is the best we can expect
          - Overfitting very easily offsets the forecasting gain

                            Linear Regression   K-means   ARMA
      forecasting           18.34%              19.5%     18.9%
      average               17.5%               17.5%     17.5%
      Distance (squares)    0.655               0.81      0.525

      [Figure: variation in a "good quest" vs. a "poor quest", split into parts linked to factors, linked to non-collected factors, and random noise]
  • 6. A Term-Algebra of Heuristics Combinations
      - Heuristic toolbox
          - MovingAverage: MA(k,discount)
          - Trend (time linear regression)
          - Weekly and Hourly patterns
          - Factor regression with explicit delay
          - CumSum (cumulative sum of differences to average)
          - Threshold regression with delay
      - Combined through a linear algebra
          - Each term is a weighted combination of a few heuristics
      - Some other heuristics provide improvement with some quests but are left aside for lack of robustness
          - Cycle analysis (detecting "biorhythms")
          - Split (constant until date X, then T), useful when something changed
          - And(t1,t2): Boolean conjunction of two factors

      Example term:
      Mix[0.97](T[2.25-2.02/-1.00], wAvg["target"](10, 1.00)) + Cor[0.04]("track2" +16)
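
Two of the toolbox heuristics, and the weighted combination of terms, can be sketched as follows. The slides do not define MA(k,discount) or CumSum precisely, so the geometric discounting, the function names, and the `combine` helper are assumptions made for illustration only.

```python
def moving_average(history, k, discount):
    """Discounted average of the last k values.
    Assumption: the most recent value has weight 1 and each older
    value is multiplied by `discount` once more."""
    window = history[-k:]
    weights = [discount ** age for age in range(len(window) - 1, -1, -1)]
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)

def cumsum_to_average(history):
    """Cumulative sum of the differences between each value and the series average."""
    avg = sum(history) / len(history)
    total, out = 0.0, []
    for value in history:
        total += value - avg
        out.append(total)
    return out

def combine(weighted_forecasts):
    """Linear combination of heuristic forecasts given as (weight, forecast) pairs."""
    return sum(weight * forecast for weight, forecast in weighted_forecasts)

history = [3.0, 4.0, 4.0, 5.0]
forecast = combine([
    (0.7, moving_average(history, k=3, discount=0.8)),
    (0.3, sum(history) / len(history)),  # plain average as a second term
])
```

The weights 0.7 and 0.3 are arbitrary here; in the talk, such coefficients are the quantities tuned by the optimization described on slide 8.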
  • 7. Distances and Regularization
      - Time-series operations are weighted
          - The weight of each measure is proportional to the distance to its next neighbor
          - Spaced measures are more important than repeated ones
      - « Triangular distance »
          - The distance between two time series is the area between the two curves
      - Regularization to avoid overfitting
          - Principle: add a penalty to the distance that reduces the overall standard deviation
          - Best formula for this data set: wDist(a,t) + max(0.0, stdev(a) - 0.02)
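
A rough sketch of the area-based distance and the regularized score is given below. The slides give only the formula wDist(a,t) + max(0.0, stdev(a) - 0.02); the trapezoid approximation, the normalization by duration, the use of the population standard deviation, and the function names are all assumptions. The 0.02 threshold suggests values scaled to a small range such as [0, 1], which is also assumed here.

```python
from statistics import pstdev

def area_distance(times, a, t):
    """Approximate area between curves a and t, divided by the total duration.
    Uses the trapezoid rule on |a - t|, so widely spaced measures weigh more
    than tightly repeated ones. (Inexact where the curves cross between samples.)"""
    gaps = [abs(x - y) for x, y in zip(a, t)]
    area = sum(
        (times[i + 1] - times[i]) * (gaps[i] + gaps[i + 1]) / 2
        for i in range(len(times) - 1)
    )
    return area / (times[-1] - times[0])

def regularized_distance(times, a, t, threshold=0.02):
    """Distance plus a penalty on the standard deviation of the forecast a
    once it exceeds the threshold, as in the formula on the slide."""
    return area_distance(times, a, t) + max(0.0, pstdev(a) - threshold)

times = [0.0, 1.0, 3.0]
flat_forecast = [0.5, 0.5, 0.5]
target = [0.4, 0.6, 0.5]
score = regularized_distance(times, flat_forecast, target)
```

A constant forecast has zero standard deviation and pays no penalty, which is how this kind of term pushes the search toward smoother, less overfitted forecasts.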
  • 8. Randomized Incremental Algorithms
      - Main algorithm is "Randomized Optimization" (RandOpt)
          - Create n random algebra terms
          - Combination of greedy heuristics (create the best possible term)
          - And randomization (coefficients / which sub-term to pick)
          - Depth is controlled with a global parameter
      - Optimized through local optimization
          - Each parameter of the algebra sub-terms (i.e., coefficients, delays, etc.) is optimized one by one
          - Hill-climbing local meta-heuristics
          - Three successive rounds
      - This is used in an "incremental mode":
          - For each new measure
          - Reuse previous best term, and improve through local optimization
          - Run "RandOpt" (100 iterations)
          - Keep best term
      - What has not worked out so far
          - Evolutionary (genetic algorithm with cross-over)
          - Mutation (large-neighborhood local optimization)
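
The overall loop (random candidates, one-parameter-at-a-time hill climbing, reuse of the previous best) can be sketched on plain parameter vectors. This is a simplified stand-in: the actual method builds algebra terms rather than fixed-length vectors, and the step size, sampling range, and function names are assumptions.

```python
import random

def hill_climb(params, loss, step=0.1, rounds=3):
    """Optimize parameters one by one; a move is kept only if it lowers the loss."""
    best = list(params)
    best_loss = loss(best)
    for _ in range(rounds):  # "three successive rounds"
        for i in range(len(best)):
            improved = True
            while improved:
                improved = False
                for delta in (step, -step):
                    candidate = best.copy()
                    candidate[i] += delta
                    candidate_loss = loss(candidate)
                    if candidate_loss < best_loss:
                        best, best_loss = candidate, candidate_loss
                        improved = True
                        break
    return best, best_loss

def rand_opt(loss, n_params, iterations=100, seed=0, start=None):
    """Draw random candidates, locally optimize each, and keep the best.
    Passing `start` mimics the incremental mode: the previous best is
    re-optimized first and only replaced by something strictly better."""
    rng = random.Random(seed)
    if start is None:
        best, best_loss = None, float("inf")
    else:
        best, best_loss = hill_climb(start, loss)
    for _ in range(iterations):
        candidate = [rng.uniform(-1.0, 1.0) for _ in range(n_params)]
        candidate, candidate_loss = hill_climb(candidate, loss)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

def toy_loss(p):
    """Toy objective with its minimum at (0.5, -0.25)."""
    return (p[0] - 0.5) ** 2 + (p[1] + 0.25) ** 2

best, best_loss = rand_opt(toy_loss, n_params=2, iterations=20)
```

In the app setting, `loss` would be the regularized distance of a term's forecasts on the data seen so far, and `rand_opt` would be rerun with `start=best` each time a new measure arrives.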
  • 9. Computational results
      - Average forecast is 16.88% (control = average is 17.5%)
      - Average square distance is 1.03 (worse than LR, ARMA or k-means) because of regularization
      - Strong measures against overfitting (regularization, depth, # local opt loops + techniques)
  • 10. Conclusion
      - Forecasting for self-tracking data is hard
      - We presented a reinforcement generative machine learning approach that performs better than most classical techniques
      - This is due to the complex nature of the data
          - On (classical) sales time series, ARMA does better than the proposed approach (close to LR)
          - Open question: how to detect the "intrinsic quality" of the quest and change the forecasting method / regularization parameters accordingly?
          - You can download the data and try your own approaches
      - Forecasting is used for two purposes in our mobile app:
          - User experience: forecasting makes data entry faster and gives a sense of playfulness
          - Granger causality: when the forecasting score is "good", this gives a sense of plausibility to the causal diagram hypothesis (represented by the "quest")

Editor's notes

  1. CRITICAL: print the version with notes!