This document discusses machine learning heuristics for short-term forecasting of time series data from self-tracking apps. It shows that classical forecasting methods such as linear regression, k-means clustering, and ARMA perform poorly on this type of noisy data. The document then presents a toolbox of forecasting heuristics and a randomized incremental algorithm that combines them using a term algebra. This approach achieves better average forecast accuracy than classical methods by addressing overfitting through regularization and other techniques. Forecasting is used in self-tracking apps to improve the user experience and to provide clues about the plausibility of causal hypotheses.
Machine Learning Heuristics for Short Time Series Forecasting with Quantified Self Data
Yves Caseau
National Academy of Technologies
Self-Tracking and Knomee Mobile App
Knomee is a self-tracking mobile app for iOS (one of many
thousands)
Knomee motto: « self-tracking with sense »
Data science applied to self tracking
Self-tracking apps generate time series
One to four values collected at each measurement over a period of time
Data is either self-declared (the user picks a value in a preset range) or automatically imported from a connected device (iPhone sensors, Apple Watch, or any HealthKit-compatible device such as a Withings scale)
Data files are accessible on:
https://github.com/ycaseau/KnomeeQuest/tree/master/data
20 samples
Ranging from 40 to 220 measures (x 4)
Quests: Causal Diagrams are proposed by the user
Self-tracking is organized around causal diagrams
A quest is made of a target tracker and up to three
factor trackers
The user makes the hypothesis that the factors may
contribute to the target
Using Judea Pearl's causal notation, we look for:
P(X | do(Y)): the impact of doing Y on X
Detect causality through active experiments
Correlation is not enough
A quest is a hypothesis; not all quests are meaningful
Factor causality is tricky (e.g. coffee as a symptom)
How to tell if the effort on factors is « worth it »?
Impact on the target
Key property of self-tracking data:
some input is purely random
{quest: ENERGY, icloud: true,
 energy: {
   type: 2, more: true,
   min: 1, max: 6, target: 4,
   labels: [crisis, sleepy, lapses, normal, energetic, hyper]},
 sleep: {
   type: 7, more: true,
   min: 4, max: 9, target: 7},
 steps: {
   type: 4, more: true,
   min: 0, max: 19000, target: 7000},
 weight: {
   type: 5, more: false,
   min: 75, max: 82, target: 78}
}
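A quest like the one above maps naturally onto a small data structure. A minimal Python sketch, with field names taken from the sample file (this is an interpretation, not Knomee's actual code):

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Tracker:
    # One tracked variable: bounds, desired value, optional value labels
    type: int
    more: bool                          # assumption: True if higher is better
    min: float
    max: float
    target: float
    labels: Optional[List[str]] = None  # only for self-declared scales

@dataclass
class Quest:
    # A causal hypothesis: one target tracker + up to three factor trackers
    name: str
    target: Tracker
    factors: List[Tracker] = field(default_factory=list)  # at most 3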
Short Time-Series Forecasting
Our goal in this talk: how to forecast values from self-tracking data?
Forecasting gives a possible clue about the value of the causal hypothesis
(Granger causality)
We search for a robust method that does not break with random noise
Measuring success: iterative training protocol (sketched in code below)
For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i-1])
– Apply the forecast at time[i]
– Measure the average distance to the real value TS[i]
– Compare to the « average » baseline performance
Realistic simulation of what happens in the app
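A minimal Python sketch of this protocol; `forecast` is a placeholder for any of the methods compared in this deck:

import numpy as np

def evaluate(ts, forecast):
    """Walk-forward protocol: forecast ts[i] from ts[:i] for i in 2N/3..N."""
    n = len(ts)
    errors, control = [], []
    for i in range(2 * n // 3, n):
        pred = forecast(ts[:i])                       # train on the prefix only
        errors.append(abs(pred - ts[i]))              # distance to the real value
        control.append(abs(np.mean(ts[:i]) - ts[i]))  # the "average" baseline
    return np.mean(errors), np.mean(control)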
Why it is hard:
short samples (small data)
mixed random inputs
Classical Methods yield poor results
Three classical ML algorithms, trained to
minimize distance, using implicit time
features and factors
Linear Regression
K-means Clustering (10 – 15 groups)
ARMA (AutoRegressive Moving Average)
Forecasting results are disappointing
The difficulty is not a surprise: we are trying to extract a small amount of information, which is only sometimes present
Improving a few percent over the average is the best we can expect
Overfitting very easily offsets the forecasting
gain
                     Linear Regression   K-means   ARMA
forecasting               18.34%          19.5%    18.9%
average                   17.5%           17.5%    17.5%
distance (squares)        0.655           0.81     0.525
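For reference, a sketch of how the ARMA baseline could be plugged into the walk-forward protocol above using statsmodels (the order (1, 0, 1) is an assumption; the deck does not give the exact hyper-parameters):

from statsmodels.tsa.arima.model import ARIMA

def arma_forecast(prefix):
    # ARMA(p, q) is ARIMA with d = 0; fit on the prefix, predict one step
    res = ARIMA(prefix, order=(1, 0, 1)).fit()
    return float(res.forecast(steps=1)[0])

# used with the protocol above: evaluate(ts, arma_forecast)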
[Diagram: the observed variation splits into random noise, variation linked to the tracked factors, and variation linked to non-collected factors; a « good quest » has a large factor-linked share, a « poor quest » is mostly random noise]
A Term-Algebra of Heuristics Combinations
Heuristic toolbox
MovingAverage – MA(k,discount)
Trend (time linear regression)
Weekly and Hourly patterns
Factor regression with explicit delay
CumSum (cumulative sum of differences to average)
Threshold regression with delay
Combined through a linear term algebra (see the sketch below)
Each term is a weighted combination of a few heuristics
Some other heuristics provide improvement with some quests but are left aside for lack
of robustness
Cycle analysis (detecting “biorhythms”)
Split (constant until date X, then T)
useful when something changed.
And(t1,t2) : Boolean conjunction of two factors
Mix[0.97](
  T[2.25-2.02/-1.00],
  wAvg["target"](10,1.00))
+ Cor[0.04]("track2"+16)
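A minimal Python sketch of what such an algebra can look like (names mirror the toolbox above; the exact semantics in Knomee may differ):

from typing import Callable, List

Series = List[float]                    # measures ordered by time
Heuristic = Callable[[Series], float]   # prefix -> forecast of next value

def moving_average(k: int, discount: float) -> Heuristic:
    """MA(k, discount): discounted mean of the last k measures."""
    def f(ts: Series) -> float:
        window = ts[-k:]
        weights = [discount ** (len(window) - 1 - i) for i in range(len(window))]
        return sum(w * x for w, x in zip(weights, window)) / sum(weights)
    return f

def trend() -> Heuristic:
    """Trend: linear regression on time, extrapolated one step ahead."""
    def f(ts: Series) -> float:
        n = len(ts)
        if n < 2:
            return ts[-1]
        xbar, ybar = (n - 1) / 2, sum(ts) / n
        slope = (sum((x - xbar) * (y - ybar) for x, y in enumerate(ts))
                 / sum((x - xbar) ** 2 for x in range(n)))
        return ybar + slope * (n - xbar)
    return f

def mix(w: float, t1: Heuristic, t2: Heuristic) -> Heuristic:
    """Mix[w](t1, t2): weighted linear combination of two sub-terms."""
    return lambda ts: w * t1(ts) + (1 - w) * t2(ts)

# e.g. a term shaped like the printed example above:
term = mix(0.97, trend(), moving_average(10, 1.0))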
Distances and Regularization
Time-series operations are weighted
The weight of each measure is proportional to the
distance to its next neighbor
Spaced measures are more important than repeated
ones
« Triangular distance »
The distance between two time series is the area
between the two curves
Regularization to avoid overfitting
Principle: add a penalty to the distance that reduces
the overall standard deviation
Best formula found for this data set (sketched below):
wDist(a,t) + max(0.0, stdev(a) - 0.02)
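A sketch of these two ingredients in Python; the trapezoid reading of the « triangular distance » is an assumption based on the area-between-curves description:

import numpy as np

def triangular_distance(times, a, b):
    """Area between two curves sampled at the same (irregular) times."""
    gap = np.diff(times)                      # spaced measures weigh more
    diff = np.abs(np.asarray(a) - np.asarray(b))
    return float(np.sum(gap * 0.5 * (diff[:-1] + diff[1:])))  # trapezoids

def regularized_distance(times, forecast, actual):
    # penalty shrinks the forecast's spread to fight overfitting
    penalty = max(0.0, float(np.std(forecast)) - 0.02)
    return triangular_distance(times, forecast, actual) + penalty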
Randomized Incremental Algorithms
Main algorithm is “Randomized Optimization” (RandOpt)
Create n random algebra terms
Combination of greedy heuristics (create the best possible term)
And randomization (coefficients / which sub-term to pick)
Depth is controlled with a global parameter
Optimized through local optimization
Each parameter of the algebra sub-terms (i.e., coefficients, delays, etc.) is optimized one by one
Hill-climbing local meta-heuristic
Three successive rounds
This is used in an "incremental mode" (sketched below):
For each new measure
Reuse previous best term, and improve through local optimization
Run "RandOpt" (100 iterations)
Keep best term
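A runnable sketch of this skeleton, with terms reduced to flat parameter vectors for simplicity (the real search works over algebra terms, not vectors):

import random

def hill_climb(params, score, rounds=3, step=0.1):
    """Optimize each parameter one by one, over three successive rounds."""
    best, best_s = list(params), score(params)
    for _ in range(rounds):
        for i in range(len(best)):
            for delta in (-step, step):
                cand = list(best)
                cand[i] += delta
                if (s := score(cand)) < best_s:
                    best, best_s = cand, s
    return best

def rand_opt(score, dim, n=100, previous_best=None):
    """n random terms + the previous best (incremental mode), each improved."""
    pool = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    if previous_best is not None:
        pool.append(previous_best)
    return min((hill_climb(p, score) for p in pool), key=score)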
What has not worked out so far
Evolutionary (genetic algorithm with cross-over)
Mutation (large-neighborhood local optimization)
Computational results
Average forecast error is 16.88% (control: the « average » baseline scores 17.5%)
Average square distance is 1.03 (worse than LR, ARMA, or k-means) because of regularization
Strong measures against overfitting (regularization, controlled term depth, a limited number of local optimization loops, plus other techniques)
Conclusion
Forecasting for self-tracking data is hard
We presented a reinforcement-based, generative machine learning approach that performs better than most classical techniques
This is due to the complex nature of the data
On (classical) sales time series, ARMA does better than the proposed approach
(close to LR)
Open question: how to detect the "intrinsic quality" of the quest and change the forecasting method / regularization parameters accordingly?
You can download the data and try your own approaches
Forecasting is used for two purposes in our mobile app:
User experience: forecasting makes data entry faster + gives a sense of playfulness
Granger causality: when the forecasting score is "good", this gives a sense of plausibility to the causal diagram hypothesis (represented by the "quest")