Blog with video: https://elementary-science.netlify.com/time-series-prediction-gpu/
Time series prediction is crucial for a wide range of practical applications, from industrial capacity planning to trading and wealth management, and particularly for automatic anomaly detection in business metrics, where Anodot specializes. How do we perform this task? Which algorithm should we use? How can we make it fast enough for real-time applications? In this talk I present an elegant strategy and explain how to use a GPU to perform this task in real time. I discuss the performance and precision of the solution, demonstrate practical applications, and compare different methods.
5. 5
SELECTED CUSTOMERS
Pedro Silva, Senior product manager,
Credit Karma
“It used to take us up to several days to
identify an issue on a specific page, offer,
or service that was draining our revenues.
Anodot identifies when a metric increases
or decreases in real time, so we can
resolve it quickly, before business suffers
or revenue is lost.”
7. 7
OVERVIEW
• What I will not talk about today:
o Model identification: how to find the model that fits the observed data
o Estimation: how to find the parameters of the model from the observed data
o Politics, cinema…
• What I will talk about today:
o I already have a model; how do I forecast future values?
• How?
o I choose a “toy model”
o I will compare two prediction methodologies (mathematics / algorithms)
o I will compare two code implementations (engineering)
8. 8
THE PREDICTION TASK
• Depends on the horizon
• Most of the time we need only the expectation
• The prediction error is always useful
• For some use cases, the tails are important
• The most general case is the distribution as a function of time: all needed values can easily be computed from the distribution.
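As a minimal sketch of the last bullet, assume the forecast distribution at one horizon is available as a set of samples. The Gaussian sample below is made up for illustration and is not data from the talk; the expectation, the prediction error, and tail bounds then all follow from the same array.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical forecast distribution at one horizon (illustrative values only).
samples = rng.normal(loc=10.0, scale=2.0, size=100_000)

expectation = samples.mean()                          # point forecast
error = samples.std(ddof=1)                           # prediction error (standard deviation)
lower, upper = np.quantile(samples, [0.001, 0.999])   # tail bounds

print(f"expectation = {expectation:.2f}, error = {error:.2f}")
print(f"99.8% interval: [{lower:.2f}, {upper:.2f}]")
```

The tail quantiles are the part that a point forecast alone cannot give, which is why the full distribution is the most general output.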
9. 9
THE “TOY MODEL”
• For mathematicians: Ornstein-Uhlenbeck
• For physicists: Langevin or Einstein model (kinetic theory of gases)
• For bankers: Vasicek model (interest rate model)
• For Anodot?
Annotations on the model equation:
o Small increment of the process
o Small increment of a Brownian motion (Gaussian noise)
o Deterministic part
o Random part (noise model)
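For reference, the Ornstein-Uhlenbeck process is usually written as the stochastic differential equation below. The symbols are an assumed notation, since the slide's own equation did not survive extraction: θ > 0 is the mean-reversion speed, μ the long-run mean, σ the noise amplitude, and W_t a Brownian motion.

```latex
dX_t = \underbrace{\theta\,(\mu - X_t)\,dt}_{\text{deterministic part}}
     + \underbrace{\sigma\,dW_t}_{\text{random part}}
```

Here dX_t is the small increment of the process and dW_t the small increment of the Brownian motion.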
11. 11
THE IDEA
• We discretize the continuous model
• We simulate thousands of trajectories with a random number generator
• For each time slice, we compute the histogram
✓ This approximates the distribution if there are enough trajectories
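The steps above can be sketched in NumPy as follows. This is an illustration under assumed parameters (theta, mu, sigma, x0, the time step, and the number of paths are not from the talk), using an Euler-Maruyama discretization of the Ornstein-Uhlenbeck model; it is not the speaker's implementation.

```python
import numpy as np

# Assumed toy parameters (illustrative, not from the talk).
theta, mu, sigma = 1.0, 0.0, 0.5      # mean-reversion speed, long-run mean, noise level
x0, dt, n_steps, n_paths = 1.0, 0.01, 200, 20_000

rng = np.random.default_rng(42)
x = np.full(n_paths, x0)              # all trajectories start at x0
bins = np.linspace(-2.0, 2.0, 81)
histograms = np.empty((n_steps, bins.size - 1))

for k in range(n_steps):
    # One Euler-Maruyama step of dX = theta * (mu - X) dt + sigma dW
    noise = rng.standard_normal(n_paths)
    x = x + theta * (mu - x) * dt + sigma * np.sqrt(dt) * noise
    # Histogram of this time slice, normalized to integrate to 1
    histograms[k], _ = np.histogram(x, bins=bins, density=True)

# Compare with the known closed-form moments of the process
t_end = n_steps * dt
exact_mean = mu + (x0 - mu) * np.exp(-theta * t_end)
exact_var = sigma**2 / (2 * theta) * (1 - np.exp(-2 * theta * t_end))
print("mean:", x.mean(), "exact:", exact_mean)
print("variance:", x.var(), "exact:", exact_var)
```

Each row of `histograms` approximates the distribution at one time slice; the accuracy improves slowly, roughly with the square root of the number of trajectories.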
12. 12
FROM CONTINUOUS TO DISCRETE
Discretization
The autoregressive process, ARIMA(1, 0, 0)
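A short derivation of this step, assuming the toy model is written as dX_t = θ(μ − X_t) dt + σ dW_t (an assumed notation) and discretized with an Euler-Maruyama step of size Δt, where ε_t is standard Gaussian noise:

```latex
\begin{aligned}
X_{t+\Delta t} &= X_t + \theta\,(\mu - X_t)\,\Delta t + \sigma\sqrt{\Delta t}\;\varepsilon_t,
  \qquad \varepsilon_t \sim \mathcal{N}(0, 1) \\
               &= c + \varphi\,X_t + \eta_t,
  \qquad c = \theta\mu\,\Delta t,\quad \varphi = 1 - \theta\,\Delta t,\quad
  \eta_t \sim \mathcal{N}(0, \sigma^2\Delta t)
\end{aligned}
```

The second line has exactly the form of an AR(1) process, that is ARIMA(1, 0, 0), with autoregressive coefficient φ.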
16. 16
THE IDEA
• From the continuous model, we use the Fokker-Planck equation:
we obtain a partial differential equation (PDE) for the distribution.
• We solve this equation numerically.
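A minimal sketch of such a numerical solution, assuming the Ornstein-Uhlenbeck toy model, an explicit finite-difference scheme, and made-up parameters. The talk does not say which scheme was used, so the scheme, grid, and initial density below are illustrative choices only.

```python
import numpy as np

# Assumed toy parameters (illustrative, not from the talk).
theta, mu, sigma = 1.0, 0.0, 0.5
x0, s0 = 1.0, 0.1                      # initial density: narrow Gaussian around x0
x = np.linspace(-3.0, 3.0, 121)        # spatial grid
dx = x[1] - x[0]
dt, n_steps = 0.001, 2000              # integrate up to t = 2

p = np.exp(-0.5 * ((x - x0) / s0) ** 2)
p /= p.sum() * dx                      # normalize so the density integrates to 1

D = 0.5 * sigma**2                     # diffusion coefficient
for _ in range(n_steps):
    # Fokker-Planck equation for this model:
    #   dp/dt = d/dx[theta * (x - mu) * p] + D * d2p/dx2
    flux = theta * (x - mu) * p
    dflux = np.zeros_like(p)
    d2p = np.zeros_like(p)
    dflux[1:-1] = (flux[2:] - flux[:-2]) / (2 * dx)          # central first derivative
    d2p[1:-1] = (p[2:] - 2 * p[1:-1] + p[:-2]) / dx**2       # central second derivative
    p = p + dt * (dflux + D * d2p)     # explicit Euler step; boundary values stay near 0

mean = (x * p).sum() * dx
var = ((x - mean) ** 2 * p).sum() * dx

# Closed-form moments for a Gaussian initial condition
t_end = n_steps * dt
exact_mean = mu + (x0 - mu) * np.exp(-theta * t_end)
exact_var = s0**2 * np.exp(-2 * theta * t_end) + D / theta * (1 - np.exp(-2 * theta * t_end))
print("mean:", mean, "exact:", exact_mean)
print("variance:", var, "exact:", exact_var)
```

No random numbers are involved: the whole distribution is advanced deterministically, one time step at a time. The explicit scheme is only stable for small enough dt relative to dx squared, which the values above respect.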
18. 18
THE MAGIC BRIDGE
Stochastic process, time series            Partial differential equations
Fundamentally random                       Fundamentally deterministic
Mathematical tools: stochastic calculus    Mathematical tools: standard analysis
Bridge between the two: the Fokker-Planck / Kolmogorov forward equation
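The bridge can be stated compactly. For a one-dimensional process with drift a(x, t) and noise amplitude b(x, t), the density p(x, t) obeys the Fokker-Planck (Kolmogorov forward) equation; the second block specializes it to the Ornstein-Uhlenbeck toy model, with θ, μ, σ as an assumed notation for its parameters.

```latex
dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t
\quad\Longrightarrow\quad
\frac{\partial p}{\partial t}
  = -\frac{\partial}{\partial x}\bigl[a(x, t)\,p\bigr]
  + \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\bigl[b(x, t)^2\,p\bigr]
```

```latex
a = \theta\,(\mu - x),\quad b = \sigma
\quad\Longrightarrow\quad
\frac{\partial p}{\partial t}
  = \theta\,\frac{\partial}{\partial x}\bigl[(x - \mu)\,p\bigr]
  + \frac{\sigma^2}{2}\,\frac{\partial^2 p}{\partial x^2}
```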
23. 23
NUMERICAL RESULTS
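                 CPU                      GPU
Monte-Carlo      2.78 s
Fokker-Planck    19.20 ms +/- 0.44 ms     4.4 µs +/- 3.4 µs
• Monte-Carlo (CPU) vs Fokker-Planck (CPU): 145x
• Fokker-Planck (CPU) vs Fokker-Planck (GPU): 4372x ~ 80x (parallel algorithm) * 50x (hardware)
• Monte-Carlo (CPU) vs Fokker-Planck (GPU): 632068x = 6.3E5x
• Monte-Carlo on CPU is the state of the art: implemented in Facebook Prophet and others …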
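The speedup factors are ratios of the timings in the table. Recomputing them from the rounded timings shown on the slide gives values within about 1% of the quoted factors; the small gaps presumably come from that rounding.

```python
mc_cpu = 2.78        # Monte-Carlo on CPU, seconds
fp_cpu = 19.20e-3    # Fokker-Planck on CPU, seconds
fp_gpu = 4.4e-6      # Fokker-Planck on GPU, seconds

algo_speedup = mc_cpu / fp_cpu      # change of algorithm, same hardware
gpu_speedup = fp_cpu / fp_gpu       # same algorithm, CPU to GPU
total_speedup = mc_cpu / fp_gpu     # both changes combined

print(f"Monte-Carlo CPU -> Fokker-Planck CPU: {algo_speedup:.0f}x")
print(f"Fokker-Planck CPU -> GPU:             {gpu_speedup:.0f}x")
print(f"Monte-Carlo CPU -> Fokker-Planck GPU: {total_speedup:.2e}x")
```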
24. 24
PRICE COMPARISON FOR 1M SERIES
• CPU: AWS on-demand, m3.2xlarge, North Virginia, $0.532 per hour
• GPU: AWS on-demand, g2.2xlarge, North Virginia, $0.650 per hour
                 CPU (multithreaded, 8 cores)    GPU (IO not included)
Monte-Carlo      77094 h = 5859 $
Fokker-Planck    76 h = 41 $                     7.31 min, less than 0.1 $!
25. 25
CONCLUSION
• The continuous twin of the AR(1) process is the Ornstein-Uhlenbeck process
• We saw how to derive a PDE for the distribution of the process
• The Fokker-Planck method is always faster than Monte-Carlo
• The prediction task is easily parallelizable on GPU
• We beat the state of the art by five orders of magnitude
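To illustrate why the task parallelizes so easily, here is a sketch of one Fokker-Planck time step applied to a whole batch of series at once, each with its own parameters. It uses NumPy on the CPU with made-up parameters and an assumed explicit finite-difference scheme; it is not the GPU code from the talk, and running it on a GPU would require a GPU array library.

```python
import numpy as np

def fokker_planck_step(p, x, theta, mu, sigma, dt):
    """One explicit finite-difference step for a batch of densities.

    p has shape (n_series, n_grid); theta, mu, sigma have shape (n_series, 1).
    """
    dx = x[1] - x[0]
    flux = theta * (x - mu) * p
    dflux = np.zeros_like(p)
    d2p = np.zeros_like(p)
    dflux[:, 1:-1] = (flux[:, 2:] - flux[:, :-2]) / (2 * dx)
    d2p[:, 1:-1] = (p[:, 2:] - 2 * p[:, 1:-1] + p[:, :-2]) / dx**2
    return p + dt * (dflux + 0.5 * sigma**2 * d2p)

# Hypothetical batch: every series gets its own model parameters.
n_series = 256
rng = np.random.default_rng(0)
theta = rng.uniform(0.5, 1.0, (n_series, 1))
mu = rng.uniform(-0.5, 0.5, (n_series, 1))
sigma = rng.uniform(0.4, 0.6, (n_series, 1))
x0 = rng.uniform(-1.0, 1.0, (n_series, 1))    # last observed value of each series

x = np.linspace(-4.0, 4.0, 161)
dx = x[1] - x[0]
p = np.exp(-0.5 * ((x - x0) / 0.1) ** 2)      # narrow Gaussian around each x0
p /= p.sum(axis=1, keepdims=True) * dx

dt, n_steps = 0.001, 500
for _ in range(n_steps):
    p = fokker_planck_step(p, x, theta, mu, sigma, dt)

mean = (x * p).sum(axis=1) * dx               # one forecast mean per series
t_end = n_steps * dt
exact_mean = (mu + (x0 - mu) * np.exp(-theta * t_end)).ravel()
print("largest error on the mean:", np.abs(mean - exact_mean).max())
```

Every grid point of every series is updated by the same arithmetic with no dependency between series, which is the data-parallel pattern that GPUs handle well.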