Professor Dimitris Kugiumtzis, Aristotle University of Thessaloniki, Greece, presented this workshop on nonlinear analysis of time series as part of the Summer School on Modern Statistical Analysis and Computational Methods hosted by the Social Sciences Computing Hub at the Whitaker Institute, NUI Galway on 17th-19th June 2013.
2013.06.18 Time Series Analysis Workshop: Applications in Physiology, Climate Change and Finance, part 3
1. Nonlinear analysis of time series
ARMA(p,q) model: $x_t = \phi_1 x_{t-1} + \dots + \phi_p x_{t-p} + z_t + \theta_1 z_{t-1} + \dots + \theta_q z_{t-q}$
Linear analysis / linear models
Advantages:
1. Simple
2. Gaussian process, established
theory for stochastic processes
and statistical inference
3. Useful in applications
Shortcomings:
1. Cannot explain irregular patterns
in the time series
- data (distribution) asymmetry
- time irreversibility
- «bursts»
2. Deterministic part:
- stable fixed point system
- unstable system
- periodic system
autocorrelation AR model
description of irregular
patterns
explanation / detection of complex
deterministic patterns
Time series, Part 3
Nonlinear analysis of time series
2.
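3. A general nonlinear model
$X_t = f(X_{t-1}, X_{t-2}, \dots, X_{t-p}, \epsilon_t)$
With additive noise:
$X_t = f(X_{t-1}, X_{t-2}, \dots, X_{t-p}) + \epsilon_t$
$\mathbf{X}_{t-1} = [X_{t-1}, X_{t-2}, \dots, X_{t-p}]' \in \mathbb{R}^p$
$f: \mathbb{R}^p \to \mathbb{R}$
Form of $f$ ?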
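4. Linear AR model
$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + \epsilon_t$
Generalizations / extensions of the AR model
Coefficients $\phi_1, \phi_2, \dots, \phi_p$:
- constant (linear AR)
- random coefficients
  - RCA
  - BL
- piecewise models, with one coefficient set per regime: $\phi_1^{(1)}, \phi_2^{(1)}, \dots, \phi_p^{(1)}$; $\phi_1^{(2)}, \phi_2^{(2)}, \dots, \phi_p^{(2)}$; …; $\phi_1^{(l)}, \phi_2^{(l)}, \dots, \phi_p^{(l)}$
  - SETAR
  - Markovian
Noise variance:
- constant (linear AR, ARMA)
- function of $X_t$
  - ARCH
  - GARCH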
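5. Self-excited threshold autoregressive models (SETAR)
Thresholds $r_0, r_1, \dots, r_{l-1}, r_l$ with $r_0 < r_1 < \dots < r_l$
Partition of $\mathbb{R}$: $\mathbb{R} = R_1 \cup R_2 \cup \dots \cup R_l$, $R_i = (r_{i-1}, r_i]$, $i = 1, \dots, l$
Partition of $\mathbb{R}^p$: selection of a lag $d$, partition of $\mathbb{R}$ for $X_{t-d}$
SETAR: $X_t = \phi_1^{(j)} X_{t-1} + \phi_2^{(j)} X_{t-2} + \dots + \phi_p^{(j)} X_{t-p} + \epsilon_t^{(j)}$ when $X_{t-d} \in R_j$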
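A SETAR series of this kind can be simulated with a few lines of code. The sketch below is only an illustration: the function name `simulate_setar`, the burn-in length, the Gaussian noise and the example coefficients are assumptions made here, not values from the slides.

```python
import numpy as np

def simulate_setar(n, thresholds, phis, d=1, sigma=1.0, burn=200, seed=0):
    """Simulate n values of a SETAR series (hypothetical helper).

    thresholds: interior thresholds r_1 < ... < r_{l-1}; r_0 = -inf, r_l = +inf.
    phis: one row [phi_1, ..., phi_p] of AR coefficients per regime R_1, ..., R_l.
    d: delay lag, the regime at time t is selected by X_{t-d}.
    """
    rng = np.random.default_rng(seed)
    phis = np.asarray(phis, dtype=float)
    p = phis.shape[1]
    start = max(p, d)
    total = n + burn + start
    x = np.zeros(total)
    regime = np.zeros(total, dtype=int)  # 0-based: 0 stands for R_1
    for t in range(start, total):
        # R_j = (r_{j-1}, r_j]: count the thresholds strictly below X_{t-d}
        j = int(np.searchsorted(thresholds, x[t - d], side="left"))
        regime[t] = j
        past = x[t - p:t][::-1]  # X_{t-1}, ..., X_{t-p}
        x[t] = phis[j] @ past + sigma * rng.standard_normal()
    return x[-n:], regime[-n:]

# Two regimes split at 0, AR(1) in each regime (illustrative coefficients)
x, regime = simulate_setar(1000, thresholds=[0.0], phis=[[0.6], [-0.5]], d=1)
```

With a single threshold at 0 and d = 1, the regime at time t is 0 when the previous value is at most 0 and 1 otherwise.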
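7. AR models with probabilistic selection of threshold
Exponential autoregressive models (EAR)
$X_t = \phi_1^{(j)} X_{t-1} + \phi_2^{(j)} X_{t-2} + \epsilon_t$, where $j = 1$ or $j = 2$ is selected at random with given probabilities
AR models with periodic coefficients
$X_t = \phi_1^{(j)} X_{t-1} + \phi_2^{(j)} X_{t-2} + \epsilon_t$, where $j = 1$ when $t = 2k$ and $j = 2$ when $t = 2k+1$
Example: $\phi_2^{(1)} = 0$ and $\phi_1^{(2)} = 0$, so that only $\phi_1^{(1)}$ and $\phi_2^{(2)}$ are nonzero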
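8. Markov chain driven AR models
$J_t = j \in \{1, 2, \dots, l\}$
The selection of the threshold is determined by a Markov chain
Transition matrix: $P(J_t = j \mid J_{t-1} = i)$
Example
$X_t = \phi^{(J_t)} X_{t-1} + \epsilon_t$, $\phi^{(1)} = 0.9$, $\phi^{(2)} = -0.9$
$P(J_t = j \mid J_{t-1} = i) = \begin{pmatrix} 0.2 & 0.8 \\ 0.1 & 0.9 \end{pmatrix}$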
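A Markov chain driven AR(1) process can be sketched as follows. The function name `simulate_markov_ar1` is hypothetical, and the coefficients, the transition matrix (row i holds the probabilities of moving from regime i to each regime j) and the Gaussian noise are illustrative assumptions.

```python
import numpy as np

def simulate_markov_ar1(n, phi, P, sigma=1.0, seed=0):
    """AR(1) whose coefficient phi[J_t] is selected by a Markov chain J_t."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    P = np.asarray(P, dtype=float)
    x = np.zeros(n)
    J = np.zeros(n, dtype=int)  # regime index, 0-based
    for t in range(1, n):
        # draw the new regime from the row of P that belongs to the previous regime
        J[t] = rng.choice(len(phi), p=P[J[t - 1]])
        x[t] = phi[J[t]] * x[t - 1] + sigma * rng.standard_normal()
    return x, J

# Illustrative values (assumptions): two regimes with opposite-sign coefficients
phi = [0.9, -0.9]
P = [[0.2, 0.8],
     [0.1, 0.9]]
x, J = simulate_markov_ar1(5000, phi, P)
```

Counting how often regime 0 is followed by regime 1 in `J` gives an empirical estimate of the corresponding transition probability.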
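9. Piecewise polynomial models
$X_t = f(X_{t-1}, X_{t-2}, \dots, X_{t-p}) + \epsilon_t$
$X_t = p_m(X_{t-1}, X_{t-2}, \dots, X_{t-p}) + \epsilon_t$, with $p_m$ a polynomial of order $p$ and degree $m$
Example: logistic map, $X_t = a X_{t-1}(1 - X_{t-1}) = a X_{t-1} - a X_{t-1}^2$, $a > 1$
Two fixed points: $0$ and $(a-1)/a$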
Fractional autoregressive models
$X_t = \dfrac{a_0 + \sum_{j=1}^{p} a_j X_{t-1}^j}{b_0 + \sum_{j=1}^{q} b_j X_{t-1}^j} + \epsilon_t$
10 qp
$a_p \neq 0$, $b_q \neq 0$
Example
Fraction of two polynomials
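10. Random coefficient autoregressive models (RCA)
AR(1) with multiplicative errors: $X_t = \epsilon_t X_{t-1}$
RCA: $X_t = \sum_{i=1}^{p} (b_i + B_i(t)) X_{t-i} + \epsilon_t$
Coefficients $\phi_i = b_i + B_i(t)$
- $b_i$ constant
- $B_1(t), B_2(t), \dots, B_p(t)$ random with mean 0, independent of $\epsilon_t$
Example: $X_t = (0.1 + B_1(t)) X_{t-1} + \epsilon_t$, $B_1(t) \sim \mathrm{N}(0, 0.9^2)$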
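11. Bilinear models (BL)
BL of order 1: $X_t = a X_{t-1} + b X_{t-1} \epsilon_{t-1} + \epsilon_t$
General form: $X_t = \sum_{i=1}^{p} (a_i + A_i(t)) X_{t-i} + \epsilon_t$, with $A_i(t) = \sum_{k=1}^{s} b_{ik} \epsilon_{t-k}$
Coefficients $\phi_i = a_i + A_i(t)$
“Bilinear” because:
- if $\epsilon_s$ = const, $s \le t$: linear w.r.t. $X_t$, $X_s$, $s < t$
- if $X_s$ = const, $s < t$: linear w.r.t. $X_t$, $\epsilon_s$, $s \le t$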
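12. AR models with conditional heteroscedasticity
$X_t$ ~ ARCH, $X_t^2$ ~ BL
Model of multiplicative noise, $\epsilon_t \sim \mathrm{N}(0, \sigma^2)$
ARCH: $X_t = \epsilon_t \sqrt{V_t}$, $V_t = \gamma + \phi_1 X_{t-1}^2 + \dots + \phi_p X_{t-p}^2$, $\gamma > 0$, $\phi_i \ge 0$
GARCH: $X_t = \epsilon_t \sqrt{V_t}$, $V_t = \gamma + \sum_{i=1}^{p} \phi_i X_{t-i}^2 + \sum_{i=1}^{q} \psi_i V_{t-i}$, $\gamma > 0$, $\phi_i \ge 0$, $\psi_i \ge 0$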
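The multiplicative-noise structure of these models can be illustrated with a short simulation of a GARCH process with one lag in each sum. The names `gamma`, `phi` and `psi` are chosen here for the constant, the weight of the squared past value and the weight of the past conditional variance; unit-variance Gaussian noise and the parameter values are assumptions. Setting `psi = 0` gives an ARCH model with one lag.

```python
import numpy as np

def simulate_garch11(n, gamma, phi, psi, seed=0):
    """Simulate X_t = eps_t * sqrt(V_t), V_t = gamma + phi * X_{t-1}^2 + psi * V_{t-1}."""
    if not (gamma > 0 and phi >= 0 and psi >= 0 and phi + psi < 1):
        raise ValueError("need gamma > 0, phi >= 0, psi >= 0 and phi + psi < 1")
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)          # unit-variance Gaussian noise (assumption)
    x = np.zeros(n)
    v = np.zeros(n)
    v[0] = gamma / (1.0 - phi - psi)      # start at the unconditional variance
    x[0] = eps[0] * np.sqrt(v[0])
    for t in range(1, n):
        v[t] = gamma + phi * x[t - 1] ** 2 + psi * v[t - 1]
        x[t] = eps[t] * np.sqrt(v[t])
    return x, v

x, v = simulate_garch11(20000, gamma=0.3, phi=0.2, psi=0.5)
```

The series `x` itself is uncorrelated, but its squares are positively autocorrelated, which is the volatility clustering these models are meant to capture.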
13. Analysis with nonlinear models
1. Model selection
$M$ candidate models, $m = 1, \dots, M$
$\mathrm{AIC}(m) = -2 \ln g_m(\mathbf{x} \mid \hat{\boldsymbol{\theta}}(\mathbf{x})) + 2r$, with $r$ the number of parameters of model $m$
2. Parameter estimation
- maximum likelihood method
- method of ordinary least squares
3. Diagnostic checking
errors (residuals):
- uncorrelated
- following normal distribution
14. Real world time series
mechanics
physiology
geophysics
economics
15. Nonlinear time series analysis and dynamical systems
Time series $x_1, x_2, \dots, x_n$
Assumption: nonlinear dynamical system $\mathbf{s}_t = \mathbf{f}^t(\mathbf{s}_0)$
$\mathbf{s}_t \in \mathbb{R}^d$: trajectory of the dynamical system
$\mathbf{s}_0$: state vector at time 0
$\mathbf{f}^t: \mathbb{R}^d \to \mathbb{R}^d$: system function
$t$: continuous or discrete time
For time series we assume underlying systems to be dissipative
Trajectory in $\mathbb{R}^d$: attractor in $\mathbb{R}^d$
Observation: $x_t = h(\mathbf{s}_t)$
$h: \mathbb{R}^d \to \mathbb{R}$: observation function
16. Attractor:
● stable fixed (equilibrium) point
● finite set of equilibrium points
● limit cycle
● torus
● strange attractor
self similarity - fractals
chaos - sensitivity to initial conditions
can be derived by
a linear system
cannot be derived by
a linear system
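17. Nonlinear dynamical systems, maps (discrete time)
Hénon map (chaotic): $s_i = 1 - 1.4 s_{i-1}^2 + 0.3 s_{i-2}$
Ikeda map (chaotic): $s_{k+1} = 1 + 0.9\, s_k \exp\left(0.4i - \dfrac{6i}{1 + |s_k|^2}\right)$ (here $i$ is the imaginary unit and $s_k$ is complex)
Logistic map: $s_i = a s_{i-1}(1 - s_{i-1})$
- periodic: $a = 3.52$
- chaotic: $a = 4$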
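The logistic and Hénon maps are easy to iterate numerically. The sketch below uses hypothetical helper names, and the initial conditions and the number of discarded transient steps are assumptions.

```python
import numpy as np

def logistic_series(a, n, s0=0.3, burn=1000):
    """Iterate s_i = a * s_{i-1} * (1 - s_{i-1}) and drop the first `burn` values."""
    s = s0
    out = np.empty(n)
    for i in range(n + burn):
        s = a * s * (1.0 - s)
        if i >= burn:
            out[i - burn] = s
    return out

def henon_series(n, s_init=(0.0, 0.0), burn=500):
    """Iterate s_i = 1 - 1.4 * s_{i-1}**2 + 0.3 * s_{i-2} and drop the transient."""
    s2, s1 = s_init  # s_{i-2}, s_{i-1}
    out = np.empty(n)
    for i in range(n + burn):
        s = 1.0 - 1.4 * s1 ** 2 + 0.3 * s2
        s2, s1 = s1, s
        if i >= burn:
            out[i - burn] = s
    return out

periodic = logistic_series(3.52, 200)  # settles on a cycle of period 4
chaotic = logistic_series(4.0, 200)    # irregular values in [0, 1]
henon = henon_series(1000)
```

For a = 3.52 the values repeat every four steps after the transient, while for a = 4 no repetition appears.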
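19. Noise in the time series
Without noise
Dynamical system: $\mathbf{s}_t = \mathbf{f}^t(\mathbf{s}_0)$
Observation: $x_t = h(\mathbf{s}_t)$
With noise
Dynamical system: $\mathbf{s}_t = \mathbf{f}^t(\mathbf{s}_0) + \boldsymbol{\epsilon}_t$, dynamic (system) noise
Observation: $x_t = h(\mathbf{s}_t) + w_t$, observational noise
$w_t$: white noise, uncorrelated to $x_t$ and $\mathbf{s}_t$
$\boldsymbol{\epsilon}_t$: white noise, uncorrelated to $\mathbf{s}_u$, $u < t$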
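20. Noise: dynamic (system) ε, observational (measurement) w
logistic map (periodic and chaotic cases)
Observational noise:
$s_i = a s_{i-1}(1 - s_{i-1})$
$x_i = s_i + w_i$, $w_i \sim \mathrm{N}(0, \sigma^2)$
Dynamic noise:
$s_i = a s_{i-1}(1 - s_{i-1}) + \epsilon_i$, $\epsilon_i \sim \mathrm{N}(0, \sigma^2)$
$x_i = s_i$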
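The difference between the two noise types can be made concrete on the logistic map. In the sketch below the function name `noisy_logistic`, the noise level and the clipping of the state to [0, 1] under dynamic noise are assumptions added for illustration.

```python
import numpy as np

def noisy_logistic(a, n, sigma, kind, s0=0.3, seed=0):
    """Return (state s, observation x) for the logistic map with one noise type."""
    rng = np.random.default_rng(seed)
    s = np.empty(n)
    s[0] = s0
    for i in range(1, n):
        s[i] = a * s[i - 1] * (1.0 - s[i - 1])
        if kind == "dynamic":
            # epsilon_i perturbs the state itself and is carried forward by the map;
            # clipping to [0, 1] is an assumption so that the orbit cannot escape
            s[i] = min(max(s[i] + sigma * rng.standard_normal(), 0.0), 1.0)
    if kind == "observational":
        x = s + sigma * rng.standard_normal(n)  # w_i only affects the measurement
    else:
        x = s.copy()
    return s, x

a, sigma = 3.52, 0.01
s_obs, x_obs = noisy_logistic(a, 2000, sigma, "observational")
s_dyn, x_dyn = noisy_logistic(a, 2000, sigma, "dynamic")
```

With observational noise the hidden state still follows the map exactly. With dynamic noise the observation equals the state, but the state no longer satisfies the deterministic map.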
23. Topics in the analysis of time series and dynamical systems
- State space reconstruction
  in order to observe the complexity / stochasticity / structure of the system
- Estimation of characteristics of the system / attractor
  measuring the complexity / dimension of the system
- Modeling / Prediction
  use nonlinear models to improve predictions
- Other topics:
  - Hypothesis testing for linearity / nonlinearity
  - Control system evolution
  - Synchronization
  - …
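24. State space reconstruction
We assume that the studied system is deterministic
Initial state space $M$: $\mathbf{s}_i \to \mathbf{s}_{i+1} = \mathbf{f}(\mathbf{s}_i)$
Observed quantity in $\mathbb{R}$: $x_i = h(\mathbf{s}_i)$
Method of delays: $\mathbf{x}_i = [x_i, x_{i-\tau}, \dots, x_{i-(m-1)\tau}]$
Parameters
- embedding dimension $m$
- delay time $\tau$
- time window length $\tau_w = (m-1)\tau$
Reconstructed state space $\mathbb{R}^m$: $\mathbf{x}_i \to \mathbf{x}_{i+1} = \mathbf{F}(\mathbf{x}_i)$, $\mathbf{F}$ ?
Embedding $\Phi$: $\mathbf{x}_i = \Phi(\mathbf{s}_i)$
Condition: $m \ge 2D + 1$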
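The method of delays amounts to stacking lagged copies of the scalar series. A minimal sketch follows; the function name `delay_embed` is an assumption, and each row holds one reconstructed point with its most recent value first.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Rows are the delay vectors [x_i, x_{i-tau}, ..., x_{i-(m-1)tau}]."""
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (m - 1) * tau
    if n_vec <= 0:
        raise ValueError("time series too short for this m and tau")
    # row k is the point at time i = k + (m-1)*tau; column j holds x_{i - j*tau}
    cols = [x[(m - 1 - j) * tau:(m - 1 - j) * tau + n_vec] for j in range(m)]
    return np.column_stack(cols)

emb = delay_embed(np.arange(10), m=3, tau=2)
# first row: [4, 2, 0], last row: [9, 7, 5]
```

A series of length n yields n - (m-1)τ reconstructed points, so the time window (m-1)τ is the number of samples lost at the start.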
27. Estimation of τ
• From the autocorrelation $r(\tau)$ (measures linear correlation)
  τ such that $r(\tau) = 1/e$ or $r(\tau) = 0$
• From the mutual information $I(\tau)$ (measures linear and nonlinear correlation)
  $I(X, Y) = \sum_{x, y} p_{XY}(x, y) \log \dfrac{p_{XY}(x, y)}{p_X(x)\, p_Y(y)}$
  $X = x_i$, $Y = x_{i-\tau}$: $I(X, Y) = I(\tau)$
  τ at the first local minimum of $I(\tau)$
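Both criteria for the delay can be sketched in a few lines. The function names, the equal-width histogram estimate of the probabilities and the number of bins are assumptions made here; other estimators of the mutual information exist and can give different values.

```python
import numpy as np

def autocorr(x, max_lag):
    """Sample autocorrelation r(0), ..., r(max_lag)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    var = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / var for k in range(max_lag + 1)])

def tau_from_autocorr(x, max_lag, level=1 / np.e):
    """Smallest lag at which r(tau) has dropped to `level` (use level=0 for the zero crossing)."""
    below = np.nonzero(autocorr(x, max_lag) <= level)[0]
    return int(below[0]) if below.size else None

def mutual_information(x, tau, bins=16):
    """Histogram estimate of I(tau) between x_i and x_{i-tau}, for tau >= 1."""
    x = np.asarray(x, dtype=float)
    pxy, _, _ = np.histogram2d(x[tau:], x[:-tau], bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def tau_from_mutual_information(x, max_lag, bins=16):
    """Lag of the first local minimum of I(tau), or None if there is none."""
    mi = [mutual_information(x, tau, bins) for tau in range(1, max_lag + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k] < mi[k - 1] and mi[k] <= mi[k + 1]:
            return k + 1  # mi[k] belongs to lag k + 1
    return None
```

For a sine wave with a period of 40 samples, `tau_from_autocorr` with the 1/e level picks the first lag at which the cosine-shaped autocorrelation falls below about 0.37.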
28. Estimation of m
Optimal m ?
• Too small m: self-intersection in the attractor
• Too large m: “curse of dimensionality”
• Takens theorem: $m \ge 2D + 1$ … but D is unknown
Method of false nearest neighbors (FNN)
• Close points on the attractor are:
- either real neighboring points due to system dynamics
- or false neighboring points due to self-intersections when m is too low
• At a larger m where there are no self-intersections all false neighboring points will be resolved as they will no longer be close
• The optimal m’ is the one for which there are no longer any false nearest neighbors as the dimension increases by one from m’ to m’+1.
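The false nearest neighbors idea can be sketched as follows. A neighbor is counted as false when the coordinate added by going from m to m+1 separates the pair by more than `rtol` times their distance in dimension m. This ratio test, the threshold value 10 and the function name `fnn_fraction` are assumptions for illustration, and published variants add further criteria.

```python
import numpy as np

def fnn_fraction(x, m, tau, rtol=10.0):
    """Fraction of nearest neighbors in dimension m that are false in dimension m + 1."""
    x = np.asarray(x, dtype=float)
    idx = np.arange(m * tau, len(x))  # times i for which x_{i - m*tau} exists
    emb = np.column_stack([x[idx - j * tau] for j in range(m)])
    extra = x[idx - m * tau]          # coordinate added in dimension m + 1
    n_false = 0
    for k in range(len(idx)):
        dist = np.linalg.norm(emb - emb[k], axis=1)
        dist[k] = np.inf              # exclude the point itself
        nn = int(np.argmin(dist))
        if abs(extra[k] - extra[nn]) > rtol * dist[nn]:
            n_false += 1
    return n_false / len(idx)

# Demonstration on a Hénon series (noise-free, transient removed)
s = [0.0, 0.0]
for _ in range(1500):
    s.append(1.0 - 1.4 * s[-1] ** 2 + 0.3 * s[-2])
x = np.array(s[500:])
fractions = {m: fnn_fraction(x, m, tau=1) for m in (1, 2, 3)}
```

For this map, a one-dimensional reconstruction folds the attractor onto a line, so the fraction at m = 1 is large and it drops once m = 2.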
29. An example of estimating m by the method FNN
The estimation of m with the method FNN depends on:
- the delay τ
- noise
[Figure: %FNN vs m for τ = 2, 5, 10, 20, x-Lorenz without noise]
[Figure: %FNN vs m for τ = 2, 5, 10, 20, x-Lorenz + 10% noise]
31. Correlation dimension ν
The correlation dimension ν characterizes the fractal structure of the attractor (self-similarity at different scales) using the density of the points of the attractor in the reconstructed state space
The basic idea is that the probability of two points being closer than a distance $r$, $\|\mathbf{x}_i - \mathbf{x}_j\| < r$, changes w.r.t. $r$ as a power of $r$
Scaling law: the number of points lying in a sphere with radius $r$ and center $\mathbf{x}_i$ scales as $r^{\nu}$; holds for $r \to 0$, $N \to \infty$
- ν integer: the attractor is a regular geometric object
- ν non-integer: the attractor is a fractal
32. Estimation of the correlation dimension ν
Time series $x_i$, $i = 1, \dots, N + (m-1)\tau$; reconstruction $\mathbf{x}_i$, $i = 1, \dots, N$
Correlation sum: $C(r) = \dfrac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta(r - \|\mathbf{x}_i - \mathbf{x}_j\|)$
Heaviside function: $\Theta(x) = 0$ when $x \le 0$, $\Theta(x) = 1$ when $x > 0$
Scaling law for small $r$: $C(r) \sim r^{\nu}$
Estimation of ν: $\nu = \dfrac{d \log C(r)}{d \log r}$ for a range of $r$
Convergence of $\nu(m)$ for $m$ sufficiently large
If ν small and non-integer and the system is deterministic: small dimension and fractal (chaotic) structure
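The correlation sum and the slope of log C(r) against log r can be computed directly from a set of reconstructed points. In the sketch below the function names are assumptions, and the slope comes from a single straight-line fit over the supplied range of r, a simplification of inspecting the local slopes.

```python
import numpy as np

def correlation_sum(points, r_values):
    """C(r): fraction of point pairs (i < j) closer than r, for each r in r_values."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    pair_dist = dist[np.triu_indices(n, k=1)]  # each pair counted once
    return np.array([2.0 * np.count_nonzero(pair_dist < r) / (n * (n - 1))
                     for r in np.atleast_1d(r_values)])

def correlation_dimension(points, r_values):
    """Slope of log C(r) versus log r fitted over the whole range r_values."""
    c = correlation_sum(points, r_values)
    slope, _ = np.polyfit(np.log(r_values), np.log(c), 1)
    return float(slope)

# Sanity check on sets of known dimension (random points, not an attractor)
rng = np.random.default_rng(0)
r_values = np.logspace(np.log10(0.02), -1, 8)
u = rng.random(1000)
nu_line = correlation_dimension(np.column_stack([u, u]), r_values)  # close to 1
nu_square = correlation_dimension(rng.random((1000, 2)), r_values)  # close to 2
```

Points spread along a line give a slope near 1 and points filling a square give a slope near 2. Both fall slightly below the integer because of edge effects at the larger r.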
33. [Figures: log C(r) vs log r, local slope vs log r, and ν vs m, for the following cases]
- x-Lorenz without noise, τ=2
- x-Lorenz + 10% observational noise, τ=2
- x-Lorenz + 10% observational noise, τ=10
34. The estimation of ν is affected by the following factors:
- correlation time: $|i - j| > w$
- selection of τ and m
- noise
- time series length
36. Lyapunov exponents
The Lyapunov exponents measure the average rate of divergence and convergence of the trajectories on the attractor along the directions of the local state space
Lyapunov spectrum: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m$
- $\lambda_i > 0$: divergence
- $\lambda_i < 0$: convergence
- $\lambda_i = 0$: direction of flow
If $\lambda_1 > 0$ and the system is deterministic: chaos
Dissipative system: $\sum_{i=1}^{m} \lambda_i < 0$
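37. Largest Lyapunov exponent $\lambda_1$
Initial distance $\delta_0 = \|\mathbf{x}_i - \mathbf{x}_{i'}\|$ of two nearby trajectories is expected to increase exponentially with time
After time $t$: $\delta_t = \|\mathbf{x}_{i+t} - \mathbf{x}_{i'+t}\|$
If $\delta_t \approx \delta_0 e^{\lambda_1 t}$, $\lambda_1$ is the largest Lyapunov exponent
Computation: $\lambda_1 = \dfrac{1}{tN} \sum_{j=1}^{N} \ln \dfrac{\delta_{t,j}}{\delta_{0,j}}$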
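The averaging over pairs of nearby trajectories can be illustrated on the logistic map with a = 4, for which the largest Lyapunov exponent is known to be ln 2. The sketch assumes the map is known, so the nearby trajectory is created by a tiny perturbation instead of being searched among neighbors in a reconstructed state space. The function names, the horizon `t` and the initial distance `d0` are assumptions.

```python
import numpy as np

def logistic(s, a=4.0):
    return a * s * (1.0 - s)

def largest_lyapunov(f, starts, t=10, d0=1e-9):
    """Average of ln(d_t / d_0) / t over pairs of trajectories started d0 apart."""
    logs = []
    for s in starts:
        # perturb towards the interior so that both orbits stay inside [0, 1]
        u, v = s, (s + d0 if s < 0.5 else s - d0)
        for _ in range(t):
            u, v = f(u), f(v)
        dt = abs(u - v)
        if dt > 0:  # skip the rare pair whose distance collapsed to zero
            logs.append(np.log(dt / d0))
    return float(np.mean(logs) / t)

# Starting points taken along one orbit, after discarding a transient
orbit = [0.3]
for _ in range(2499):
    orbit.append(logistic(orbit[-1]))
lam = largest_lyapunov(logistic, orbit[500:])
```

The horizon has to stay short enough for the distance to remain small. Once the two orbits are as far apart as the attractor allows, the exponential growth saturates and the estimate is biased downwards.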
39. Prediction models
The true system generating the time series: $\mathbf{s}_{i+1} = \mathbf{f}(\mathbf{s}_i)$
Hénon map:
$s_{1,i+1} = 1 - 1.4 s_{1,i}^2 + s_{2,i}$
$s_{2,i+1} = 0.3 s_{1,i}$
that is, $s_{1,i+1} = f_1(s_{1,i}, s_{2,i})$ and $s_{2,i+1} = f_2(s_{1,i}, s_{2,i})$
40. Prediction models
The true system generating the time series: $\mathbf{s}_{i+1} = \mathbf{f}(\mathbf{s}_i)$, unknown
The problem of modeling and prediction of time series:
given $x_1, x_2, \dots, x_i$, estimate / predict $x_{i+1}$
State space reconstruction with the method of delays: $\mathbf{x}_i = [x_i, x_{i-\tau}, \dots, x_{i-(m-1)\tau}]$
The reconstructed system from the time series: $\mathbf{x}_{i+1} = \mathbf{F}(\mathbf{x}_i)$, $\mathbf{F}: \mathbb{R}^m \to \mathbb{R}^m$, estimation?
The function that is relevant to time series prediction: $x_{i+1} = F(\mathbf{x}_i)$, $F: \mathbb{R}^m \to \mathbb{R}$
Example for $m = 2$, $\tau = 1$: $x_{i+1} = F(x_i, x_{i-1})$
41. Nonlinear prediction models
• Global models, e.g. polynomials
  function F has the same analytic expression for the whole domain
• Local models, e.g. the local linear model
  function F is defined differently at each point of the reconstructed state space
• Semi-local models, e.g. neural networks
  the form of function F is derived as a weighted sum of local basis functions
42. Prediction using similar segments of the time series
Prediction at time i+T from the mappings T steps ahead of “similar” segments from the past of the time series
43. Local prediction models
Implementation of the idea of “similar” segments: time series segments correspond to reconstructed points
The nearest neighboring points to $\mathbf{x}_i$: $\{\mathbf{x}_{i(1)}, \mathbf{x}_{i(2)}, \dots, \mathbf{x}_{i(K)}\}$
Prediction of $x_{i+T}$ from the mappings of the neighbors: $\{x_{i(1)+T}, x_{i(2)+T}, \dots, x_{i(K)+T}\}$
Zeroth order prediction: $\hat{x}_{i+T} \equiv x_i(T) = x_{i(1)+T}$
Average prediction: $x_i(T) = \dfrac{1}{K} \sum_{j=1}^{K} x_{i(j)+T}$
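The zeroth order and average predictions can be sketched with a brute-force neighbor search. The function name `local_average_predict` is an assumption. The prediction is made for the last point of the supplied series, using Euclidean distances in the delay reconstruction.

```python
import numpy as np

def local_average_predict(x, m, tau, K, T=1):
    """Predict the value T steps after the end of x from its K nearest neighbors.

    Returns (average prediction, zeroth order prediction).
    """
    x = np.asarray(x, dtype=float)
    idx = np.arange((m - 1) * tau, len(x))  # time index i of each point
    emb = np.column_stack([x[idx - j * tau] for j in range(m)])
    target = emb[-1]                         # the current point x_i
    cand = emb[:len(emb) - T]                # points whose T-step image is known
    dist = np.linalg.norm(cand - target, axis=1)
    nn = np.argsort(dist)[:K]                # neighbors i(1), ..., i(K), nearest first
    images = x[idx[nn] + T]                  # x_{i(j)+T}
    return float(images.mean()), float(images[0])

# Example on a noise-free logistic series (a = 4): predict the held-out last value
s = [0.3]
for _ in range(2000):
    s.append(4.0 * s[-1] * (1.0 - s[-1]))
series = np.array(s)
avg_pred, zeroth_pred = local_average_predict(series[:-1], m=2, tau=1, K=3, T=1)
```

Both predictions can then be compared with `series[-1]`, the value that was held out.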
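44. Local linear prediction
We assume that the local linear model is valid in the neighborhood of $\mathbf{x}_i$:
$x_{i+1} = F(\mathbf{x}_i) = F(x_i, x_{i-\tau}, \dots, x_{i-(m-1)\tau}) = a_0 + a_1 x_i + a_2 x_{i-\tau} + \dots + a_m x_{i-(m-1)\tau} = a_0 + \mathbf{a}' \mathbf{x}_i$
The model holds for the neighbors $\mathbf{x}_{i(1)}, \mathbf{x}_{i(2)}, \dots, \mathbf{x}_{i(K)}$ (written for a prediction horizon $T$; $T = 1$ in the model above):
$x_{i(1)+T} = a_0 + \mathbf{a}' \mathbf{x}_{i(1)}$
$x_{i(2)+T} = a_0 + \mathbf{a}' \mathbf{x}_{i(2)}$
…
$x_{i(K)+T} = a_0 + \mathbf{a}' \mathbf{x}_{i(K)}$
Estimation of parameters $a_0, a_1, \dots, a_m$ (method of ordinary least squares):
$\min_{a_0, a_1, \dots, a_m} \sum_{j=1}^{K} \left( x_{i(j)+1} - a_0 - a_1 x_{i(j)} - \dots - a_m x_{i(j)-(m-1)\tau} \right)^2$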
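The least squares fit on the neighbors can be sketched with `numpy.linalg.lstsq`. The function name `local_linear_predict` is an assumption, and K should be larger than m + 1 so that the fit is not underdetermined.

```python
import numpy as np

def local_linear_predict(x, m, tau, K, T=1):
    """Fit a_0 + a' x on the K nearest neighbors of the last point and predict T steps ahead."""
    x = np.asarray(x, dtype=float)
    idx = np.arange((m - 1) * tau, len(x))  # time index i of each point
    emb = np.column_stack([x[idx - j * tau] for j in range(m)])
    target = emb[-1]
    cand = emb[:len(emb) - T]                # points whose T-step image is known
    dist = np.linalg.norm(cand - target, axis=1)
    nn = np.argsort(dist)[:K]
    A = np.column_stack([np.ones(len(nn)), cand[nn]])  # rows [1, x_{i(j)}, x_{i(j)-tau}, ...]
    y = x[idx[nn] + T]                                  # x_{i(j)+T}
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)        # a_0, a_1, ..., a_m
    return float(coef[0] + coef[1:] @ target)

# Check on a sinusoid, which obeys an exact linear rule in (x_i, x_{i-1})
t = np.arange(501)
wave = np.sin(0.3 * t)
pred = local_linear_predict(wave[:-1], m=2, tau=1, K=6, T=1)
```

On a sinusoid the model is exactly linear, so `pred` matches `wave[-1]` up to rounding. On a chaotic series the fit is only a local approximation, and its error depends on K and on how close the neighbors are.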
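45. Estimation of prediction error
We split the time series in two parts:
learning set $x_1, x_2, \dots, x_{N_1}$ and test set $x_{N_1+1}, \dots, x_N$
Predictions on the test set: $\hat{x}_{N_1+1}, \dots, \hat{x}_N$
Prediction error: $e_{i+T} = x_{i+T} - \hat{x}_{i+T}$, where $\hat{x}_{i+T} \equiv x_i(T)$
Statistic for prediction error:
$\mathrm{NRMSE}(T) = \sqrt{\dfrac{\frac{1}{N - T - N_1 + 1} \sum_{t=N_1}^{N-T} (x_{t+T} - \hat{x}_{t+T})^2}{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2}}$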
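The normalized root mean square error can be computed as below. The function name `nrmse` and the optional `reference` argument, which selects the series whose variance is used for normalization (the test targets by default), are assumptions.

```python
import numpy as np

def nrmse(targets, predictions, reference=None):
    """Root mean square prediction error divided by the standard deviation of `reference`."""
    targets = np.asarray(targets, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    ref = targets if reference is None else np.asarray(reference, dtype=float)
    mse = np.mean((targets - predictions) ** 2)
    return float(np.sqrt(mse / np.mean((ref - ref.mean()) ** 2)))

y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
perfect = nrmse(y, y)                        # 0: every value predicted exactly
mean_only = nrmse(y, np.full(5, y.mean()))   # 1: no better than predicting the mean
```

Values near 0 indicate accurate predictions, and values near 1 mean the model does no better than predicting the mean.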
46. Example: x-Lorenz
Prediction with:
• local average prediction model (LAP)
• local linear prediction model (LLP)
$\tau = 1$, $m = 5$, $K = 11$
Cases: without noise, with 10% noise
47. Annual-quarter growth rate of GNP of USA in the period 1947 – 1991
Prediction with
- linear model, AR
- local average model, LAM
- local linear model, LLM
[Figure: prediction error nrmse(m) vs m for the last 30 quarters; curves: AR, LAM(K=15), LLM(K=15)]
[Figure: predictions starting at the first quarter of 1989 with prediction horizon being the last 6 years; curves: real, AR(3), LAM(m=5,K=15), LLM(m=5,K=15)]
48. ASE index in the period 1/1/2002 – 20/9/2005
Prediction starting at 20/9/2005, with prediction horizon up to 16 days ahead
Predict index with
- linear model, AR
- local average model, LAM
Returns: $y_t = \dfrac{x_t - x_{t-1}}{x_{t-1}}$
[Figure: returns of index vs day; curves: general index returns, $y_n(T)$ AR(7), $y_n(T)$ LAM(m=7,K=20)]
[Figure: close index vs day; curves: general index, $x_n(T)$ AR(7), $x_n(T)$ LAM(m=7,K=20)]
49. ASE index in the period 1/1/2002 – 20/9/2005
One step ahead prediction in the period 21/9/2005 – 12/10/2005
Predict index with
- linear model, AR
- local average model, LAM
Returns: $y_t = \dfrac{x_t - x_{t-1}}{x_{t-1}}$
[Figure: index return vs day; curves: general index, $y_n(1)$ AR(7), $y_n(1)$ LAM(m=7,K=20)]
[Figure: close index vs day; curves: general index, $x_n(1)$ AR(7), $x_n(1)$ LAM(m=7,K=20)]