1. Piecewise Gaussian Process Modelling for
Change-Point Detection
Application to Atmospheric Dispersion Problems
Adrien Ickowicz
CMIS
CSIRO
February 2013
2. Background
Scientific collaboration with University College London, UNSW
and Université Lille 1.
Atmospheric specialists;
Informatics engineer;
Statisticians.
Input
Concentration values of CBRN material at the sensor locations;
Wind field.
Output
Source location, time of release and strength for fire-fighters;
Quarantine Map for Politicians and MoD.
3. Statistical Modelling
Observation modelling:
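$Y^{\mathrm{obs}\,(i)}_{t_j} = D^{(i)}_{t_j}(\theta) + \zeta^{i}_{t_j}$
$D^{(i)}_{t_j}(\theta) = \int_{\Omega \times T} C_\theta(x, t)\, h(x, t \mid x_i, t_j)\, dx\, dt, \qquad \zeta^{i}_{t_j} \sim \mathcal{N}(0, \sigma^2)$
where $C_\theta$ is the solution of the PDE:
$\frac{\partial C}{\partial t} + u \cdot \nabla C - \nabla \cdot (K \nabla C) = Q(\theta)$
s.t. $\nabla_n C = 0$ at $\partial\Omega$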
Parameter of interest: θ ∈ (Ω × T )
4. Existing Techniques
Source term estimation
The Optimization techniques.
Gradient-based methods
(Elbern et al [2000], Li and Niu [2005], Lushi and Stockie [2010])
Pattern search methods
(Zheng et al [2008])
Genetic Algorithms
(Haupt [2005], Allen et al [2009])
The Bayesian techniques.
Forward modelling and MCMC
(Patwardhan and Small [1992])
Backward (Adjoint) modelling and MCMC
(Issartel et al [2002], Hourdin et al [2006], Yee [2010])
5. Contribution : Gaussian Process modelling
Overview
We consider several observations of a stochastic process in space
and time.
Idea: Bayesian non-parametric estimation.
Tool: Gaussian Process (Rasmussen [2006])
Joint distribution: $y \sim \mathcal{GP}(m(x), \kappa(x, x'))$
$m \in L^2(\Omega \times T, \mathbb{R})$ is the prior mean function,
and $\kappa \in L^2(\Omega^2 \times T^2, \mathbb{R})$ is the prior covariance function¹
Posterior distribution: $\mathcal{L}(y_* \mid x_*, x, y) = \mathcal{N}\big(\kappa(x_*, x)\kappa(x, x)^{-1} y,\; \kappa(x_*, x_*) - \kappa(x_*, x)\kappa(x, x)^{-1}\kappa(x, x_*)\big)$
¹ The associated matrix K should be positive semidefinite.
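The posterior formula above can be illustrated with a minimal NumPy sketch. It assumes a zero prior mean, 1-D inputs and an isotropic squared-exponential kernel; the function names, the toy data and the small `jitter` term (added only for numerical stability) are illustrative assumptions, not part of the slides.

```python
import numpy as np

def sq_exp_kernel(xa, xb, alpha1=1.0, alpha2=1.0):
    """Isotropic kernel alpha1 * exp(-(x - x')^2 / (2 * alpha2)) on 1-D inputs."""
    diff = xa[:, None] - xb[None, :]
    return alpha1 * np.exp(-0.5 * diff**2 / alpha2)

def gp_posterior(x, y, x_star, kernel, jitter=1e-6):
    """Zero-mean GP posterior mean and covariance at the points x_star."""
    K = kernel(x, x) + jitter * np.eye(len(x))    # jitter: numerical safeguard only
    K_s = kernel(x_star, x)                       # kappa(x_*, x)
    K_ss = kernel(x_star, x_star)                 # kappa(x_*, x_*)
    mean = K_s @ np.linalg.solve(K, y)            # kappa(x_*, x) kappa(x, x)^-1 y
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)  # posterior covariance
    return mean, cov

# Toy data (hypothetical): four noise-free observations of sin(x)
x = np.array([0.0, 1.0, 2.5, 4.0])
y = np.sin(x)
x_star = np.array([0.0, 1.0, 3.0])
mean, cov = gp_posterior(x, y, x_star, sq_exp_kernel)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a common numerical choice; it evaluates the same expression.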
6. Contribution : Gaussian Process modelling
On the Kernel Specification
Complex non-parametric modelling requires a careful choice of the kernel
shape and of the kernel hyper-parameters.
Basic Kernel: Isotropic, $\kappa(x, x') = \alpha_1 \exp\left(-\frac{1}{2\alpha_2}(x - x')^2\right)$
Hyper-parameters: α1 , α2
Figure: Predictions of 3 Gaussian Process models (with their corresponding 95% CIs) given 7
noisy observations. On the left, α2 = 0.1. In the middle, α2 = 2. On the right,
α2 = 1000.
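To make the role of α2 concrete, the following sketch evaluates the isotropic kernel at unit distance for the three values used in the figure. The function name `iso_kernel` and the choice α1 = 1 are illustrative assumptions.

```python
import numpy as np

def iso_kernel(x, x_prime, alpha1=1.0, alpha2=1.0):
    """kappa(x, x') = alpha1 * exp(-(x - x')^2 / (2 * alpha2))."""
    return alpha1 * np.exp(-(x - x_prime) ** 2 / (2.0 * alpha2))

# Prior correlation between two points one unit apart, for each alpha2 of the figure
for alpha2 in (0.1, 2.0, 1000.0):
    corr = iso_kernel(0.0, 1.0, alpha2=alpha2)
    print(f"alpha2 = {alpha2:g}: kappa(0, 1) = {corr:.4f}")
```

A small α2 makes neighbouring points almost uncorrelated (a very wiggly fit), while a very large α2 makes them almost perfectly correlated (a nearly flat fit).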
7. Contribution : Gaussian Process modelling
Likelihood and Multiple Kernels
The hyper-parameters are estimated by maximizing the marginal
likelihood,
$\log p(y \mid x) = -\frac{1}{2} y^T (K + \sigma^2 I_n)^{-1} y - \frac{1}{2} \log |K + \sigma^2 I_n| - \frac{n}{2} \log 2\pi$
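A possible way to evaluate this log marginal likelihood numerically is through a Cholesky factorization, sketched below. The function name and the toy inputs are assumptions made for illustration.

```python
import numpy as np

def log_marginal_likelihood(K, y, sigma2):
    """log p(y|x) for a zero-mean GP with Gram matrix K and noise variance sigma2."""
    n = len(y)
    L = np.linalg.cholesky(K + sigma2 * np.eye(n))      # K + sigma2 I = L L^T
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))     # (K + sigma2 I)^-1 y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))          # log |K + sigma2 I|
    return -0.5 * y @ a - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)

# Toy usage (hypothetical data): three points and a unit-scale isotropic kernel
x = np.array([0.0, 0.5, 1.0])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
y = np.array([0.1, 0.4, 0.3])
value = log_marginal_likelihood(K, y, sigma2=0.1)
```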
What if the best-fitted kernel was,
$\kappa(x, x') = \sum_i \kappa_i(x, x')\, 1_{\{x, x'\} \in \Omega_i}$
Figure: Synthetic two-phase signal.
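The indicator-weighted sum of kernels can be sketched for a 1-D signal with two regions separated by a single change point. The two-region split, the function names and the parameter values are illustrative assumptions; pairs of points lying in different regions get zero covariance, so the Gram matrix is block diagonal.

```python
import numpy as np

def iso_kernel(xa, xb, alpha1, alpha2):
    """Isotropic kernel alpha1 * exp(-(x - x')^2 / (2 * alpha2)) on 1-D inputs."""
    diff = xa[:, None] - xb[None, :]
    return alpha1 * np.exp(-diff**2 / (2.0 * alpha2))

def piecewise_kernel(xa, xb, change_point, params_left, params_right):
    """Sum of two isotropic kernels, each active only when both points share a region."""
    left_a, left_b = xa < change_point, xb < change_point
    both_left = left_a[:, None] & left_b[None, :]       # indicator for region 1
    both_right = ~left_a[:, None] & ~left_b[None, :]    # indicator for region 2
    K = both_left * iso_kernel(xa, xb, *params_left)
    K = K + both_right * iso_kernel(xa, xb, *params_right)
    return K

x = np.linspace(0.0, 10.0, 6)                           # 0, 2, 4, 6, 8, 10
K = piecewise_kernel(x, x, 5.0, (1.0, 0.5), (2.0, 4.0))
```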
8. Contribution : Gaussian Process modelling
Change-Point Estimation
A. Parametric Estimation
We assume that there exist $\beta_i$ such that,
$(x, x') \in \Omega_i \Leftrightarrow f(x, x', \beta_i) \geq 0$
and f is known. Then, $\theta = \{(\alpha_i, \beta_i)_i\}$, and we have,
$\hat{\theta} = \operatorname{argmax}_{\theta} \log p(y \mid x)$
Limitations:
Knowledge of f
Dimension of the parameter space
Convexity of the marginal likelihood function
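As a toy illustration of the parametric approach, the sketch below assumes a 1-D signal, two regions defined by a threshold β (so f is a simple comparison with β), a fixed amplitude α1 = 1 and a small grid of candidate α2 values. Because the piecewise Gram matrix is then block diagonal, the marginal likelihood splits into one term per region. All names, grids and the synthetic signal are assumptions for illustration, not the setting used in the talk.

```python
import numpy as np

def iso_gram(x, alpha2, alpha1=1.0):
    diff = x[:, None] - x[None, :]
    return alpha1 * np.exp(-diff**2 / (2.0 * alpha2))

def log_ml(x, y, alpha2, sigma2):
    """Log marginal likelihood of one region under an isotropic kernel."""
    n = len(y)
    L = np.linalg.cholesky(iso_gram(x, alpha2) + sigma2 * np.eye(n))
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ a - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2.0 * np.pi)

def fit_change_point(x, y, betas, alpha2_grid, sigma2):
    """Grid search over the change point beta and one alpha2 per region."""
    best = (-np.inf, None, None, None)
    for beta in betas:
        left = x < beta
        if left.sum() < 2 or (~left).sum() < 2:
            continue  # skip splits that leave a region almost empty
        ll_l, a_l = max((log_ml(x[left], y[left], a, sigma2), a) for a in alpha2_grid)
        ll_r, a_r = max((log_ml(x[~left], y[~left], a, sigma2), a) for a in alpha2_grid)
        if ll_l + ll_r > best[0]:
            best = (ll_l + ll_r, beta, a_l, a_r)
    return best

# Synthetic two-phase signal (hypothetical): smooth before x = 5, fast oscillation after
x = np.linspace(0.0, 10.0, 60)
y = np.where(x < 5.0, np.sin(0.5 * x), np.sin(8.0 * x))
ll, beta_hat, a_left, a_right = fit_change_point(
    x, y, betas=np.arange(1.0, 9.5, 0.5), alpha2_grid=(0.02, 4.0), sigma2=0.01)
```

Even in this small example the cost grows with the product of the grid sizes, which reflects the dimension limitation listed above.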
9. Contribution : Gaussian Process modelling
Change-Point Estimation
B. Adaptive Estimation (1)
Let $X_{kNN \cap B_r}(i)$ be the sequence of observations associated with $x_i$,
$X_{kNN \cap B_r}(i) = \{ x_j \mid \{x_j \in B_i^r\} \cap \{d_{ji} \leq d_{(ik)}\} \}$
k is the number of neighbours to be considered,
r is the limiting radius.
Justification:
Avoid the lack of observations
Equivalent number of observations for each estimator
Avoid the hyper-parametrization of the likelihood
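The neighbourhood selection can be sketched as the intersection of the k nearest neighbours of x_i with the ball of radius r centred on x_i. The function name is hypothetical, and counting x_i itself among its own neighbours is an assumption of this sketch.

```python
import numpy as np

def knn_ball_neighbourhood(X, i, k, r):
    """Indices j such that x_j is among the k nearest points to x_i and within radius r.

    X is an (n, d) array of locations; x_i itself is included because d_ii = 0.
    """
    d = np.linalg.norm(X - X[i], axis=1)        # distances d_ji to x_i
    knn = np.argsort(d, kind="stable")[:k]      # k nearest neighbours (d_ji <= d_(ik))
    return knn[d[knn] <= r]                     # keep those inside the ball B_i^r

X = np.array([[0.0], [1.0], [2.0], [10.0]])
idx = knn_ball_neighbourhood(X, i=0, k=3, r=1.5)
```

In dense areas the k-nearest-neighbour condition is the binding one, while in sparse areas the radius r prevents distant observations from being pulled in.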
10. Contribution : Gaussian Process modelling
Change-Point Estimation
B. Adaptive Estimation (2)
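Let $x_I = X_{kNN \cap B_r}(i)$ and $y_I$ be the corresponding observations.
$\hat{\alpha}_i = \operatorname{argmax}_{\alpha} \log p(y_I \mid x_I)$
Idea 1:
Cluster on $\hat{\alpha}_i$
but what if $\dim(\hat{\alpha}_i) \geq 2$?
Idea 2:
Build the Gram matrices $K_i = \kappa(x_I, \hat{\alpha}_i)$
Let $\Lambda_{x_i} = \{\lambda_1^{x_i} \dots \lambda_n^{x_i}\}$ be the eigenvalues of $K_i$
Cluster on $\mu_i = \max\{\Lambda_{x_i}\}$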
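Idea 2 can be sketched as follows: each local estimate, whatever its dimension, is summarised by the largest eigenvalue of its local Gram matrix, and the resulting scalars are split into two groups. The local estimates below are hypothetical values rather than fitted ones, and the largest-gap split stands in for whichever clustering method is actually used.

```python
import numpy as np

def iso_gram(x, alpha1, alpha2):
    diff = x[:, None] - x[None, :]
    return alpha1 * np.exp(-diff**2 / (2.0 * alpha2))

def max_eigenvalue(x_local, alpha_hat):
    """mu_i: largest eigenvalue of the local Gram matrix K_i."""
    K_i = iso_gram(x_local, *alpha_hat)
    return np.linalg.eigvalsh(K_i)[-1]      # eigvalsh returns eigenvalues in ascending order

def two_group_split(mu):
    """Label each mu_i 0 or 1 by cutting at the largest gap between sorted values."""
    sorted_mu = np.sort(mu)
    threshold = sorted_mu[np.argmax(np.diff(sorted_mu))]
    return (mu > threshold).astype(int)

x_local = np.linspace(0.0, 1.0, 5)          # same local design for every i (toy setting)
# Hypothetical local estimates (alpha1, alpha2): three smooth, then three rough
alpha_hats = [(1.0, 4.0), (1.0, 3.5), (1.0, 4.2), (1.0, 0.02), (1.0, 0.03), (1.0, 0.025)]
mu = np.array([max_eigenvalue(x_local, a) for a in alpha_hats])
labels = two_group_split(mu)
```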
11. Contribution : Gaussian Process modelling
Simulation Results
Figure: Gaussian Process prediction with 1 classical isotropic kernel (green), 2 isotropic kernels with eigenvalue-based
change point estimation (yellow), hyper-parameter-based change point estimation (purple) and parametric estimation (blue).
Figure: Mean of the Gaussian Process for the two-dimensional scenario. On the left, the mean is calculated with only one
kernel. On the right, the mean is calculated with two kernels.
12. Contribution : Gaussian Process modelling
Simulation Results
Figure: Evolution of the root MSE of the change-point estimation when the
number of observations increases from 20 to 100, in the 1D case.
(Axes: RMSE against Ns; curves: MMLE, JD, MEV.)
Methods:
-MMLE, Parametric approach
-MEV, EigenValue approach
-JD, Est. approach
        2D               2D-donut         3D
JD      0.834 (0.0034)   0.763 (0.0015)   0.666 (0.0016)
MEV     0.825 (0.0053)   0.817 (0.0021)   0.643 (0.0014)
MMLE    0.858 (0.0025)   0.806 (0.0008)   0.666 (0.0002)
Table: The number of obs. is equal to $10^d$, where d is the dimension of the problem. 1000
simulations are provided. The variance is specified in brackets.
13. Contribution : Gaussian Process modelling
Application to the Concentration Measurements
We may consider the concentration measurements as observations
of a stochastic process in space and time.
Idea: Apply the defined approach to estimate $t_0$.
Prior distribution: $C \sim \mathcal{GP}(m, \kappa)$
$m \in L^2(\Omega \times T, \mathbb{R})$ is the prior mean function,
and $\kappa \in L^2(\Omega^2 \times T^2, \mathbb{R})$ is the prior covariance function²
Posterior distribution: $C_{\mid Y, m=0} \sim \mathcal{GP}(\kappa_{x^* x} \kappa_{xx}^{-1} Y,\; \kappa_{x^* x^*} - \kappa_{x^* x} \kappa_{xx}^{-1} \kappa_{x x^*})$
² The associated matrix K should be positive semidefinite.
14. Contribution : Gaussian Process modelling
Kernel Specification
Isotropic Kernel
$\kappa_{iso}(x, x') = \frac{1}{\alpha} \exp\left(-\frac{\|x - x'\|^2}{\beta^2}\right)$
where α and β are hyper-parameters.
Drift-dependent Kernel
$\dot{x} = u(x, t), \quad x(t_0) = x_0$
$s_{x_0, t_0}(t)$ is the solution of this system.
$\kappa_{dyn}(x, x') = \frac{1}{\sigma(t, t')} \exp\left(-\frac{d_s(x, x')}{2\sigma(t, t')^2}\right)$
where we have:
$d_s(x, x') = (x - s_{x', t'}(t))^2 + (x' - s_{x, t}(t'))^2$
$\sigma(t, t') = \alpha \times (|t_0 - \min(t, t')| + 1)^\beta$
Consider the influence of the wind field
Consider the time-decreasing correlation
Consider the evolution of the process
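A minimal sketch of the drift-dependent kernel is given below under a strong simplifying assumption: the wind u is constant, so the trajectory has the closed form s_{x0,t0}(t) = x0 + u (t - t0). With a real wind field the trajectory would have to be integrated numerically. The function names and the numerical values are illustrative.

```python
import numpy as np

def flow(x0, t0, t, u):
    """Trajectory s_{x0,t0}(t) of dx/dt = u, for a constant wind u (assumption)."""
    return x0 + u * (t - t0)

def kappa_dyn(x, t, xp, tp, t0, u, alpha, beta):
    """Drift-dependent kernel between the space-time points (x, t) and (x', t')."""
    d_s = np.sum((x - flow(xp, tp, t, u)) ** 2) + np.sum((xp - flow(x, t, tp, u)) ** 2)
    sigma = alpha * (abs(t0 - min(t, tp)) + 1.0) ** beta
    return np.exp(-d_s / (2.0 * sigma**2)) / sigma

u = np.array([1.0, 0.0])                          # constant wind along the first axis
x, t = np.array([0.0, 0.0]), 1.0
xp, tp = np.array([2.0, 0.0]), 3.0                # lies on the trajectory through (x, t)
k_on = kappa_dyn(x, t, xp, tp, t0=0.0, u=u, alpha=1.0, beta=0.5)
k_off = kappa_dyn(x, t, np.array([2.0, 3.0]), tp, t0=0.0, u=u, alpha=1.0, beta=0.5)
```

Two points carried onto each other by the wind have d_s = 0 and therefore the maximal covariance 1/σ(t, t'), whereas a point away from that trajectory is down-weighted.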
15. Contribution : Gaussian Process modelling
Two Stage estimation process: Instant of Release
The proposed kernel is then complex:
$\kappa_f = \kappa_{iso}\, 1_{\{t, t' < t_0\}} + \kappa_{dyn}\, 1_{\{t, t' \geq t_0\}}$
The likelihood is not convex.
$t_0$ has to be estimated separately.
Maximum Likelihood Estimation of
Hyperparameters
Method: Exhaustive search for $t_0$.
Calculation of the trace of the Gram
matrix.
$\hat{t}_0^{tr} = \operatorname{argmax}_{t \in T} \operatorname{tr}(K(t))$
16. Contribution : Gaussian Process modelling
Two Stage estimation process: Source location
Given the time of release, we can
calculate the location estimate.
$\hat{x}_0 = \operatorname{argmax}_{x \in \Omega} E[C_{\mid Y, m=0}(x, \hat{t}_0)] = \operatorname{argmax}_{x \in \Omega} \tilde{\kappa}_{x^* x} \tilde{\kappa}_{xx}^{-1} Y$
where $\tilde{\kappa} = \kappa(\cdot, \hat{t}_0)$
Table: Estimation of the source location. Comparison between the
estimators (5, 20 and 50 sensors). Target is x0 = 115, y0 = 10.
Kernel   Sensors   x̂0       ŷ0      σ(x̂0)   σ(ŷ0)
κiso     5         68.97    62.58   42.82   38.96
κiso     20        97.13    26.37   27.64   26.08
κiso     50        104.47   21.60   28.94   19.47
κf       5         108.94   12.21   42.00   17.05
κf       20        120.28   8.28    12.50   4.64
κf       50        114.51   9.48    6.37    3.07
17. Contribution : Gaussian Process modelling
Zero-Inflated Poisson and Dirichlet Process³
We can also consider the concentration as a count of particles.
$Y \sim ZIP(p, \lambda)$
$p \sim DP(H, \alpha), \qquad \log \lambda \sim \mathcal{GP}(m, \kappa)$
which then define the mixture distribution,
$\Pr(Y = k \mid p, \lambda) = p_{xt}\, 1_{\{Y = 0\}} + (1 - p_{xt})\, \frac{e^{-\lambda_{xt}} \lambda_{xt}^{k}}{k!}\, 1_{\{Y = k\}}$
Major Issue: the tractability of the likelihood calculation relies on the distribution of
both p and λ.
³ Joint work with Dr. G. Peters and Dr. I. Nevat
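For fixed values of p and λ at a given location and time, the zero-inflated Poisson mixture can be evaluated directly, as in the sketch below. The function name is hypothetical, and the Dirichlet Process and Gaussian Process priors on p and log λ are deliberately left out.

```python
import math

def zip_pmf(k, p, lam):
    """Pr(Y = k | p, lambda): extra mass p at zero plus (1 - p) times a Poisson(lambda) pmf."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    return p * (k == 0) + (1.0 - p) * poisson

# Example: probability of observing no particle with p = 0.3 and lambda = 2
prob_zero = zip_pmf(0, p=0.3, lam=2.0)
```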
18. Contribution : Bibliography
A. Ickowicz, F. Septier, P. Armand, Adaptive Algorithms for the
Estimation of Source Term in a Complex Atmospheric Release.
Submitted to Atmospheric Environment Journal
A. Ickowicz, F. Septier, P. Armand, Estimating a CBRN atmospheric
release in a complex environment using Gaussian Processes.
15th International Conference on Information Fusion, Singapore, Singapore,
July 2012
F. Septier, A. Ickowicz, P. Armand, Méthodes de Monte-Carlo adaptatives
pour la caractérisation de termes de sources.
Technical report, CEA, EOTP A-54300-05-07-AW-26, Mar. 2012
A. Ickowicz, F. Septier, P. Armand, Statistic Estimation for Particle
Clouds with Lagrangian Stochastic Algorithms.
Technical report, CEA, EOTP A-24300-01-01-AW-20, Nov. 2011