Statistical Modeling of Extreme Values

Statistical Modeling of Extreme
Values: Basic Theory and Its
Implementation in Open Source
Programming Environment R
Nader Tajvidi
Department of Mathematical Statistics
Lund Institute of Technology
Box 118
SE-22100 Lund
Sweden

August 6, 2010

Khon Kaen University

Outline

• Some examples of application of extreme value
theory

• Univariate extreme value distributions

• Characterisation of multivariate extreme value
distributions

• Bivariate extreme value distributions

• Parametric models for the dependence function

• Parametric and nonparametric estimation of the
dependence function

• Monte Carlo approximations to mean integrated
squared errors of parametric and nonparametric
estimators

• Application to Australian temperature data


Annual maximum sea levels at Port Pirie, South
Australia

[Figure: annual maximum sea level (meters) plotted against Year, 1930–1980.]


Breaking strengths of glass fibers

[Figure: histogram of breaking strengths of glass fibers. x-axis: Breaking Strength; y-axis: Percent of Total.]

[Figure: density plot of breaking strengths of glass fibers. x-axis: Breaking Strength; y-axis: Density.]


Annual maximum sea levels at Fremantle, Western
Australia

[Figure: annual maximum sea level plotted against Year, 1900–1980.]


Annual maximum sea levels at Fremantle, Western
Australia, versus mean annual value of Southern
Oscillation Index

[Figure: annual maximum sea level plotted against SOI.]


Comparing Port Pirie and Fremantle datasets

[Figure: two panels, each plotted against Year: the Port Pirie series (1930–1980) and the Fremantle series (1900–1980).]


Daily closing prices of the Dow Jones Index

[Figure: time series "dowjones". x-axis: Year (quarterly ticks, 1995–2001); y-axis: Index.]


Log-daily returns of the Dow Jones Index

[Figure: time series "log.daily.return". x-axis: Year (quarterly ticks, 1995–2001); y-axis: Index.]


Dow Jones Index data

[Figure: two panels side by side, "dowjones" and "log.daily.return", each plotted against Year.]


Windstorm loss data

• Windstorm losses of the Swedish insurance group
Länsförsäkringar during the period 1982 to 1993

• The database contains:
– The individual amounts of all claims
– The place and time of the claims
– The type of the claim

• 46 storm events, with a total claimed amount of
510 million Swedish crowns (MSEK)

• Farm insurance comprises approximately 65% of
the total amount

• All values were corrected for inflation

• No adjustments for portfolio changes


Windstorm losses 1982-1993

[Figure: windstorm losses 1982–1993; labelled events: Jan 83, Jan 84, Dec 88, Feb 92, Jan 93.]

Questions:

• How can we predict the size of the next very severe
storm?

• How much reinsurance does a company need to
buy?


Windstorm losses which exceed the level
u = 0.9 MSEK, for 1982 – 1993

[Figure: storm loss (in MSEK) plotted against storm number (0–40); labelled events: Jan 83, Jan 84, Dec 88, Feb 92, Jan 93.]


Australian temperature data

• A very large dataset on annual maximum and
minimum average daily temperatures at 224 stations
across Australia

[Figure legend: Queensland, New South Wales, Victoria, South Australia, West Australia, Northern Territory, Tasmania.]


Annual maximum temperatures in
Victoria, Australia

• The maximum value, over all 34 weather stations
that were operating in the state of Victoria from
1910 to 1993, of annual temperatures (in degrees
Celsius) during this period.


[Figure: four panels. Temperature against Year (1920–1980), and the estimates $\hat{\mu}$, $\hat{\sigma}$ and $\hat{\gamma}$, each plotted against $t \in [0, 1]$.]


Average annual maximum temperature

• The average annual maximum is derived by taking
the mean of maximum annual temperature readings
at 224 weather stations across Australia in the
period 1890–1993.

[Figure legend: Queensland, New South Wales, Victoria, South Australia, West Australia, Northern Territory, Tasmania.]


Location and scale estimates with Gaussian fit

[Figure: four panels. Temperature against Year (1900–1980); the estimates $\hat{\mu}$ and $\hat{\sigma}$, each plotted against $t \in [0, 1]$; and a fourth panel plotted against $h$.]


Another application to Australian
temperature data

• maximum annual values of average daily
temperature measurements at two meteorological
stations, Leonora (latitude 28.53, longitude 121.19)
and Menzies (latitude 29.42, longitude 121.02), in
Western Australia during the period 1898–1993.

[Figure: scatter plot of the annual maxima. x-axis: Leonora; y-axis: Menzies.]


Annual Maximum Wind Speeds in 1944-1983

[Figure: scatter plot. x-axis: Annual Maximum Wind Speed (knots) at Albany (NY); y-axis: Annual Maximum Wind Speed (knots) at Hartford (CT).]


Concurrent measurements of wave and surge height in
south west England

[Figure: scatter plot. x-axis: Wave Height (m); y-axis: Surge (m).]


The framework
1. A proper mathematical model has to be chosen in
each case.
• parametric; best if the model is correct
• nonparametric; cannot be used for extrapolation
outside the observed values
• semiparametric; very flexible (main subject of
this talk)

2. Parameters in each model have to be estimated
based on the historical data. Which method should
be used?

3. These estimates are our “best guesses” of the
process which is being analyzed. How should the
uncertainty in the estimates be specified?

4. Goodness of fit. Does the model give a good
representation of the historical data?

5. How can we reduce the uncertainties in our models?
How can extra information be incorporated in the
models?


Univariate Extreme Value Distributions

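Let $X_1, X_2, \ldots, X_n$ be iid with $X \sim F(x)$, and let

\[ M_n = \max(X_1, X_2, \ldots, X_n), \quad n \in \mathbb{N}. \]

Suppose there exist $a_n > 0$ and $b_n \in \mathbb{R}$ such that

\[ \lim_{n\to\infty} P\left(\frac{M_n - b_n}{a_n} \le x\right) = \lim_{n\to\infty} F^n(a_n x + b_n) = G(x), \]

with $G(x)$ non-degenerate. Then

\[ F \in D(G): \]

$F(x)$ belongs to the domain of attraction of $G(x)$.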
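The limit statement can be checked numerically in a simple case. The sketch below assumes a unit exponential $F$ with normalising constants $a_n = 1$ and $b_n = \log n$, for which the limit is known to be $\exp(-e^{-x})$. The function names are illustrative and not part of the talk.

```python
import math

def exp_cdf(x):
    """CDF of the unit exponential distribution: F(x) = 1 - exp(-x) for x > 0."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def normalised_max_cdf(x, n):
    """P((M_n - b_n)/a_n <= x) = F^n(a_n x + b_n), assuming a_n = 1, b_n = log n."""
    return exp_cdf(x + math.log(n)) ** n

def gumbel_cdf(x):
    """Limit distribution exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def max_abs_error(n):
    """Largest gap between F^n(a_n x + b_n) and the limit on a grid over [-3, 6]."""
    grid = [k / 10 for k in range(-30, 61)]
    return max(abs(normalised_max_cdf(x, n) - gumbel_cdf(x)) for x in grid)

for n in (10, 100, 10000):
    print(n, max_abs_error(n))
```

The printed gap shrinks as n grows, which is the convergence $F^n(a_n x + b_n) \to G(x)$ for this particular $F$.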


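Type I:

\[ \Phi_\alpha(x) = \begin{cases} 0, & x \le 0 \\ \exp(-x^{-\alpha}), & x > 0 \end{cases} \]

Type II:

\[ \Psi_\alpha(x) = \begin{cases} \exp(-(-x)^{\alpha}), & x < 0 \\ 1, & x \ge 0 \end{cases} \]

Type III:

\[ \Lambda(x) = \exp(-e^{-x}), \quad x \in \mathbb{R} \]

Generalised Extreme Value Distribution

\[ G(x; \gamma, \mu, \sigma) = \exp\left\{ -\left(1 - \gamma\,\frac{x - \mu}{\sigma}\right)_+^{1/\gamma} \right\} \]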
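A minimal sketch of this distribution function and its inverse, written in the parametrisation used on this slide (note the sign convention $1 - \gamma(x-\mu)/\sigma$). The case $\gamma = 0$ is taken as the limit $\exp(-e^{-(x-\mu)/\sigma})$. Function names and the parameter values in the usage lines are illustrative assumptions, not fitted values from the talk.

```python
import math

def gev_cdf(x, gamma, mu=0.0, sigma=1.0):
    """G(x; gamma, mu, sigma) = exp{-(1 - gamma (x - mu)/sigma)_+^(1/gamma)}."""
    z = (x - mu) / sigma
    if gamma == 0.0:
        return math.exp(-math.exp(-z))      # limiting case gamma -> 0
    base = 1.0 - gamma * z
    if base <= 0.0:
        # Outside the support: above the upper endpoint when gamma > 0,
        # below the lower endpoint when gamma < 0.
        return 1.0 if gamma > 0 else 0.0
    return math.exp(-base ** (1.0 / gamma))

def gev_quantile(p, gamma, mu=0.0, sigma=1.0):
    """Solve G(x) = p for 0 < p < 1."""
    if gamma == 0.0:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma * (1.0 - (-math.log(p)) ** gamma) / gamma

# Illustrative (hypothetical) parameters: level exceeded by an annual maximum
# with probability 0.01 in any given year.
level_100 = gev_quantile(0.99, gamma=0.05, mu=3.9, sigma=0.2)
print(level_100, gev_cdf(level_100, gamma=0.05, mu=3.9, sigma=0.2))
```

The quantile function is what turns a fitted model into a return level, for example in the sea-level and windstorm examples.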


Multivariate Extreme Value Distributions

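\[ \{X_n, n \ge 1\} = \{(X_n^{(1)}, \ldots, X_n^{(d)}), n \ge 1\}, \qquad X \sim F(x) \text{ iid} \]

\[ M_n = (M_n^{(1)}, \ldots, M_n^{(d)}) = \Big( \bigvee_{j=1}^{n} X_j^{(1)}, \ldots, \bigvee_{j=1}^{n} X_j^{(d)} \Big) \]

\[ \sigma_n^{(i)} > 0, \quad u_n^{(i)} \in \mathbb{R} \]

\[ P\big[(M_n^{(i)} - u_n^{(i)})/\sigma_n^{(i)} \le x^{(i)},\ 1 \le i \le d\big] = F^n\big(\sigma_n^{(1)} x^{(1)} + u_n^{(1)}, \ldots, \sigma_n^{(d)} x^{(d)} + u_n^{(d)}\big) \to G(x) \]

Each marginal $G_i$ of $G$ is non-degenerate.

\[ F \in D(G): \]

$F(x)$ belongs to the domain of attraction of $G(x)$.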


Characterisation of Multivariate Extreme
Value Distributions

\[ P\big[(M_n^{(i)} - u_n^{(i)})/\sigma_n^{(i)} \le x^{(i)},\ 1 \le i \le d\big] = F^n\big(\sigma_n^{(1)} x^{(1)} + u_n^{(1)}, \ldots, \sigma_n^{(d)} x^{(d)} + u_n^{(d)}\big) \to G(x) \]

Definition. A df $G$ in $\mathbb{R}^d$ is called max-stable if for
every $t > 0$

\[ G^t(x) = G\big(\alpha^{(1)}(t) x^{(1)} + \beta^{(1)}(t), \ldots, \alpha^{(d)}(t) x^{(d)} + \beta^{(d)}(t)\big). \]

Definition. A df $F$ in $\mathbb{R}^d$ is called max-infinitely
divisible (max-id) if $F^t(x_1, \ldots, x_d)$ is a df for every
$t > 0$.

\[ G_*(\infty, \infty, \ldots, x_i, \ldots, \infty) = \Phi_1(x_i) = \exp(-x_i^{-1}) \]

$G_*(x)$ is a MEVD with $\Phi_1$ marginals


Characterisation of Max-id and
Max-Stable Distributions
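$F$ is max-id iff, for a Radon measure $\mu$ on
$E := [k, \infty] \setminus \{k\}$, $k \in [-\infty, \infty)^d$,

\[ F(y) = \begin{cases} \exp\{-\mu([-\infty, y]^c)\}, & y \ge k \\ 0, & \text{otherwise} \end{cases} \]

The measure $\mu$ is called an exponent measure.

\[ G_*(\infty, \infty, \ldots, x_i, \ldots, \infty) = \Phi_1(x_i) = \exp(-x_i^{-1}) \]

$G_*(x)$ is a MEVD with $\Phi_1$ marginals if, for a finite
measure $S$ on

\[ \aleph = \{y : \|y\| = 1\}, \]

\[ G_*(x) = \exp\left\{ -\int_{\aleph} \bigvee_{i=1}^{d} \frac{a^{(i)}}{x^{(i)}}\, S(da) \right\}, \qquad \int_{\aleph} a^{(i)}\, S(da) = 1, \quad 1 \le i \le d. \]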


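Bivariate Extreme Value Distributions

\[ G_*(x, y) = \exp\{-\mu_*([0, (x, y)]^c)\} \]

\[ \mu_*([0, (x, y)]^c) = \left(\frac{1}{x} + \frac{1}{y}\right) A\left(\frac{x}{x + y}\right) \]

\[ A(w) = \int_0^1 \max\{q(1 - w), (1 - q)w\}\, S(dq) \]

$A(w)$ is called the dependence function.

\[ \int_0^1 q\, S(dq) = \int_0^1 (1 - q)\, S(dq) = 1 \]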

• A(0) = A(1) = 1

• max{w, 1 − w} ≤ A(w) ≤ 1

• A(w) is convex for w ∈ [0, 1]
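The three conditions above are easy to test numerically for a candidate function. The sketch below checks them on a regular grid; it is a grid-based screen under an assumed tolerance, not a proof, and the function name is illustrative.

```python
def is_valid_dependence_function(A, m=200, tol=1e-9):
    """Check A(0) = A(1) = 1, max{w, 1-w} <= A(w) <= 1 and convexity on a grid."""
    grid = [j / m for j in range(m + 1)]
    vals = [A(w) for w in grid]
    endpoints_ok = abs(vals[0] - 1.0) <= tol and abs(vals[-1] - 1.0) <= tol
    bounds_ok = all(max(w, 1.0 - w) - tol <= a <= 1.0 + tol
                    for w, a in zip(grid, vals))
    # Convexity on an equally spaced grid: non-negative second differences.
    convex_ok = all(vals[j - 1] - 2.0 * vals[j] + vals[j + 1] >= -tol
                    for j in range(1, m))
    return endpoints_ok and bounds_ok and convex_ok

print(is_valid_dependence_function(lambda w: 1.0))                  # independence
print(is_valid_dependence_function(lambda w: max(w, 1.0 - w)))      # complete dependence
print(is_valid_dependence_function(lambda w: 1.0 + w * (1.0 - w)))  # violates A <= 1
```

The first two candidates are the upper and lower boundaries of the admissible region, so any valid $A$ lies between them.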


Some examples of the dependence function

[Figure: A(w) plotted against w for three example dependence functions: Mixed, Generalised mixed, Asym. mixed.]


Parametric Models for the Dependence
Function

1. The mixed model

\[ \mu_*([0, (x, y)]^c) = \frac{1}{x} + \frac{1}{y} - \frac{\theta}{x + y}, \quad 0 \le \theta \le 1 \]

\[ A(w) = \theta w^2 - \theta w + 1, \quad 0 \le \theta \le 1 \]
• θ = 0 gives the independent case
• Complete dependence is not possible

2. The logistic model

\[ \mu_*([0, (x, y)]^c) = (x^{-r} + y^{-r})^{1/r}, \quad r \ge 1 \]

\[ A(w) = \{(1 - w)^r + w^r\}^{1/r}, \quad r \ge 1 \]
• r = 1 gives the independent case
• r = +∞ gives complete dependence
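The two models can be written down directly from their dependence functions. The sketch below assumes unit Fréchet ($\Phi_1$) margins, so that $G_*(x, y) = \exp\{-(1/x + 1/y)A(x/(x+y))\}$, and checks that this form agrees with the exponent measures stated above. Function names are illustrative.

```python
import math

def A_mixed(w, theta):
    """Mixed model: A(w) = theta w^2 - theta w + 1, 0 <= theta <= 1."""
    return theta * w * w - theta * w + 1.0

def A_logistic(w, r):
    """Logistic model: A(w) = {(1 - w)^r + w^r}^(1/r), r >= 1."""
    return ((1.0 - w) ** r + w ** r) ** (1.0 / r)

def exponent_measure(x, y, A):
    """mu_*([0,(x,y)]^c) = (1/x + 1/y) A(x / (x + y)) for x, y > 0."""
    return (1.0 / x + 1.0 / y) * A(x / (x + y))

def bev_cdf(x, y, A):
    """G_*(x, y) = exp{-mu_*([0,(x,y)]^c)} with unit Frechet margins."""
    return math.exp(-exponent_measure(x, y, A))

x, y, theta, r = 2.0, 3.0, 0.6, 2.5
# Closed forms from the slide, for comparison.
print(exponent_measure(x, y, lambda w: A_mixed(w, theta)),
      1.0 / x + 1.0 / y - theta / (x + y))
print(exponent_measure(x, y, lambda w: A_logistic(w, r)),
      (x ** -r + y ** -r) ** (1.0 / r))
```

Passing $A$ as a function keeps `bev_cdf` usable with any of the other parametric models, or with a nonparametric estimate of $A$.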


The Generalised Symmetric Mixed Model

\[ \mu_*([0, (x, y)]^c) = \frac{1}{x} + \frac{1}{y} - k\left(\frac{1}{x^p + y^p}\right)^{1/p}, \quad (0 \le k \le 1,\ p \ge 0) \]

\[ A(w) = 1 - \frac{k}{\{(1 - w)^{-p} + w^{-p}\}^{1/p}} \]

• Independence for k = 0 or p = 0

• Complete dependence can be obtained with k = 1 and p = ∞ (not
possible in the symmetric or asymmetric mixed model)


The Generalised Symmetric Logistic Model

\[ \mu_*([0, (x, y)]^c) = \left(\frac{1}{x^p} + \frac{1}{y^p} + \frac{k}{(xy)^{p/2}}\right)^{1/p}, \quad (0 < k \le 2(p - 1),\ p \ge 2) \]

\[ A(w) = \left\{(1 - w)^p + w^p + k\,((1 - w)w)^{p/2}\right\}^{1/p} \]

• k = 2 gives the symmetric logistic model

• Independence corresponds to p = 2 and k = 2

• Complete dependence for k = 2 and p = +∞


The Parameter Region for the Generalised Symmetric
Logistic Model

[Figure: parameter region in the (p, k) plane, with p from 2 to 6 and k from 0 to 2; annotations: "logistic model", "equivalent models".]


The Asymmetric Mixed Model

\[ \mu_*([0, (x, y)]^c) = \frac{x^3 + 3x^2y - 2\varphi x^2y - \theta x^2y + 3xy^2 - \varphi xy^2 - \theta xy^2 + y^3}{xy(x + y)^2} \]

\[ A(w) = \varphi w^3 + \theta w^2 - (\theta + \varphi)w + 1, \quad (\theta \ge 0,\ \theta + 2\varphi \le 1,\ \theta + 3\varphi \ge 0) \]

• Symmetric mixed model for φ = 0

• Independent case for θ = φ = 0 (complete dependence is not possible)

• The parameter φ accounts for asymmetry in the model


The Asymmetric Logistic Model

\[ \mu_*([0, (x, y)]^c) = \frac{(1 - \varphi)x + (1 - \theta)y + x\left(\frac{\varphi^r x^r + \theta^r y^r}{(x + y)^r}\right)^{1/r} + y\left(\frac{\varphi^r x^r + \theta^r y^r}{(x + y)^r}\right)^{1/r}}{xy} \]

\[ A(w) = \{(\theta(1 - w))^r + (\varphi w)^r\}^{1/r} + (\theta - \varphi)w + 1 - \theta, \quad (0 \le \theta, \varphi \le 1,\ r \ge 1) \]

• For θ = φ = 1 this model reduces to the corresponding symmetric logistic
model, which gives the diagonal case (complete dependence) for r = +∞.

• Independence is obtained for θ = 0, for φ = 0, or for r = 1.


Estimation of the dependence function

• Nonparametric methods
1. Pickands estimator (1981)
2. Capéraà, Fougères and Genest’s estimator (1997)

• Maximum likelihood based on parametric models

New Nonparametric Methods:

1. Convex hull of modiﬁed Pickands estimator

2. Constrained smoothing splines


Pickands estimator

• Suppose (X, Y ) has a bivariate extreme value
distribution with exponential margins.

• min{X/(1 − w), Y /w} has an exponential
distribution with mean 1/A(w).

• The maximum likelihood estimator of $A(w)$ is

\[ A_n(w) = n \left[ \sum_{i=1}^{n} \min\{X_i/(1 - w),\, Y_i/w\} \right]^{-1} \]

• For each 0 ≤ w ≤ 1, $1/A_n(w)$ is an unbiased and
strongly consistent estimator of $1/A(w)$.

• $\delta_n(w) = n^{1/2}\{1/A_n(w) - 1/A(w)\}$ satisfies the
central limit theorem in $(C(0, 1), \mathcal{B})$; see Deheuvels,
P. (1991).
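A minimal sketch of the Pickands estimator, assuming the margins are already unit exponential. As test data it uses two simple cases where $A$ is known: independent exponentials ($A \equiv 1$) and identical components ($A(w) = \max\{w, 1 - w\}$). The seed, sample size and function name are arbitrary choices for illustration.

```python
import random

def pickands(xs, ys, w):
    """A_n(w) = n / sum_i min{X_i/(1-w), Y_i/w}, for 0 < w < 1."""
    n = len(xs)
    total = sum(min(x / (1.0 - w), y / w) for x, y in zip(xs, ys))
    return n / total

random.seed(1)
n = 2000
xs = [random.expovariate(1.0) for _ in range(n)]
ys = [random.expovariate(1.0) for _ in range(n)]

# Independent margins: the estimates should be close to A(w) = 1.
independent = {w: pickands(xs, ys, w) for w in (0.25, 0.5, 0.75)}
# Identical components: the estimates should be close to max{w, 1 - w}.
dependent = {w: pickands(xs, xs, w) for w in (0.25, 0.5, 0.75)}
print(independent)
print(dependent)
```

Nothing in the formula forces $A_n$ to be convex or to satisfy $A_n(0) = A_n(1) = 1$, which is the motivation for the modified estimators later in the talk.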


Pickands estimates for 100 simulated data from Logistic
model with r = 1.1 and r = 1.3

[Figure: two panels, A(w) plotted against w. Each shows the Pickands estimate together with the logistic dependence function, for r = 1.1 and r = 1.3.]


Pickands estimates for 100 simulated data from Logistic
model with r = 1.6 and r = 2

[Figure: two panels, A(w) plotted against w. Each shows the Pickands estimate together with the logistic dependence function, for r = 1.6 and r = 2.]


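Capéraà, Fougères and Genest’s
estimator

• Copula for a bivariate extreme value distribution
with marginals $F(x)$ and $G(y)$

\[ C(u, v) = P\{F(X) \le u,\ G(Y) \le v\} = \exp\left[\log(uv)\, A\left\{\frac{\log(u)}{\log(uv)}\right\}\right] \]

• $(U_i, V_i) \equiv \{F(X_i), G(Y_i)\}$ $(1 \le i \le n)$

• Pseudo-observations $Z_i = \log(U_i)/\log(U_i V_i)$ $(1 \le i \le n)$

• $H(z) = P(Z_i \le z) = z + z(1 - z)D(z)$ where
$D(z) = A'(z)/A(z)$ for all $0 \le z \le 1$

• \[ A(t) = \exp\left\{\int_0^t \frac{H(z) - z}{z(1 - z)}\, dz\right\} = \exp\left\{-\int_t^1 \frac{H(z) - z}{z(1 - z)}\, dz\right\} \]

With $H_n$ the empirical distribution function of $Z_1, \ldots, Z_n$:

1. \[ A_n^0(t) = \exp\left\{\int_0^t \frac{H_n(z) - z}{z(1 - z)}\, dz\right\} \]

2. \[ A_n^1(t) = \exp\left\{-\int_t^1 \frac{H_n(z) - z}{z(1 - z)}\, dz\right\} \]

• $\log A_n(t) = p(t) \log A_n^0(t) + \{1 - p(t)\} \log A_n^1(t)$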

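Definition of the estimator:

Denote the ordered values of $Z_i$ by $Z_{(1)}, \ldots, Z_{(n)}$ and
define

\[ Q_i = \left\{\prod_{k=1}^{i} Z_{(k)}/(1 - Z_{(k)})\right\}^{1/n} \quad (1 \le i \le n). \]

Then $A_n$ can be written as

\[ A_n(t) = \begin{cases} (1 - t)\, Q_n^{1 - p(t)}, & 0 \le t \le Z_{(1)} \\ t^{i/n} (1 - t)^{1 - i/n}\, Q_n^{1 - p(t)}\, Q_i^{-1}, & Z_{(i)} \le t \le Z_{(i+1)} \\ t\, Q_n^{-p(t)}, & Z_{(n)} \le t \le 1 \end{cases} \]

• $A_n(0) = A_n(1) = 1$ if $p(0) = 1 - p(1) = 1$.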
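The piecewise formula translates into a short routine. The sketch below works on the log scale to avoid underflow in the products $Q_i$. It assumes the margins are known, so that $(U_i, V_i)$ are exactly uniform, and uses the weight function $p(t) = 1 - t$, which satisfies $p(0) = 1 - p(1) = 1$. Both are assumptions made for illustration, and the function name is hypothetical.

```python
import bisect
import math
import random

def cfg_estimator(us, vs, p=lambda t: 1.0 - t):
    """Return the function t -> A_n(t) built from uniform pairs (U_i, V_i)."""
    n = len(us)
    z = sorted(math.log(u) / math.log(u * v) for u, v in zip(us, vs))
    # log_q[i] = log Q_i = (1/n) sum_{k<=i} log{Z_(k) / (1 - Z_(k))}, log_q[0] = 0.
    log_q = [0.0]
    for zk in z:
        log_q.append(log_q[-1] + math.log(zk / (1.0 - zk)) / n)

    def A_n(t):
        if t <= 0.0:
            return math.exp((1.0 - p(0.0)) * log_q[n])
        if t >= 1.0:
            return math.exp(-p(1.0) * log_q[n])
        i = bisect.bisect_right(z, t)      # number of Z_(k) <= t
        log_a = ((i / n) * math.log(t) + (1.0 - i / n) * math.log(1.0 - t)
                 + (1.0 - p(t)) * log_q[n] - log_q[i])
        return math.exp(log_a)

    return A_n

random.seed(2)
n = 2000
us = [random.random() for _ in range(n)]
vs = [random.random() for _ in range(n)]
A_indep = cfg_estimator(us, vs)   # independent pairs: A(t) = 1
A_dep = cfg_estimator(us, us)     # U = V: A(t) = max{t, 1 - t}
print([A_indep(t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)])
print(A_dep(0.5))
```

With $i = 0$ and $i = n$ the middle branch reduces to the first and last cases of the formula, so a single expression covers all three.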


Capéraà’s estimates for 100 simulated data from Logistic
model with r = 1.1 and r = 1.3

[Figure: two panels, A(w) plotted against w. Each shows the Capéraà estimate together with the logistic dependence function, for r = 1.1 and r = 1.3.]


Capéraà’s estimates for 100 simulated data from Logistic
model with r = 1.6 and r = 2

[Figure: two panels, A(w) plotted against w, showing the Capéraà estimates for r = 1.6 and r = 2.]


Modified Pickands estimator

• Let $Y_i = (Y_i^{(1)}, Y_i^{(2)})$ for 1 ≤ i ≤ n be independent
and identically extreme value distributed random
vectors with exponential margins.

• Put $\bar{Y}^{(\ell)} = n^{-1} \sum_i Y_i^{(\ell)}$ and $\hat{Y}_i^{(\ell)} = Y_i^{(\ell)}/\bar{Y}^{(\ell)}$ for $\ell = 1, 2$.

• $\hat{B}(u) \equiv n^{-1} \sum_{i=1}^{n} \min\{\hat{Y}_i^{(1)}/(1 - u),\, \hat{Y}_i^{(2)}/u\}$ is
uniformly root-n consistent for $B(u) \equiv A(u)^{-1}$.

1. The estimator of the dependence function passes
through the points (0, 1) and (1, 1), and has
gradients −1 and 1 at these respective points.

2. $\hat{B}(u) \le \min\{1/(1 - u), 1/u\}$, so $\hat{A} \equiv \hat{B}^{-1}$ lies above
the lower boundary of the triangular area.

3. The greatest convex minorant, $\tilde{A}$, of $\hat{A}$ satisfies all
necessary conditions for a dependence function.
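The construction above can be sketched in a few lines: rescale each margin by its sample mean, invert $\hat{B}$, and take the lower convex hull of the resulting curve on a grid. The grid-based hull is an approximation to the greatest convex minorant, the test data are independent exponentials (so $A \equiv 1$), and all names are illustrative.

```python
import random

def modified_pickands(y1, y2, grid):
    """A-hat(u) = 1 / B-hat(u), with each margin divided by its sample mean."""
    n = len(y1)
    m1, m2 = sum(y1) / n, sum(y2) / n
    s1 = [v / m1 for v in y1]
    s2 = [v / m2 for v in y2]
    out = []
    for u in grid:
        if u <= 0.0 or u >= 1.0:
            out.append(1.0)   # limits of A-hat at the endpoints
        else:
            b = sum(min(a / (1.0 - u), c / u) for a, c in zip(s1, s2)) / n
            out.append(1.0 / b)
    return out

def greatest_convex_minorant(grid, values):
    """Lower convex hull of the points (grid[j], values[j]), evaluated on the grid."""
    hull = []
    for pt in zip(grid, values):
        while len(hull) >= 2:
            (x1, v1), (x2, v2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the chord from hull[-2] to pt.
            if (v2 - v1) * (pt[0] - x1) >= (pt[1] - v1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(pt)
    out, k = [], 0
    for x in grid:
        while k < len(hull) - 2 and hull[k + 1][0] < x:
            k += 1
        (x1, v1), (x2, v2) = hull[k], hull[k + 1]
        out.append(v1 + (v2 - v1) * (x - x1) / (x2 - x1))
    return out

random.seed(3)
n, m = 100, 100
y1 = [random.expovariate(1.0) for _ in range(n)]
y2 = [random.expovariate(1.0) for _ in range(n)]
grid = [j / m for j in range(m + 1)]
a_hat = modified_pickands(y1, y2, grid)
a_tilde = greatest_convex_minorant(grid, a_hat)
print(min(a_hat), min(a_tilde))
```

Because $\hat{A}(0) = \hat{A}(1) = 1$ and $\max\{u, 1 - u\}$ is itself convex, the hull stays inside the admissible triangle.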


Modified Pickands estimates for 100 simulated data
from Logistic model with r = 1.1 and r = 1.3

[Figure: two panels, A(w) plotted against w. Each shows the modified Pickands estimate and its convex hull, for r = 1.1 and r = 1.3.]


Modified Pickands estimates for 100 simulated data
from Logistic model with r = 1.6 and r = 2

[Figure: two panels, A(w) plotted against w, showing the convex hull of the modified Pickands estimate for r = 1.6 and r = 2.]


Constrained smoothing splines

• $\hat{A}$ may be approximated by a spline that is
constrained to satisfy all the necessary conditions
on the dependence function.

• Choose regularly spaced points $0 = t_0 < \ldots < t_m = 1$
in the interval [0, 1].

• Given a smoothing parameter $s > 0$, take $\tilde{A}_s$ to be
a polynomial smoothing spline of degree 3 or more
which minimises

\[ \sum_{j=1}^{m} \{\hat{A}(t_j) - \tilde{A}_s(t_j)\}^2 + s \int_0^1 \tilde{A}_s''(t)^2\, dt, \]

subject to
1. $\tilde{A}_s(0) = \tilde{A}_s(1) = 1$
2. $\tilde{A}_s'(0) \ge -1$ and $\tilde{A}_s'(1) \le 1$
3. $\tilde{A}_s'' \ge 0$ on [0, 1].


Smoothed spline of modified Pickands estimates for 100
simulated data from Logistic model with r = 1.1 and
r = 1.3

[Figure: two panels, A(w) plotted against w, showing the smoothed spline for r = 1.1 and r = 1.3.]


Smoothed spline of modified Pickands estimates for 100
simulated data from Logistic model with r = 1.6 and
r = 2

[Figure: two panels, A(w) plotted against w, showing the smoothed spline for r = 1.6 and r = 2.]


Which model to use in practice?

• maximum likelihood of parametric models, e.g.
1. symmetric mixed model
2. symmetric logistic model
3. asymmetric mixed model
4. asymmetric logistic model
5. generalised symmetric logistic model
6. generalised asymmetric mixed model

• Nonparametric methods including
1. the Pickands (1981, 1989) estimator
2. the convex hull of Pickands’ estimator
3. the estimator proposed by Capéraà, Fougères and
Genest (1997)
4. the convex hull of the latter
5. our modiﬁcation of Pickands’ estimator
6. the convex hull of the latter

• constrained smoothing splines ﬁtted to any of these
nonparametric estimators


[Figures: four plots of A(w) against w comparing the estimates of the dependence function. Legend entries: Smoothed spline, Logistic, Mixed, Generalised logistic, Generalised mixed, Asym. logistic, Asym. mixed, Pickands, convex hull of modified Pickands, Capéraà, Modified Pickands. The first plot is labelled "Logistic, r = 1.1" and the last "Logistic, r = 2".]

Monte Carlo approximations to mean integrated squared
errors, multiplied by 10^5

                                        n = 25             n = 50             n = 100
method                              r=1   r=2   r=3    r=1   r=2   r=3    r=1   r=2   r=3
logistic                            197    64    14    110    34     8     42    14     4
Pickands                           5614  2829  3331   2034  1547  1261   1172   712   567
convex hull of Pickands            7229  2611  2775   2588  1388  1049   1430   671   477
Capéraà et al.                      889   102    35    568    49    20    307    29    10
convex hull of Capéraà et al.      1188    95    41    666    57    25    373    32    12
modified Pickands                  1351   138    33    614    77    18    366    37    11
convex hull of modified Pickands   1861   139    46    815    70    24    453    38    15
smoothed spline of
  Pickands                          784   919  1020    396   728   487    215   334   220
  convex hull of Pickands           525   769  1055    282   637   490    135   327   230
  Capéraà et al.                    303    82    21    177    37    12     97    24     8
  convex hull of Capéraà et al.     286    66    22    167    39    14     97    26    11
  modified Pickands                 447   104    21    240    62    12    130    31     9
  convex hull of modified Pickands  401    73    24    232    49    16    107    30    15


Application to Australian temperature
data

• maximum annual values of average daily
temperature measurements at two meteorological
stations, Leonora (latitude 28.53, longitude 121.19)
and Menzies (latitude 29.42, longitude 121.02), in
Western Australia during the period 1898–1993.

[Figure: scatter plot of the annual maxima. x-axis: Leonora; y-axis: Menzies.]


Logistic models for the dependence function fitted by
maximum likelihood to the temperature data

[Figure: A(w) plotted against w. Curves: Logistic, Asym. logistic, Modified Pickands.]


Mixed models for the dependence function fitted by
maximum likelihood to the temperature data

[Figure: A(w) plotted against w. Curves: Mixed, Generalised mixed, Asym. mixed.]


Estimating a bivariate extreme-value
distribution function
• Let $X = (X^{(1)}, X^{(2)})$ have a bivariate extreme-value
distribution $F$.

• There exist monotone increasing transformations
$T_j = T_j(\cdot \mid \theta_j)$ such that $(T_1(X^{(1)}), T_2(X^{(2)}))$ has
distribution function $G_0$.

• Given a sample $\{X_i = (X_i^{(1)}, X_i^{(2)}),\ 1 \le i \le n\}$,
compute a root-n consistent estimator $\hat{\theta}_j$ of $\theta_j$ from
the marginal data $X_i^{(j)}$, $1 \le i \le n$.

• Put $\hat{T}_j = T_j(\cdot \mid \hat{\theta}_j)$ and

\[ \hat{Y}_j^{(\ell)} = \hat{T}_\ell(X_j^{(\ell)}) \Big/ \Big\{ n^{-1} \sum_{i=1}^{n} \hat{T}_\ell(X_i^{(\ell)}) \Big\}. \]

• $\hat{F}(x^{(1)}, x^{(2)}) = \hat{G}_0\{T_1(x^{(1)} \mid \hat{\theta}_1),\ T_2(x^{(2)} \mid \hat{\theta}_2)\}$ is
root-n consistent for $F$.


Distribution function estimate and semi-infinite
prediction regions corresponding to nominal levels
α = 0.9, 0.95 and 0.99

[Figure: panel (a); axes Leonora and Menzies, with region boundaries labelled 0.9, 0.95 and 0.99.]


Compact bivariate prediction regions
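Construct compact prediction regions by profiling the estimator

\[
\begin{aligned}
\tilde{f}_s(x) &= \frac{\partial^2}{\partial x^{(1)} \partial x^{(2)}}\, \hat{G}_1(\hat{t}^{(1)}, \hat{t}^{(2)}) \\
&= T_1'(x^{(1)} \mid \hat{\theta}_1)\, T_2'(x^{(2)} \mid \hat{\theta}_2)\, \hat{G}_1(\hat{t}^{(1)}, \hat{t}^{(2)}) \\
&\quad \times \Bigg[ \left\{ \tilde{A}_s(\hat{w}) + \frac{\hat{t}^{(1)}}{\hat{t}^{(1)} + \hat{t}^{(2)}}\, \tilde{A}_s'(\hat{w}) \right\} \left\{ \tilde{A}_s(\hat{w}) - \frac{\hat{t}^{(2)}}{\hat{t}^{(1)} + \hat{t}^{(2)}}\, \tilde{A}_s'(\hat{w}) \right\} \\
&\qquad\quad + \frac{\hat{t}^{(1)} \hat{t}^{(2)}}{(\hat{t}^{(1)} + \hat{t}^{(2)})^3}\, \tilde{A}_s''(\hat{w}) \Bigg], \qquad \hat{w} = \frac{\hat{t}^{(2)}}{\hat{t}^{(1)} + \hat{t}^{(2)}},
\end{aligned}
\]

of the density, $f$, of $X$, where $\hat{t}^{(j)} = T_j(x^{(j)} \mid \hat{\theta}_j)$.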


How to choose s

• \[ CV(s) = \int \tilde{f}_s(x)^2\, dx - 2\, n^{-1} \sum_{i=1}^{n} \tilde{f}_{-i,s}(X_i). \]

• $CV(s)$ is an almost-unbiased approximation to
$E \int (\tilde{f}_s^2 - 2 \tilde{f}_s f)$.

• The value of $s$ that results from minimising $CV(s)$
will asymptotically minimise $E \int (\tilde{f}_s - f)^2$.

• To construct prediction regions, define

\[ \tilde{R}(u) \equiv \{x : \tilde{f}_s(x) \ge u\}, \qquad \beta(u) = \int_{\tilde{R}(u)} \tilde{f}_s(x)\, dx. \]

• Given a prediction level α, let $u = \tilde{u}_\alpha$ denote the
solution of $\beta(u) = \alpha$. Then $\tilde{R}(\tilde{u}_\alpha)$ is a nominal
α-level prediction region for a future value of $X$.
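The level $\tilde{u}_\alpha$ can be found on a grid by adding up cell masses from the highest density downwards until the total reaches α. The sketch below uses a standard bivariate normal density as a stand-in for $\tilde{f}_s$, purely because the exact answer $(1 - \alpha)/(2\pi)$ is then known. The grid, step size and function names are assumptions for illustration.

```python
import math

def density_level_for_coverage(density_values, cell_area, alpha):
    """Approximate the u solving beta(u) = alpha from density values on grid cells."""
    mass = 0.0
    for f in sorted(density_values, reverse=True):
        mass += f * cell_area          # mass of the region {density >= f}
        if mass >= alpha:
            return f
    raise ValueError("grid does not carry probability mass alpha")

def standard_normal_2d(x, y):
    """Stand-in density; in the talk's setting this would be the fitted f-tilde_s."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

step = 0.05
centres = [-5.0 + step * (k + 0.5) for k in range(200)]   # cell midpoints on [-5, 5]
values = [standard_normal_2d(x, y) for x in centres for y in centres]
u_90 = density_level_for_coverage(values, step * step, 0.90)
print(u_90, 0.10 / (2.0 * math.pi))
```

The region $\tilde{R}(\tilde{u}_\alpha)$ is then the set of grid cells whose density is at least the returned level.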


Cross-validation criterion CV(s) and spline-smoothed
dependence function estimate $\tilde{A}_s$ for s = 0.05, with the
unsmoothed, modified Pickands estimate

[Figure: two panels. Left: CV plotted against s (0.0 to 0.5). Right: A(w) plotted against w, showing the smoothed spline and the convex hull of the modified Pickands estimate.]


Plot of spline-smoothed density estimate $\tilde{f}_s$

[Figure: surface plot of $\tilde{f}_s$ over the Leonora and Menzies axes.]


Compact bootstrap calibrated prediction regions with
nominal levels α = 0.85 and 0.90

[Figure: prediction regions in the (Leonora, Menzies) plane, with boundaries labelled 0.85 and 0.90.]


Bootstrap calibration

• Take $\bar{A} = \tilde{A}$ or $\tilde{A}_\lambda$ in

\[ \bar{F}(x) = \exp\left\{ -\left(\frac{1}{x^{(1)}} + \frac{1}{x^{(2)}}\right) \bar{A}\left(\frac{x^{(1)}}{x^{(1)} + x^{(2)}}\right) \right\}. \]

• Compute the chosen region $\bar{R}_\alpha$, with nominal
coverage α, from the data $\mathcal{X} = \{X_1, \ldots, X_n\}$.

• By resampling from $\bar{F}$ conditional on $\mathcal{X}$, compute
a new dataset $\mathcal{X}^* = \{X_1^*, \ldots, X_n^*\}$, and from it
calculate the analogue $\bar{F}^*$ of $\bar{F}$, and then the
analogue $\bar{R}_\alpha^*$ of $\bar{R}_\alpha$.

• Let γ(α) equal the probability, conditional on the
data $\mathcal{X}$, that a random 2-vector drawn from $\bar{F}$ lies
in $\bar{R}_\alpha^*$.

• Let $a = \hat{a}(\alpha)$ be the solution of γ(a) = α. Then
$\bar{R}_{\hat{a}(\alpha)}$ is the bootstrap-calibrated form of $\bar{R}_\alpha$.


Theoretical Properties

• $\hat{A}$ and its greatest convex minorant, $\tilde{A}$, are uniformly root-n consistent
for $A$:

\[ \sup_{0 \le u \le 1} |\hat{A}(u) - A(u)| + \sup_{0 \le u \le 1} |\tilde{A}(u) - A(u)| = O_p(n^{-1/2}). \]

• If the distribution $H$ of $Y^{(1)}/(Y^{(1)} + Y^{(2)})$ has a bounded density then,
for each $\epsilon \in (0, \tfrac{1}{2}]$,

\[ \sup_{\epsilon \le u \le 1 - \epsilon} |\hat{A}'(u) - A'(u)| + \sup_{\epsilon \le u \le 1 - \epsilon} |\tilde{A}'(u) - A'(u)| = O_p(n^{-1/2}). \]

• If $A$ has three bounded derivatives then the biases of $\hat{A}$, $\hat{A}'$, $\tilde{A}$, $\tilde{A}'$ are
$O(n^{-1})$.


Shape constrained smoothing using
smoothing splines
Given data $\{(t_i, y_i)\}$, $t_i \in [a, b]$ for $i = 1, \ldots, n$, what
is the behaviour of the solution $\hat{g}$ of the following
minimisation problem?

\[ \text{minimise} \quad \sum_{i=1}^{n} \{y_i - g(t_i)\}^2 + \lambda \int_a^b \{g^{(m)}(u)\}^2\, du, \tag{1a} \]

\[ \text{where} \quad g^{(r)}(t) \ge 0, \quad t \in [a, b]. \tag{1b} \]

References

[1] Mammen, E. and Thomas-Agnan, C. (1999),
Smoothing splines and shape restrictions,
Scandinavian Journal of Statistics, 26, 239–252.


Proposed Estimator for m = 2 and r ≤ 2

For m = 2, the piecewise polynomial representation of a natural cubic
$C^2$-spline $g$ is:

\[ g(t) = \sum_{i=0}^{n} I_{[t_i, t_{i+1})}(t)\, S_i(t), \tag{2a} \]

\[ \text{where} \quad S_i(t) = a_i + b_i(t - t_i) + c_i(t - t_i)^2 + d_i(t - t_i)^3, \quad 1 \le i \le n - 1, \tag{2b} \]

$S_0(t) = a_1 + b_1(t - t_1)$ and $S_n(t) = S_{n-1}(t_n) + S_{n-1}'(t_n)(t - t_n)$.

The coefficients in (2b) have to fulfill the following equations for $g$ to be a
natural cubic $C^2$-spline:

\[
\begin{aligned}
S_{i-1}(t_i) &= S_i(t_i) && \text{for } i = 1, \ldots, n \\
S_{i-1}'(t_i) &= S_i'(t_i) && \text{for } i = 1, \ldots, n \\
S_{i-1}''(t_i) &= S_i''(t_i) && \text{for } i = 1, \ldots, n
\end{aligned} \tag{3}
\]

A direct implementation would lead to an unnecessarily large quadratic
programming problem, and we propose to use the value-second derivative
representation (see Green and Silverman, 1994, chapter 2) for the actual
implementation.
For $i = 1, \ldots, n$, define $g_i = g(t_i)$ and $\gamma_i = g''(t_i)$. By definition, a natural
cubic $C^2$-spline has $\gamma_1 = \gamma_n = 0$. Let $\mathbf{g}$ denote the vector $(g_1, \ldots, g_n)^T$
and $\boldsymbol{\gamma} = (\gamma_2, \ldots, \gamma_{n-1})^T$. Note that, for notational simplicity later on, the
entries of $\boldsymbol{\gamma}$ are numbered in a non-standard way, starting at i = 2. The
vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ specify the natural cubic spline $g$ completely.


However, not all possible vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ represent natural cubic splines.
To derive sufficient (and necessary) conditions for $\mathbf{g}$ and $\boldsymbol{\gamma}$ to represent a
cubic spline we define the following matrices $Q$ and $R$. Define
$h_i = t_{i+1} - t_i$ for $i = 1, \ldots, n - 1$. Let $Q$ be the $n \times (n - 2)$ matrix with
entries $q_{i,j}$, for $i = 1, \ldots, n$ and $j = 2, \ldots, n - 1$, given by

\[ q_{j-1,j} = h_{j-1}^{-1}, \qquad q_{j,j} = -h_{j-1}^{-1} - h_j^{-1}, \qquad \text{and} \qquad q_{j+1,j} = h_j^{-1}, \]

for $j = 2, \ldots, n - 1$, and $q_{i,j} = 0$ for $|i - j| \ge 2$. Note that the columns
of $Q$ are numbered in the same non-standard way as the entries of $\boldsymbol{\gamma}$.
The $(n - 2) \times (n - 2)$ matrix $R$ is symmetric with elements $\{r_{i,j}\}_{i,j=2}^{n-1}$
given by

\[ r_{i,i} = \tfrac{1}{3}(h_{i-1} + h_i) \quad \text{for } i = 2, \ldots, n - 1, \]
\[ r_{i,i+1} = r_{i+1,i} = \tfrac{1}{6} h_i \quad \text{for } i = 2, \ldots, n - 2, \]

and $r_{i,j} = 0$ for $|i - j| \ge 2$. Note that $R$ is strictly diagonally dominant
and, hence, it follows from standard arguments in numerical linear algebra
that $R$ is strictly positive-definite.
We are now able to state the following key result.
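Proposition. The vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ specify a natural cubic spline $g$ if and
only if the condition

\[ Q^T \mathbf{g} = R \boldsymbol{\gamma} \tag{4} \]

is satisfied. If (4) is satisfied then we have

\[ \int_a^b \{g''(t)\}^2\, dt = \boldsymbol{\gamma}^T R \boldsymbol{\gamma}. \tag{5} \]

For a proof see Green and Silverman (1994, section 2.5).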
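The band matrices $Q$ and $R$ are simple to assemble from the knots. The sketch below builds them as nested lists with 0-based indices (column c corresponds to j = c + 2 in the slide's numbering) and checks two consequences of (4): for values on a straight line $Q^T \mathbf{g} = 0$, so $\boldsymbol{\gamma} = 0$, and for $g_i = t_i^2$ the entries of $Q^T \mathbf{g}$ equal $h_{j-1} + h_j$. The knots and function names are illustrative assumptions.

```python
def build_q_r(t):
    """Return (Q, R) for knots t[0] < ... < t[n-1]; Q is n x (n-2), R is (n-2) x (n-2)."""
    n = len(t)
    h = [t[i + 1] - t[i] for i in range(n - 1)]   # h[k] is h_{k+1} in 1-based notation
    Q = [[0.0] * (n - 2) for _ in range(n)]
    R = [[0.0] * (n - 2) for _ in range(n - 2)]
    for c in range(n - 2):                        # column c <-> j = c + 2
        Q[c][c] = 1.0 / h[c]                      # q_{j-1,j} = 1 / h_{j-1}
        Q[c + 1][c] = -1.0 / h[c] - 1.0 / h[c + 1]
        Q[c + 2][c] = 1.0 / h[c + 1]              # q_{j+1,j} = 1 / h_j
        R[c][c] = (h[c] + h[c + 1]) / 3.0
        if c + 1 < n - 2:
            R[c][c + 1] = R[c + 1][c] = h[c + 1] / 6.0
    return Q, R

def transpose_times(Q, g):
    """Compute Q^T g."""
    return [sum(Q[i][c] * g[i] for i in range(len(g))) for c in range(len(Q[0]))]

knots = [0.0, 0.5, 1.5, 2.0, 3.0]
Q, R = build_q_r(knots)
line = [2.0 + 3.0 * t for t in knots]
parabola = [t * t for t in knots]
print(transpose_times(Q, line))       # zero up to rounding: a line has gamma = 0
print(transpose_times(Q, parabola))   # entries h_{j-1} + h_j
```

Each entry of $Q^T \mathbf{g}$ is a difference of adjacent slopes of the data, which is why it vanishes for a straight line.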


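This result allows us to state problem (1a) as a quadratic programming
problem. Let $\mathbf{y}$ denote the $(2n - 2)$-vector $(y_1, \ldots, y_n, 0, \ldots, 0)^T$, $\mathbf{g}$ the
$(2n - 2)$-vector $(\mathbf{g}^T, \boldsymbol{\gamma}^T)^T$, $A$ the $(2n - 2) \times (n - 2)$-matrix $\begin{pmatrix} Q \\ -R^T \end{pmatrix}$,
$I_n$ the $n \times n$ unit matrix and

\[ D = \begin{pmatrix} I_n & 0 \\ 0 & \lambda R \end{pmatrix}. \tag{6} \]

Then the solution of (1a) is given by the solution of the following
quadratic program:

\[ \text{minimise} \quad -\mathbf{y}^T \mathbf{g} + \tfrac{1}{2}\, \mathbf{g}^T D\, \mathbf{g}, \tag{7a} \]

\[ \text{where} \quad A^T \mathbf{g} = 0. \tag{7b} \]

We propose to use the algorithm of Goldfarb and Idnani (1982, 1983) to
solve (7).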

