Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applied Mathematics Opening Workshop, Introduction to Global Sensitivity - Clémentine Prieur, Aug 28, 2017
This document provides an introduction to global sensitivity analysis. It discusses how sensitivity analysis can quantify the sensitivity of a model output to variations in its input parameters. It introduces Sobol' sensitivity indices, which measure the contribution of each input parameter to the variance of the model output. The document outlines how Sobol' indices are defined based on decomposing the model output variance into terms related to individual input parameters and their interactions. It notes that Sobol' indices are generally estimated using Monte Carlo-type sampling approaches due to the high-dimensional integrals involved in their exact calculation.
1. Introduction to Global Sensitivity Analysis
Clémentine PRIEUR
Grenoble Alpes University
Opening Workshop: August 28 – September 1, 2017
Program on Quasi-Monte Carlo and High-Dimensional Sampling
Methods for Applied Mathematics
1/ 61
2. Introduction
inputs −→ model −→ output
(X1, . . . , Xd) −→ M −→ Y = M(X1, . . . , Xd)
One wishes to quantify the sensitivity of the output Y to the
inputs X1, . . . , Xd .
The model M is typically complex and expensive to evaluate.
Each input factor can be a scalar, a vector, or even a function.
2/ 61
3. Introduction
Application to a biogeochemical model:
ecosystem model (MODECOGeL) of the Ligurian Sea
Joint work with IGE Lab (Grenoble, FRANCE)
3/ 61
4. Introduction
MODECOGeL is a one-dimensional coupled hydrodynamical-biological model.
• hydrodynamic model: 1-D vertical simplification of primitive
equations for the ocean, 5 state variables;
• ecosystem model: marine biogeochemistry, 12 biological state
variables.
4/ 61
5. Introduction
Inputs/Outputs:
87 scalar input parameters;
spatio-temporal outputs.
Main issue: calibration of the model.
Sensitivity Analysis is a preliminary step to this calibration task.
5/ 61
6. Introduction
Agro-climatic model for the water status management of vineyards
Joint work with INRA and iTK (Montpellier, FRANCE)
Project objective: control of grape/wine quality. SA as decision
support.
6/ 61
7. Introduction
Application to an integrated
land use and transport model, TRANUS
Joint work with Inria Project/Team STEEP (Grenoble, FRANCE),
ANR CITIES (2013-2016)
SA at two stages: first for calibration, then for testing the sensitivity to various ecological scenarios.
7/ 61
8. Introduction
Application to the boundary control of an open channel
Joint work with GIPSA-lab (Grenoble, FRANCE)
and University Paris XI (Orsay, FRANCE)
SA is performed in an intrusive way (Shallow Water) with model
order reduction.
8/ 61
9. Outline
I- Sensitivity Analysis, the framework.
II- Estimation procedure.
III- Recursive estimation procedure.
IV- A few open questions.
9/ 61
12. I- Sensitivity Analysis, the framework
Context:
M : R^d → R
x ↦ y = M(x1, . . . , xd)
Aim: to determine the way the output of the model reacts to
variations of the input parameters.
Several analyses are possible:
qualitative analyses: are there nonlinear effects? interactions? → screening approaches.
quantitative analyses: factors' hierarchization, statistical hypothesis testing, e.g., H0: "the ith factor has no influence on the output" → sensitivity measures.
10/ 61
13. I- Sensitivity Analysis, the framework
Several ways to handle quantitative SA:
Local approaches:
M(x) ≈ M(x0) + Σ_{i=1}^d ∂M/∂xi(x0) (xi − x0_i) (Taylor's approximation).
First-order sensitivity measure for the ith input: ∂M/∂xi(x0).
Pros: if an adjoint is available, the cost is independent of d.
Cons: local approach, not appropriate for highly nonlinear models.
11/ 61
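When no adjoint is available, the local first-order measures ∂M/∂xi(x0) can be sketched by finite differences, at the cost of 2d model evaluations. A minimal sketch, assuming a hypothetical toy model M (not one of the applications above):

```python
import numpy as np

def local_first_order(model, x0, h=1e-6):
    """Central-difference approximation of dM/dx_i at x0 for all i."""
    x0 = np.asarray(x0, dtype=float)
    grad = np.empty_like(x0)
    for i in range(x0.size):
        e = np.zeros_like(x0)
        e[i] = h
        # central difference: 2 model calls per input, 2d calls in total
        grad[i] = (model(x0 + e) - model(x0 - e)) / (2 * h)
    return grad

# hypothetical toy model: M(x) = x1 + 2*x2 + x1*x2
M = lambda x: x[0] + 2 * x[1] + x[0] * x[1]
print(local_first_order(M, [1.0, 1.0]))  # close to [2., 3.]
```

Note that this gradient depends on the chosen point x0, which is precisely the "local" limitation discussed on the slide.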
14. I- Sensitivity Analysis, the framework
Global approaches:
The uncertainty on the inputs is modeled by a probability
distribution, from experts’ knowledge, or from observations, . . .
e.g., if the inputs are independent, this probability distribution is
characterized by its marginals.
Figure: unimodal distribution (left), bimodal distribution (right)
Figure: Local versus Global (G := M), illustration.
12/ 61
17. I- Sensitivity Analysis, the framework
"Globalized" local approaches: e.g., (1) EX[∂M/∂xi(X)], or (2) EX[(∂M/∂xi(X))²].
Pros: it takes the inputs' distribution into account; the cost is independent of the dimension when an adjoint is available.
Cons:
(1) & (2) are not discriminating enough;
(2) is known as a Derivative-based Global Sensitivity Measure, see Sobol' & Gresham (1995), Sobol' & Kucherenko (2009). This index is more appropriate for screening than for hierarchization (see Lamboni et al., 2013).
13/ 61
18. I- Sensitivity Analysis, the framework
In this talk we focus on global approaches, mainly Sobol’ indices,
for a hierarchization of input parameters.
In the following, inputs are assumed to be independent.
14/ 61
24. I- Sensitivity Analysis, the framework
Towards Sobol' sensitivity indices:
Does the output Y vary more or less when fixing one of its inputs?
Var(Y | Xi = xi): how to choose xi? ⇒ E[Var(Y | Xi)]
The smaller this quantity, the more fixing Xi reduces the variance of Y: the input Xi is influential.
Theorem (total variance)
Var(Y) = Var[E(Y | Xi)] + E[Var(Y | Xi)].
Definition (First-order Sobol' indices)
For i = 1, . . . , d: 0 ≤ Si = Var[E(Y | Xi)] / Var(Y) ≤ 1.
Ex.: if Y = Σ_{i=1}^d βi Xi, with the Xi independent, one gets Si = βi² Var(Xi) / Var(Y) = ρi², with ρi the linear correlation coefficient between Xi and Y.
15/ 61
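The linear example can be checked numerically: for Y = Σ βi Xi with independent inputs, Si equals both βi² Var(Xi)/Var(Y) and the squared correlation ρi². A small sketch; the coefficients βi and the input distributions below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 200_000
beta = np.array([1.0, 2.0, 0.5])
var_x = np.array([1.0, 1.0, 1.0])

X = rng.normal(0.0, np.sqrt(var_x), size=(n, d))
Y = X @ beta                                   # linear model, no interactions

# closed form: S_i = beta_i^2 * Var(X_i) / Var(Y)
S_closed = beta**2 * var_x / (beta**2 @ var_x)

# S_i also equals rho_i^2, the squared correlation between X_i and Y
S_corr = np.array([np.corrcoef(X[:, i], Y)[0, 1] ** 2 for i in range(d)])
print(np.round(S_closed, 3), np.round(S_corr, 3))
```

For this purely additive model the first-order indices sum to one; any shortfall from one signals interactions.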
25. I- Sensitivity Analysis, the framework
More generally,
Theorem (Hoeffding–Sobol' decomposition [Hoeffding, 1948, Sobol', 1993])
Let M : [0, 1]^d → R with ∫_{[0,1]^d} M²(x) dx < ∞. Then M admits a unique decomposition of the form
M(x) = M0 + Σ_{i=1}^d Mi(xi) + Σ_{1≤i<j≤d} Mi,j(xi, xj) + . . . + M1,...,d(x1, . . . , xd)
under the constraints: M0 constant, and
∀ 1 ≤ s ≤ d, ∀ 1 ≤ i1 < . . . < is ≤ d, ∀ 1 ≤ p ≤ s:
∫_0^1 Mi1,...,is(xi1, . . . , xis) dxip = 0.
16/ 61
29. I- Sensitivity Analysis, the framework
Consequences: M0 = ∫_{[0,1]^d} M(x) dx and the terms of the decomposition are orthogonal.
The computation of the terms of the decomposition reduces to:
Mi(xi) = ∫_{[0,1]^{d−1}} M(x) Π_{p≠i} dxp − M0
for i ≠ j: Mi,j(xi, xj) = ∫_{[0,1]^{d−2}} M(x) Π_{p≠i,j} dxp − M0 − Mi(xi) − Mj(xj)
. . .
Computation of multiple integrals.
17/ 61
31. I- Sensitivity Analysis, the framework
Variance decomposition: X1, . . . , Xd i.i.d.
Y = M(X) = M0 + Σ_{i=1}^d Mi(Xi) + . . . + M1,...,d(X1, . . . , Xd)
M0 = E(Y),
Mi(Xi) = E(Y | Xi) − E(Y),
for i ≠ j: Mi,j(Xi, Xj) = E(Y | Xi, Xj) − E(Y | Xi) − E(Y | Xj) + E(Y),
. . .
Var(Y) = Σ_{i=1}^d Var(Mi(Xi)) + . . . + Var(M1,...,d(X1, . . . , Xd))
18/ 61
33. I- Sensitivity Analysis, the framework
Definition (Sobol' indices)
∀ i = 1, . . . , d: Si = Var(Mi(Xi)) / Var(Y) = Var[E(Y | Xi)] / Var(Y)
∀ i ≠ j: Si,j = Var(Mi,j(Xi, Xj)) / Var(Y) = (Var[E(Y | Xi, Xj)] − Var[E(Y | Xi)] − Var[E(Y | Xj)]) / Var(Y)
. . .
1 = Σ_{i=1}^d Si + Σ_{1≤i<j≤d} Si,j + . . . + S1,...,d
Definition (Total indices)
For i = 1, . . . , d: STi = Σ_{u⊆{1,...,d}, u≠∅, i∈u} Su.
19/ 61
34. I- Sensitivity Analysis, the framework
Sobol' indices:
Definition (Total indices)
For i = 1, . . . , d: STi = Σ_{u⊆{1,...,d}, u≠∅, i∈u} Su.
Let X(−i) = (X1, . . . , Xi−1, Xi+1, . . . , Xd). We then get
STi = E[Var(Y | X(−i))] / Var(Y) = 1 − Var[E(Y | X(−i))] / Var(Y).
20/ 61
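The identity STi = E[Var(Y | X(−i))] / Var(Y) suggests a standard Monte Carlo route, not detailed on the slide: a Jansen-type estimator using E[Var(Y | X(−i))] = ½ E[(Y − Y′)²], where Y′ re-samples only the ith input. A sketch, with a hypothetical test model containing an interaction:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100_000, 3

def M(X):
    # hypothetical test model with an interaction: Y = X1 + X2 * X3
    return X[:, 0] + X[:, 1] * X[:, 2]

A = rng.uniform(-1, 1, size=(n, d))
B = rng.uniform(-1, 1, size=(n, d))   # independent second sample
YA = M(A)
varY = YA.var()

ST = np.empty(d)
for i in range(d):
    AB = A.copy()
    AB[:, i] = B[:, i]                # re-sample only X_i
    # Jansen-type identity: E[Var(Y | X_(-i))] = 0.5 * E[(M(A) - M(AB))^2]
    ST[i] = 0.5 * np.mean((YA - M(AB)) ** 2) / varY
print(np.round(ST, 3))  # theory: ST = (0.75, 0.25, 0.25)
```

For this model ST2 = ST3 = 0.25 while S2 = S3 = 0: the total index catches the X2·X3 interaction that the first-order indices miss.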
35. I- Sensitivity Analysis, the framework
Indices by input:
[Figure: Venn diagrams over factors A, B, C showing main effects (A, B, C), two-factor interactions (A.B, A.C, B.C) and the three-factor interaction (A.B.C); the right-hand diagram highlights the total effect of A.]
21/ 61
39. I- Sensitivity Analysis, the framework
Remark : The analytical expression of Sobol’ indices involves high
dimensional multiple integrals, and is rarely available.
Mainly two types of approaches for the estimation:
1) Monte-Carlo type approaches (L2 assumption on the model)
(see further);
2) spectral approaches (supplementary regularity assumptions on
the model).
If each evaluation of the model is time- or memory-consuming, we may first fit a metamodel before applying estimation procedures. The error due to this step then has to be studied.
Examples: parametric and non-parametric regression, kriging, reduced-basis approaches . . .
23/ 61
40. I- Sensitivity Analysis, the framework
For dependent inputs, several alternatives are proposed in the literature:
1) handle independent groups of dependent inputs,
2) reparameterize the problem (link with active subsets),
3) relax the constraints on the functional decomposition
(hierarchical Hoeffding decomposition).
The sensitivity indices defined from alternative 3) are difficult to
interpret.
The Shapley (1953) value can be used to quantify the contribution of members to a team. It is possible to define Shapley effects, whose interpretation seems easier for dependent inputs.
24/ 61
41. II- Estimation procedure
Let X^1 and X^2 be two independent copies of X.
For i = 1, . . . , d, we define:
X^2{i} = (X^2_1, . . . , X^2_{i−1}, X^1_i, X^2_{i+1}, . . . , X^2_d)
Let Y = M(X^1) and, for i = 1, . . . , d, Y{i} = M(X^2{i}).
If the random vector X has independent components, then we deduce:
Si = Cov(Y, Y{i}) / Var(Y).
25/ 61
42. II- Estimation procedure
For any i ∈ {1, . . . , d}, let X^{j,1}_i and X^{j,2}_i, j = 1, . . . , n, be two independent samples of size n of the parameter Xi.
We define:
X^{j,1} = (X^{j,1}_1, . . . , X^{j,1}_d), j = 1, . . . , n
X^{j,2}{i} = (X^{j,2}_1, . . . , X^{j,2}_{i−1}, X^{j,1}_i, X^{j,2}_{i+1}, . . . , X^{j,2}_d), j = 1, . . . , n, i = 1, . . . , d
We evaluate the model (1 + d)n times:
Y^{j,1} = M(X^{j,1}), j = 1, . . . , n
Y^{j,2}{i} = M(X^{j,2}{i}), j = 1, . . . , n, i = 1, . . . , d.
26/ 61
43. II- Estimation procedure
Monte Carlo estimator: [Monod et al., 2006, Janon et al., 2014]
ˆS{i},n = [ (1/n) Σ_{j=1}^n Y^{j,1} Y^{j,2}{i} − ( (1/n) Σ_{j=1}^n (Y^{j,1} + Y^{j,2}{i}) / 2 )² ]
/ [ (1/n) Σ_{j=1}^n ( (Y^{j,1})² + (Y^{j,2}{i})² ) / 2 − ( (1/n) Σ_{j=1}^n (Y^{j,1} + Y^{j,2}{i}) / 2 )² ]
Total indices and indices for higher order interactions can also be
estimated.
Main issue: the cost is prohibitive.
• (1 + d)n model evaluations for all first-order Sobol’ indices;
• (d(d − 1)/2 + 1)n model evaluations for all second-order Sobol' indices.
The best with that approach: (2d + 2)n model evaluations for
double estimates for all first-order, second-order and total Sobol’
indices (see [Saltelli, 2002], with combinatorial tricks).
27/ 61
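The pick-and-freeze estimator above translates directly into code. A minimal sketch, assuming a hypothetical linear test model (made up for illustration) whose exact first-order indices are known:

```python
import numpy as np

def sobol_pick_freeze(model, X1, X2):
    """First-order pick-and-freeze estimates from two independent input
    samples X1, X2 of shape (n, d); (1 + d)n model evaluations in total.
    Implements the estimator of the slide (Monod et al. / Janon et al.)."""
    n, d = X1.shape
    Y1 = model(X1)
    S = np.empty(d)
    for i in range(d):
        Xi = X2.copy()
        Xi[:, i] = X1[:, i]               # freeze the i-th coordinate
        Y2 = model(Xi)
        m = np.mean((Y1 + Y2) / 2)        # shared empirical mean
        num = np.mean(Y1 * Y2) - m ** 2
        den = np.mean((Y1 ** 2 + Y2 ** 2) / 2) - m ** 2
        S[i] = num / den
    return S

# hypothetical test case: Y = X1 + 2*X2, Xi ~ U(0, 1) independent
rng = np.random.default_rng(0)
M = lambda X: X[:, 0] + 2.0 * X[:, 1]
S = sobol_pick_freeze(M, rng.random((50_000, 2)), rng.random((50_000, 2)))
print(np.round(S, 2))  # theory: S1 = 1/5 = 0.2, S2 = 4/5 = 0.8
```

Note the shared empirical mean m over both samples, as in the displayed formula; this choice is what makes the estimator asymptotically efficient in [Janon et al., 2014].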
44. II- Estimation procedure
Example with d = 2 and n = 4:
[Figure: on the left-hand side X^{j,1} and X^{j,2}{1}, j = 1, 2, 3, 4; on the right-hand side X^{j,1} and X^{j,2}{2}, j = 1, 2, 3, 4.]
28/ 61
45. II- Estimation procedure
Which design of experiments could overcome this issue?
Let D be a design of experiments (DoE) of size n defined by
D = { x^j = (x^j_1, . . . , x^j_d), 1 ≤ j ≤ n }.
The DoE D′ is replicated from D if there exist d independent random permutations of {1, . . . , n}, denoted by π1, . . . , πd, such that
D′ = { x′^j = (x^{π1(j)}_1, . . . , x^{πd(j)}_d), 1 ≤ j ≤ n }.
29/ 61
46. II- Estimation procedure
Then D ∪ D′ can be used for estimating any first-order Sobol' index using the pick & freeze approach (see figure below).
[Figure: two scatter plots of D (initial DoE) and D′ (replicated DoE) on [0, 1]². On the left-hand side D is an independent sample; on the right-hand side D is a LHS (thus D′ too).]
30/ 61
48. II- Estimation procedure
Replicated Latin Hypercube sampling [McKay, 1995]
For j = 1, . . . , n:
x^j = ( (j − U_{1,j})/n, . . . , (j − U_{d,j})/n )
x′^j = ( (π1(j) − U_{1,π1(j)})/n, . . . , (πd(j) − U_{d,πd(j)})/n )
For the estimation of first-order Sobol' indices we take
X^{j,1} = x^j, j = 1, . . . , n
X^{j,2}{i} = x′^{πi^{−1}(j)}, j = 1, . . . , n.
The set {X^{j,2}{i}, j = 1, . . . , n}, as a non-ordered set of points, does not depend on i, and that's the trick [Mara and Rakoto, 2008, Tissot and Prieur, 2015].
31/ 61
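The construction of the replicated pair (D, D′) can be sketched as follows; this is a literal transcription of the two displayed formulas, not an optimized implementation:

```python
import numpy as np

def replicated_lhs(n, d, rng):
    """Replicated LHS pair (McKay, 1995) as on the slide:
    x^j  = ((j - U_{1,j})/n, ..., (j - U_{d,j})/n)
    x'^j = ((pi_1(j) - U_{1,pi_1(j)})/n, ..., (pi_d(j) - U_{d,pi_d(j)})/n)
    Returns D (n, d), D' (n, d) and the 0-based permutations pi_i."""
    U = rng.random((d, n))                      # U_{i,j}, i = 1..d, j = 1..n
    j = np.arange(1, n + 1)
    D = ((j - U) / n).T                         # row j-1 is x^j
    perms = np.array([rng.permutation(n) for _ in range(d)])
    Dp = np.empty((n, d))
    for i in range(d):
        pj = perms[i]                           # pi_i(j), 0-based
        Dp[:, i] = (pj + 1 - U[i, pj]) / n
    return D, Dp, perms

rng = np.random.default_rng(0)
D, Dp, perms = replicated_lhs(8, 2, rng)
# each column of D' is a reshuffling of the corresponding column of D
assert np.allclose(np.sort(Dp[:, 0]), np.sort(D[:, 0]))
```

The pick-freeze evaluations X^{j,2}{i} are then read off D′ by the inverse permutations, so all first-order indices come from the same 2n model runs.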
49. II- Estimation procedure
Extension of the approach for the estimation of second-order
indices [Tissot and Prieur, 2015]
Definition (t–(q, d, λ) Orthogonal Array)
An OA in dimension d, with q levels, strength t ≤ d and index λ is a matrix with n = λq^t rows and d columns such that, in every n-by-t submatrix, each of the q^t possible rows occurs exactly the same number λ of times.
Let (A^j_i), j = 1..n, i = 1..d, be an OA in dimension d, with n points and q levels denoted by {1, . . . , q}.
Let Πq be the set of all permutations of {1, . . . , q}.
32/ 61
50. II- Estimation procedure
Definition (Randomized OA, [Owen, 1992])
(x^j), j = 1..n, is a randomized Orthogonal Array if
x^j = ( (π1(A^j_1) − U_{1,π1(A^j_1)})/q, . . . , (πd(A^j_d) − U_{d,πd(A^j_d)})/q )
with (A^j), j = 1..n, an OA, and where the πi's and the Ui,j's are independent random variables uniformly distributed on Πq and [0, 1], respectively.
The structure of a t–(q, d, λ) randomized OA is invariant by replication.
With a similar trick as for first-order indices, we can estimate all second-order indices with D(n) = {x^j, 1 ≤ j ≤ n} ∪ {x′^j, 1 ≤ j ≤ n}, where (x^j) and (x′^j), j = 1..n, are two replicated 2–(q, d, λ) randomized OAs.
33/ 61
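A concrete strength-2 OA and its Owen randomization can be sketched as follows. The OA is built with the classical Bose construction (an assumption here: it is one standard way to obtain a 2–(q, d, 1) OA for prime q, not necessarily the construction used in the talk), and levels are coded {0, . . . , q−1} instead of the slide's {1, . . . , q}:

```python
import numpy as np

def bose_oa(q, d):
    """2-(q, d, 1) orthogonal array via the Bose construction (q prime,
    d <= q + 1): rows indexed by (a, b) in Z_q^2, column k holds
    (a*k + b) mod q, plus one extra column holding a when d = q + 1."""
    assert d <= q + 1, "Bose construction needs d <= q + 1"
    a, b = np.divmod(np.arange(q * q), q)          # all pairs (a, b)
    cols = [(a * k + b) % q for k in range(min(d, q))]
    if d == q + 1:
        cols.append(a)
    return np.stack(cols, axis=1)                  # shape (q^2, d)

def randomize_oa(A, q, rng):
    """Owen (1992) randomized OA: x^j_i = (pi_i(A^j_i) - U_{i,j}) / q,
    with independent level permutations pi_i and uniforms U_{i,j}."""
    n, d = A.shape
    X = np.empty((n, d))
    for i in range(d):
        pi = rng.permutation(q) + 1                # pi_i: levels -> {1, ..., q}
        X[:, i] = (pi[A[:, i]] - rng.random(n)) / q
    return X

A = bose_oa(5, 4)                                  # 2-(5, 4, 1) OA, 25 rows
# strength 2, index 1: any two columns contain each of the 25 level pairs once
assert len(set(map(tuple, A[:, [0, 2]]))) == 25
X = randomize_oa(A, 5, np.random.default_rng(0))   # points in (0, 1]^4
```

Drawing fresh permutations πi yields a replicated randomized OA with the same underlying A, which is the invariance-by-replication property used above.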
51. II- Estimation procedure
The best we can do?
• Using a 2–(q, d, 1) OA: 2n model evaluations for all second-order Sobol' indices and q × q! estimates of each first-order Sobol' index [Gilquin et al., 2017a].
• Using a 2–(q, d, 1) OA and combinatorial tricks: (d + 2)n model evaluations for all total indices, all second-order Sobol' indices and q × q! estimates of each first-order Sobol' index [Saltelli, 2002, Gilquin et al., 2017a].
34/ 61
52. II-Estimation procedure
Wing weight function [Forrester et al., 2008]:
f(x) = 0.036 x1^0.758 x2^0.0035 (x3 / cos²(x4))^0.6 x5^0.006 x6^0.04 (100 x7 / cos(x4))^−0.3 (x8 x9)^0.49 + x1 x10.
Parameters of the wing weight function. All inputs are independent and uniformly distributed in their respective ranges. n = 71² = 5 041.
parameter | range | description
x1 | [150, 200] | wing area (ft²)
x2 | [220, 300] | weight of fuel in the wing (lb)
x3 | [6, 10] | aspect ratio
x4 | [−10, 10] | quarter-chord sweep (degrees)
x5 | [16, 45] | dynamic pressure at cruise (lb/ft²)
x6 | [0.5, 1] | taper ratio
x7 | [0.08, 0.18] | aerofoil thickness to chord ratio
x8 | [2.5, 6] | ultimate load factor
x9 | [1700, 2500] | flight design gross weight (lb)
x10 | [0.025, 0.08] | paint weight (lb/ft²)
35/ 61
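The wing weight function and its input box are easy to reproduce; a sketch using plain independent uniform sampling (the talk's OA-based design of size 71² is replaced here by i.i.d. points for simplicity):

```python
import numpy as np

def wing_weight(x):
    """Wing weight function (Forrester et al., 2008); x has shape (n, 10),
    columns ordered as in the table, x4 (quarter-chord sweep) in degrees."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10 = x.T
    L = np.deg2rad(x4)                          # sweep angle in radians
    return (0.036 * x1**0.758 * x2**0.0035
            * (x3 / np.cos(L) ** 2) ** 0.6
            * x5**0.006 * x6**0.04
            * (100 * x7 / np.cos(L)) ** -0.3
            * (x8 * x9) ** 0.49
            + x1 * x10)

lo = np.array([150, 220, 6, -10, 16, 0.5, 0.08, 2.5, 1700, 0.025])
hi = np.array([200, 300, 10, 10, 45, 1.0, 0.18, 6.0, 2500, 0.080])
rng = np.random.default_rng(0)
X = lo + (hi - lo) * rng.random((5041, 10))     # n = 71^2 points
Y = wing_weight(X)
print(Y.mean())                                  # typical weights: a few hundred lb
```

Feeding these samples into a pick-freeze estimator reproduces the kind of index table shown on the next slide, with x8 (ultimate load factor) dominating.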
53. II-Estimation procedure
First-order Sobol’ index estimates and 95 % bootstrap CI.
input Su bootstrap CI
x1 0.1252 [0.0982, 0.1521]
x2 0.0039 [−0.0238, 0.0315]
x3 0.2187 [0.1930, 0.2445]
x4 0.0037 [−0.0235, 0.0313]
x5 −0.0002 [−0.028, 0.0273]
x6 −0.0014 [−0.028, 0.0263]
x7 0.1418 [0.1151, 0.1687]
x8 0.4124 [0.3907, 0.4337]
x9 0.0835 [0.0562, 0.1108]
x10 0.0046 [−0.0225, 0.0324]
Second-order Sobol’ index estimates with 95 % bootstrap CI. The
horizontal dotted line marks the threshold (0.05).
[Figure: second-order estimates with intervals for the pairs S1,2, S1,6, S1,10, S2,6, S2,10, S3,7, S4,5, S4,9, S5,8, S6,8, S7,9, S9,10; y-axis range −0.08 to 0.04.]
36/ 61
54. III- Recursive estimation procedure
Recursive estimation is of interest for several purposes:
• to avoid storage issues: with the classical approach, the indices are estimated post mortem, which requires storing all simulation results;
• to be able to add new points to initial designs to increase the accuracy if one gets a new budget.
We need a recursive formula for Sobol’ estimates. Let us give that
formula for first-order Sobol’ indices, in the case where one point is
added at each step. This formula can be generalized.
37/ 61
55. III- Recursive estimation procedure
Recall that:
ˆS{i},n = [ (1/n) Σ_{j=1}^n Y^{j,1} Y^{j,2}{i} − ( (1/n) Σ_{j=1}^n (Y^{j,1} + Y^{j,2}{i}) / 2 )² ]
/ [ (1/n) Σ_{j=1}^n ( (Y^{j,1})² + (Y^{j,2}{i})² ) / 2 − ( (1/n) Σ_{j=1}^n (Y^{j,1} + Y^{j,2}{i}) / 2 )² ]
We then write:
ˆS{i},n = (φn − ψn²) / (ξn − ψn²),
with
n φn = (n − 1) φ_{n−1} + Y^{n,1} Y^{n,2}{i},
n ψn = (n − 1) ψ_{n−1} + (Y^{n,1} + Y^{n,2}{i}) / 2,
n ξn = (n − 1) ξ_{n−1} + ( (Y^{n,1})² + (Y^{n,2}{i})² ) / 2.
38/ 61
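The three running quantities φn, ψn, ξn make the estimator streamable: only three scalars per input are stored, not the n simulation results. A minimal sketch of the recursion, checked against the batch formula on a made-up stream of pick-freeze pairs:

```python
import numpy as np

class RecursiveSobol:
    """Streaming first-order Sobol' estimate for one input, following the
    slide's recursion: S_n = (phi_n - psi_n^2) / (xi_n - psi_n^2)."""
    def __init__(self):
        self.n = 0
        self.phi = self.psi = self.xi = 0.0

    def update(self, y1, y2):
        """Fold in one new pick-freeze pair (Y^{n,1}, Y^{n,2}_{i})."""
        self.n += 1
        w = (self.n - 1) / self.n
        self.phi = w * self.phi + y1 * y2 / self.n
        self.psi = w * self.psi + (y1 + y2) / (2 * self.n)
        self.xi = w * self.xi + (y1**2 + y2**2) / (2 * self.n)

    def estimate(self):
        return (self.phi - self.psi**2) / (self.xi - self.psi**2)

# check against the batch formula on a hypothetical stream of pairs
rng = np.random.default_rng(0)
Y1, Y2 = rng.random(1000), rng.random(1000)
Y2 = 0.5 * Y1 + 0.5 * Y2                  # correlate the pairs
est = RecursiveSobol()
for a, b in zip(Y1, Y2):
    est.update(a, b)
m = np.mean((Y1 + Y2) / 2)
batch = (np.mean(Y1 * Y2) - m**2) / (np.mean((Y1**2 + Y2**2) / 2) - m**2)
assert abs(est.estimate() - batch) < 1e-10
```

The same pattern extends to adding a whole block of points at each step, as needed for the nested designs below.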
56. III- Recursive estimation procedure
Nested designs:
At step l of the recursion, block structure:
Dl = B1 ∪ . . . ∪ Bl, D′l = B′1 ∪ . . . ∪ B′l
• Bl, B′l: new point sets added at step l
• Dl = Dl−1 ∪ Bl, D′l = D′l−1 ∪ B′l
• Requirement: Bl and B′l are two replicated designs
Choice of the DoE:
• first-order: nested Latin Hypercube samples [Qian, 2009]
• second-order: nested Orthogonal Arrays of strength two [Gilquin et al., 2016]
39/ 61
57. III- Recursive estimation procedure
Nested Latin Hypercube: definition and construction [Qian, 2009]
Parameters:
• vector s = (s1, . . . , su) ∈ (Z>0)^u → u blocks Bl, l = 1, . . . , u
• size of Dl: ml = Π_{k=1}^l sk
• size of the nested Latin Hypercube: n = Π_{k=1}^u sk
Property: at each step l, Dl possesses a Latin Hypercube structure when projected onto a sub-grid.
The smallest number of points per layer is obtained with (m1, m2, m3, m4, m5, . . .) = (1, 1, 2, 2², 2³, . . .).
40/ 61
61. III- Recursive estimation procedure
Nested Orthogonal Array: 2–(q, d, λ) OA of size λq² [Gilquin et al., 2016]
Parameters: strength t, number of levels q, number of columns d, index λ.
When λ > 1, a 2–(q, d, λ) OA is composed of λ blocks, where each block is a 2–(q, d, 1) OA.
Idea: construct a collection of 2–(q, d, 1) OAs and select the blocks Bl from the collection.
In the following we present two approaches.
44/ 61
63. III- Recursive estimation procedure
First approach: Algebraic construction
Results from coding theory [Stinson and Massey, 1995]: there exists a collection of 2–(q, d, 1) OAs forming a partition of (Nq)^d, with |(Nq)^d| = q^d.
Size of the collection: q^{d−t}.
Construction: at step l, the block Bl is picked randomly from the collection.
Second approach: accept-reject construction
Construction: at step l,
• construct Bl as a randomized Orthogonal Array [Owen, 1992];
• check that no row of Bl coincides with a row of Dl−1.
These designs can be compared by looking at criteria testing regularity (mindist) and uniformity (discrepancy) properties.
46/ 61
68. IV- A few open questions
• It might be more efficient to use Sobol’ or other digital
sequences as DoE (e.g., under further assumptions on the
Walsh decomp. of M). The iterative construction of a
2d-dimensional Sobol’ sequence doubles the number of points
at each step. In [Gilquin et al., 2017b], the authors propose
an additive approach, based on group algebra. The main
drawback is the correlation between blocks. Is it possible to
do better?
[Figure: average L2-star discrepancy (left) and average maximin distance (right) as functions of N (256 to 3328), comparing design M, design A and a LH design.]
51/ 61
69. IV- A few open questions
• QMC properties are related to effective dimension [Wang and Fang, 2003]. Is it possible to infer effective dimension without first performing a sensitivity analysis?
• Is it possible to use active subsets tools to tune QMC?
• Is it possible to state an optimality criterion for the estimation of Sobol' indices? Sobol' index estimators are not integrals but a combination of integrals.
52/ 61
70. IV- A few open questions
An alternative: Shapley setup
Let a team u ⊆ {1, 2, . . . , d} create value val(u). The total value is val({1, 2, . . . , d}). We attribute a share φj of this value to each j ∈ {1, 2, . . . , d}.
Shapley axioms
Efficiency: Σ_{j=1}^d φj = val({1, . . . , d}).
Dummy: if val(u ∪ {i}) = val(u) for all u, then φi = 0.
Symmetry: if val(u ∪ {i}) = val(u ∪ {j}) for all u with u ∩ {i, j} = ∅, then φi = φj.
Additivity: if games val and val′ have values φ and φ′, then val + val′ has value φj + φ′j.
[Shapley, 1953] shows there is a unique solution.
53/ 61
71. IV- A few open questions
The solution is:
φj = (1/d) Σ_{u ⊆ {1,...,d}∖{j}} C(d−1, |u|)^{−1} ( val(u ∪ {j}) − val(u) )
Let the variables x1, x2, . . . , xd be team members trying to explain M. The value of any subset u is how much can be explained by xu. We define val(u) = Σ_{v⊆u} Sv. The corresponding φj is known as the Shapley effect [Owen, 2014].
For independent inputs, one gets φj = Σ_{u: j∈u} (1/|u|) Su.
For dependent inputs, that last formula is not true anymore, but
Shapley effects are still meaningful
[Owen and Prieur, 2017, Iooss and Prieur, 2017].
54/ 61
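The closed-form solution can be evaluated exactly for small d once the Sobol' indices Su are known. A sketch with made-up index values, using val(u) = Σ_{v⊆u} Sv as above (exact but exponential in d, so for illustration only):

```python
from itertools import combinations
from math import comb

def shapley_effects(S, d):
    """Shapley effects phi_j from a dict S mapping frozensets u to Sobol'
    indices S_u, via the closed form
    phi_j = (1/d) * sum_u C(d-1, |u|)^{-1} * (val(u + j) - val(u))."""
    def val(u):
        # value of team u: everything explained by the inputs in u
        return sum(Sv for v, Sv in S.items() if v <= u)
    phi = []
    full = frozenset(range(1, d + 1))
    for j in range(1, d + 1):
        total = 0.0
        for r in range(d):
            for u in combinations(full - {j}, r):
                u = frozenset(u)
                total += (val(u | {j}) - val(u)) / comb(d - 1, r)
        phi.append(total / d)
    return phi

# made-up independent-input example: S1 = 0.3, S2 = 0.5, S12 = 0.2
S = {frozenset({1}): 0.3, frozenset({2}): 0.5, frozenset({1, 2}): 0.2}
print(shapley_effects(S, 2))  # theory: [0.4, 0.6], i.e. S_j + S_{12}/2
```

This matches the independent-input formula φj = Σ_{u: j∈u} Su/|u|, and the efficiency axiom shows up as the effects summing to 1: interactions are split evenly instead of being double-counted as in total indices.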
72. IV- A few open questions
• There exist algorithms for the estimation of Shapley effects [Song et al., 2016]. It is also possible to use metamodels to speed up these algorithms [Iooss and Prieur, 2017]. Is it possible to write more efficient algorithms, adapted from game theory?
55/ 61
73. Thanks
• Workshop Organizers
Art Owen
Fred Hickernell
Frances Kuo
Pierre L’Ecuyer
• SAMSI, Sr. Program Coordinator
Sue McDonald
• Co-authors
Elise Arnaud
Laurent Gilquin
Fred Hickernell
Bertrand Iooss
Alexandre Janon
Hervé Monod
Art Owen
Lluís Antoni Jiménez Rugama
Jean-Yves Tissot
56/ 61
74. Some references I
Forrester, A., Keane, A., et al. (2008).
Engineering design via surrogate modelling: a practical guide.
John Wiley & Sons.
Gilquin, L., Arnaud, E., Prieur, C., and Janon, A. (2017a).
Making best use of permutations to compute sensitivity indices with replicated
designs.
https://hal.inria.fr/hal-01558915.
Gilquin, L., Arnaud, E., Prieur, C., and Monod, H. (2016).
Recursive estimation procedure of Sobol’ indices based on replicated designs.
https://hal.inria.fr/hal-01291769.
Gilquin, L., Rugama, L. A. J., Arnaud, E., Hickernell, F. J., Monod, H., and
Prieur, C. (2017b).
Iterative construction of replicated designs based on Sobol' sequences.
Comptes Rendus Mathematique, 355(1):10–14.
Hoeffding, W. F. (1948).
A class of statistics with asymptotically normal distributions.
Annals of Mathematical Statistics, 19:293–325.
57/ 61
75. Some references II
Iooss, B. and Prieur, C. (2017).
Shapley effects for sensitivity analysis with dependent inputs: comparisons with
Sobol’ indices, numerical estimation and applications.
https://hal.inria.fr/hal-01556303.
Janon, A., Klein, T., Lagnoux, A., Nodet, M., and Prieur, C. (2014).
Asymptotic normality and efficiency of two Sobol' index estimators.
ESAIM: Probability and Statistics, 18:342–364.
Lamboni, M., Iooss, B., Popelin, A.-L., and Gamboa, F. (2013).
Derivative-based global sensitivity measures: general links with Sobol’ indices
and numerical tests.
Mathematics and Computers in Simulation, 87:45–54.
Mara, T. and Rakoto, J. (2008).
Comparison of some efficient methods to evaluate the main effect of computer
model factors.
Journal of Statistical Computation and Simulation, 78(2):167–178.
McKay, M. D. (1995).
Evaluating prediction uncertainty.
Technical Report NUREG/CR-6311, US Nuclear Regulatory Commission and Los
Alamos National Laboratory, pages 1–79.
58/ 61
76. Some references III
Monod, H., Naud, C., and Makowski, D. (2006).
Uncertainty and sensitivity analysis for crop models.
In Wallach, D., Makowski, D., and Jones, J. W., editors, Working with Dynamic
Crop Models: Evaluation, Analysis, Parameterization, and Applications,
chapter 4, pages 55–99. Elsevier.
Owen, A. and Prieur, C. (2017).
On Shapley value for measuring importance of dependent inputs.
SIAM/ASA Journal on Uncertainty Quantification, In press.
Owen, A. B. (1992).
Orthogonal arrays for computer experiments, integration and visualization.
Statistica Sinica, 2:439–452.
Owen, A. B. (2014).
Sobol’ indices and Shapley value.
Journal on Uncertainty Quantification, 2:245–251.
Qian, P. Z. G. (2009).
Nested Latin Hypercube designs.
Biometrika, 96(4):957–970.
59/ 61
77. Some references IV
Saltelli, A. (2002).
Making best use of model evaluations to compute sensitivity indices.
Computer Physics Communications, 145:280–297.
Shapley, L. S. (1953).
A value for n-person games.
In Kuhn, H. W. and Tucker, A. W., editors, Contribution to the Theory of
Games II (Annals of Mathematics Studies 28), pages 307–317. Princeton
University Press, Princeton, NJ.
Sobol’, I. M. (1993).
Sensitivity analysis for nonlinear mathematical models.
Mathematical Modeling and Computational Experiment, 1:407–414.
Sobol’, I. M. and Gresham, A. (1995).
On an alternative global sensitivity estimators.
Proceedings of SAMO, Belgirate, pages 40–42.
Sobol’, I. M. and Kucherenko, S. (2009).
Derivative based global sensitivity measures and the link with global sensitivity
indices.
Mathematics and Computers in Simulation, 79:3009–3017.
60/ 61
78. Some references V
Song, E., Nelson, B., and Staum, J. (2016).
Shapley effects for global sensitivity analysis: Theory and computation.
SIAM/ASA Journal of Uncertainty Quantification, 4:1060–1083.
Stinson, D. R. and Massey, J. L. (1995).
An infinite class of counterexamples to a conjecture concerning nonlinear resilient
functions.
J. Cryptology, 8(3):167–173.
Tissot, J. Y. and Prieur, C. (2015).
A randomized Orthogonal Array-based procedure for the estimation of first- and
second-order Sobol’ indices.
Journal of Statistical Computation and Simulation, 85(7):1358–1381.
Wang, X. and Fang, K.-T. (2003).
The effective dimension and quasi-Monte Carlo integration.
Journal of Complexity, 19:101–124.
61/ 61