SLICED INVERSE REGRESSION
5th March 2015, Grenoble
ALESSANDRO CHIANCONE
STEPHANE GIRARD   JOCELYN CHANUSSOT
Given a continuous response variable Y ∈ R and explanatory variables X ∈ R^p, we look for a functional relationship between the two:

Y = f(X, ε),   ε independent of X.

It is reasonable to think that X contains more information than what we need for the task of predicting Y; in addition, when the dimension p is large, regression is a very hard task when we have no information about f. A possible solution is therefore to make a strong assumption:

Y = f(β_1^T X, β_2^T X, ..., β_k^T X, ε).

The space S = Span{β_1, β_2, ..., β_k} is called the effective dimension reduction (e.d.r.) space. Only a few linear combinations of the predictors, k ≪ p, are sufficient to regress Y.
Suppose k = 1 for simplicity: Y = f(β^T X, ε).

We want to find the direction β that best explains Y. In other words, if Y is fixed then β^T X should not vary. Consequently, our goal is to find a direction which minimizes the variations of β^T X given Y.

To estimate β^T X | Y we arrange Y in h slices, each with the same number of samples; our goal is to find a direction which minimizes the within-slice variance of β^T X under the constraint var(β^T X) = 1.

Since Σ̂ = B̂ + Ŵ, where Σ̂, B̂, Ŵ are respectively the sample covariance matrix, the between-slice covariance matrix and the within-slice covariance matrix, an equivalent approach is to maximize the between-slice variance under the same constraint.
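As a quick numerical check of this decomposition (a minimal sketch of ours, not part of the slides; the data and variable names are made up), the identity Σ̂ = B̂ + Ŵ can be verified on simulated data once the slices are built from the order of Y:

import numpy as np

rng = np.random.default_rng(0)
n, p, h = 300, 5, 10                       # samples, dimension, number of slices

X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)

# Slice Y into h slices with (roughly) the same number of samples
slices = np.array_split(np.argsort(Y), h)

xbar = X.mean(axis=0)
Sigma = (X - xbar).T @ (X - xbar) / n      # sample covariance (1/n normalisation)

B = np.zeros((p, p))                       # between-slice covariance
W = np.zeros((p, p))                       # within-slice covariance
for idx in slices:
    m = X[idx].mean(axis=0)
    B += len(idx) / n * np.outer(m - xbar, m - xbar)
    W += len(idx) / n * np.cov(X[idx].T, bias=True)

print(np.allclose(Sigma, B + W))           # True: the decomposition is exact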
[Figure: scatter plot of Y against β^T X]
[Figure: scatter plot of Y against β^T X divided into slices S1, S2, ...; the within-slice variance is highlighted for slice S10]
SIR IN PRACTICE

SIR is quite simple to implement:
• Split Y into h slices.
• Estimate B̂, the between-slice covariance matrix, using the slices.
• Compute the eigendecomposition of the matrix Σ̂^{-1} B̂, where Σ̂ is the empirical covariance matrix of X.
• Select the eigenvectors corresponding to the largest eigenvalues.
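A minimal sketch of these four steps (ours, not the authors' code; the function name and defaults are arbitrary):

import numpy as np

def sir_directions(X, Y, h=10, k=1):
    """Sliced Inverse Regression: estimate k e.d.r. directions."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    Sigma = np.cov(X.T, bias=True)                 # empirical covariance of X

    # Split Y into h slices with (roughly) the same number of samples
    slices = np.array_split(np.argsort(Y), h)

    # Between-slice covariance of the slice means of X
    B = np.zeros((p, p))
    for idx in slices:
        m = X[idx].mean(axis=0) - xbar
        B += len(idx) / n * np.outer(m, m)

    # Eigendecomposition of Sigma^{-1} B, keep the leading eigenvectors
    vals, vecs = np.linalg.eig(np.linalg.solve(Sigma, B))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:k]]                 # p x k matrix of directions

With data X (n x p) and Y (n,), beta_hat = sir_directions(X, Y, h=10, k=1) returns the estimated direction up to scale and sign.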
SIR IN THEORY

Let us go back to the general case Y = f(β_1^T X, β_2^T X, ..., β_k^T X, ε). Since f is unknown it is not possible to retrieve β_1, β_2, ..., β_k directly, but only a basis of the e.d.r. space S. It can be shown that the SIR methodology above solves this problem if the so-called Linearity Design Condition is verified:

(LDC)   ∀ b ∈ R^p,   E(b^T X | β_1^T X, β_2^T X, ..., β_k^T X) = c_0 + c_1 β_1^T X + ... + c_k β_k^T X   for some constants c_0, ..., c_k.
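As a sanity check of why normality is enough (our remark, using the standard formula for conditional expectations of jointly Gaussian variables): if X is multivariate normal then, for k = 1,

E(b^T X | β^T X) = E(b^T X) + [Cov(b^T X, β^T X) / var(β^T X)] (β^T X − E(β^T X)),

which is affine in β^T X, so the LDC holds with c_1 = Cov(b^T X, β^T X) / var(β^T X); the same argument extends to k > 1.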
Recently, many papers have pointed out limitations of SIR, since the eigendecomposition can be challenging when the covariance matrix Σ is ill-conditioned. Many solutions have been proposed, from preprocessing the data using PCA [1] to a more comprehensive approach that regularizes SIR [2].

The weakest point of SIR is the (LDC), which cannot be verified in practice because it depends on the true directions β_1, ..., β_k. The condition holds in the case of elliptic symmetry and, more generally, it has been shown [3] that if the dimension p tends to infinity the condition is always approximately verified.

[1] LI, Lexin; LI, Hongzhe. Dimension reduction methods for microarrays with application to censored survival data. Bioinformatics, 2004, 20.18: 3406-3412.
[2] BERNARD-MICHEL, Caroline; GARDES, Laurent; GIRARD, Stéphane. Gaussian regularized sliced inverse regression. Statistics and Computing, 2009, 19.1: 85-98.
[3] HALL, Peter; LI, Ker-Chau. On almost linearity of low dimensional projections from high dimensional data. The Annals of Statistics, 1993, 867-889.
CLUSTER SIR

The LDC is verified when X follows an elliptically symmetric distribution (e.g. multivariate normality of X). When X follows a Gaussian mixture model the condition does not hold globally, but it is verified locally (i.e. within each mixture component).

Kuentz & Saracco [1] cluster X to force the condition to hold locally in each cluster, and then combine the results obtained in each cluster to get the final e.d.r. directions.

Our work is based on this intuition. We first cluster the predictor space X; then a greedy merging algorithm is proposed to assign each cluster to its e.d.r. space, taking into account the size of the cluster on which SIR is performed.

[1] KUENTZ, Vanessa; SARACCO, Jérôme. Cluster-based sliced inverse regression. Journal of the Korean Statistical Society, 2010, 39.2: 251-267.
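A rough sketch of the two-stage idea (ours, reusing the sir_directions helper above; KMeans from scikit-learn is one possible clustering choice, the slides do not prescribe one):

import numpy as np
from sklearn.cluster import KMeans

def per_cluster_sir(X, Y, c=5, h=10):
    """Cluster the predictor space, then run SIR independently in each cluster."""
    labels = KMeans(n_clusters=c, n_init=10, random_state=0).fit_predict(X)
    betas = []
    for j in range(c):
        mask = labels == j
        b = sir_directions(X[mask], Y[mask], h=h, k=1)[:, 0]
        betas.append(b / np.linalg.norm(b))        # one unit direction per cluster
    return labels, np.vstack(betas)                # betas: c x p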
SIR - CLUSTER SIR - COLLABORATIVE SIR (k = 1)

SIR: Y = f(β^T X).

Cluster SIR: X = X_1 ∪ X_2 ∪ ... ∪ X_c, where c is the number of clusters, and Y_i = f(β^T X_i), i = 1, ..., c. The e.d.r. direction and the link function do not change from cluster to cluster.

Collaborative SIR: X = X_1 ∪ X_2 ∪ ... ∪ X_c and Y_i = f_i(β_i^T X_i), with β_i ∈ {β_1, ..., β_D}. The number D (D ≤ c) of e.d.r. directions is unknown. The e.d.r. directions and the link function may change from cluster to cluster. A merging algorithm is introduced to infer the number D based on the collinearity of the vectors β_i.
The directions β_1, ..., β_c are obtained by applying SIR independently in each cluster. A hierarchical structure is then built to infer the number D of e.d.r. directions, using a proximity criterion.

The most collinear vector to a set of vectors A = {β_1, ..., β_c}, given the proximity criterion m(a, b) = cos²(a, b) = (a^T b)², is the solution of the following problem:

Λ(A) = max_{a ∈ R^p} Σ_{t ∈ A} w_t m(β_t, a)   s.t. ||a|| = 1
     = largest eigenvalue of Σ_{t ∈ A} w_t β_t β_t^T,

where the weights w_t sum to one.
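Numerically, Λ(A) and its maximiser come straight out of this eigenvalue characterisation (a small sketch of ours; uniform weights are assumed when none are given):

import numpy as np

def most_collinear(betas, weights=None):
    """Return Lambda(A) and the unit vector a maximising sum_t w_t (beta_t^T a)^2."""
    betas = np.asarray(betas)                      # card(A) x p array of directions
    if weights is None:
        weights = np.full(len(betas), 1.0 / len(betas))
    M = (weights[:, None] * betas).T @ betas       # sum_t w_t beta_t beta_t^T
    vals, vecs = np.linalg.eigh(M)                 # symmetric, eigenvalues ascending
    return vals[-1], vecs[:, -1]                   # largest eigenvalue and its eigenvector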
[Figure: hierarchy of merges of the per-cluster directions β_1, ..., β_12, plotted against the number of merges]
To build the hierarchy we consider the following iterative algorithm, initialized with the set A = {{β_1}, ..., {β_c}}:

while card(A) ≠ 1
    let a, b ∈ A be such that Λ(a ∪ b) ≥ Λ(c ∪ d) for all c, d ∈ A
    A = (A \ {a, b}) ∪ {a ∪ b}
end

At each step the cardinality of the set A decreases, merging the most collinear sets of directions. It is therefore possible to infer the number D of underlying e.d.r. spaces by analyzing the values of Λ along the hierarchy.
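The greedy loop can be sketched as follows (ours, reusing most_collinear; sets of directions are represented by index lists, and the recorded Λ values are what one would inspect to choose D):

def build_hierarchy(betas):
    """Greedily merge the most collinear sets of directions, recording Lambda."""
    A = [[t] for t in range(len(betas))]           # A = {{beta_1}, ..., {beta_c}}
    history = []
    while len(A) != 1:
        best = None
        for i in range(len(A)):
            for j in range(i + 1, len(A)):
                lam, _ = most_collinear(betas[A[i] + A[j]])
                if best is None or lam > best[0]:
                    best = (lam, i, j)
        lam, i, j = best
        merged = A[i] + A[j]
        A = [A[t] for t in range(len(A)) if t not in (i, j)] + [merged]
        history.append((lam, merged))              # a sharp drop in lam suggests the cut
    return history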
REGULARIZATION

After merging, each cluster is assigned one of the D e.d.r. directions β̂_1, ..., β̂_D. For each X_i, i = 1, ..., c, we consider the D e.d.r. directions and analyze the two-dimensional distributions

(Y_i, β̂_j^T X_i)   for all j = 1, ..., D.

We then select the direction β̂_{j*}, where j* = argmin_{j=1,...,D} λ_{2,j} and λ_{2,j} is the second eigenvalue of the covariance matrix Cov(Y_i, β̂_j^T X_i).

This step reconsiders the data to select the best direction from the pool of D estimated directions.
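One possible reading of this selection rule in code (our sketch; dir_pool stands for the D estimated directions and is a name of ours):

import numpy as np

def assign_direction(Xi, Yi, dir_pool):
    """For one cluster, keep the direction minimising the second eigenvalue of
    the 2 x 2 covariance of (Y_i, beta_j^T X_i)."""
    second_eigs = []
    for beta in dir_pool:                          # dir_pool: D x p
        proj = Xi @ beta
        C = np.cov(np.vstack([Yi, proj]))          # 2 x 2 covariance matrix
        second_eigs.append(np.linalg.eigvalsh(C)[0])   # smaller eigenvalue = lambda_{2,j}
    j_star = int(np.argmin(second_eigs))
    return j_star, dir_pool[j_star]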
EXPERIMENTAL RESULTS

We simulated 100 different datasets following a Gaussian mixture model:
• 10 mixture components
• uniform mixing proportions
• 2500 samples
• dimension p = 240

The response variable Y is generated using the hyperbolic sine link function:

Y_i = sinh(β_i^T X_i) + ε,   ε independent of X.

We show the case where β_i ∈ {β_1, β_2}, i.e. the number of e.d.r. directions is D = 2.

The proximity criterion m(β̂, β) = cos²(β̂, β) = (β̂^T β)² is evaluated to assess the quality of the estimation [1].

[1] CHAVENT, Marie, et al. A sliced inverse regression approach for data stream. Computational Statistics, 2013, 1-24.
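The simulated design can be reproduced roughly as follows (our reconstruction of the setting described above; the mixture means, noise level and seed are arbitrary since the slides do not give them):

import numpy as np

rng = np.random.default_rng(0)
p, n, c, D = 240, 2500, 10, 2

# Gaussian mixture: 10 components, uniform mixing proportions
means = rng.normal(scale=1.0, size=(c, p))
labels = rng.integers(0, c, size=n)
X = means[labels] + rng.normal(size=(n, p))

# Two unit e.d.r. directions; each mixture component is tied to one of them
betas = rng.normal(size=(D, p))
betas /= np.linalg.norm(betas, axis=1, keepdims=True)
comp_to_dir = rng.integers(0, D, size=c)

# Hyperbolic sine link plus noise independent of X
eps = 0.1 * rng.normal(size=n)
Y = np.sinh(np.einsum('ij,ij->i', X, betas[comp_to_dir[labels]])) + eps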
[Figure: projection of the simulated data on the first two principal components]
The proximity criterion is computed using the estimate obtained in each cluster independently (PC) and after Collaborative SIR (PCM), for each run of K-means:

The average PC is 0.4499 with a standard deviation of 0.0690.
The average PCM is 0.7939 with a standard deviation of 0.0960.

[Figure: histograms of the estimated number of directions D̂ and of the PCM and PC values over the runs]
[Figure: Y plotted against the estimated projection]
REAL DATASET: HORSE MUSSELS

We consider the horse mussels data example used in cluster SIR. The observations correspond to n = 192 horse mussels captured in the Marlborough Sounds, at the northeast of New Zealand's South Island. The response variable Y is the muscle mass, the edible portion of the mussel, in grams. The predictor X is of dimension p = 4 and measures numerical characteristics of the shell: length, width and height, each in mm, and mass in grams.

We repeated the following procedure 100 times:
(1) Randomly select 80% of the samples for training and 20% for test.
(2) Apply SIR, cluster SIR and Collaborative SIR on the training set.
(3) Regress the link functions using the training samples.
(4) Compute the Mean Absolute Relative Error (MARE) on the test set.
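Step (4) can be made concrete with a small helper (ours; we use the usual definition of MARE, the mean of |ŷ − y| / |y| over the test samples, which the slides do not spell out):

import numpy as np

def mare(y_true, y_pred):
    """Mean Absolute Relative Error on a test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))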
[Figure: MARE on the test sets over the 100 repetitions]
REAL DATASET: GALAXIES

We consider the galaxies data example. The observations correspond to n = 292766 different galaxies. The response variable Y is the stellar formation rate. The predictor X is of dimension p = 46 and is composed of spectral characteristics of the galaxies.

We applied Collaborative SIR to the whole dataset to investigate the presence of subgroups and different directions. Across different runs and numbers k of clusters, we observe two different subgroups and hence two directions β_1, β_2.
[Figure: stellar formation rate against X^T β_i, i = 1, 2]
[Figure: the estimated directions β_1 and β_2, and the differences between β_1 and β_2]
CONCLUSIONS

Sliced Inverse Regression is definitely an underestimated method: give it a try! Since data nowadays are complex and rich, our Collaborative SIR may catch the subgroups and makes it possible to reduce dimensionality in an easy and fast way.

FUTURE WORK: robust SIR, with applications to regression, classification and segmentation.

Publications
Chini, Marco, Alessandro Chiancone, and Salvatore Stramondo. "Scale Object Selection (SOS) through a hierarchical segmentation by a multi-spectral per-pixel classification." Pattern Recognition Letters 49 (2014): 214-223.

Papers in preparation
Alessandro Chiancone, Stephane Girard and Jocelyn Chanussot. "Collaborative Sliced Inverse Regression."

References
[1] LI, Lexin; LI, Hongzhe. Dimension reduction methods for microarrays with application to censored survival data. Bioinformatics, 2004, 20.18: 3406-3412.
[2] BERNARD-MICHEL, Caroline; GARDES, Laurent; GIRARD, Stéphane. Gaussian regularized sliced inverse regression. Statistics and Computing, 2009, 19.1: 85-98.
[3] HALL, Peter; LI, Ker-Chau. On almost linearity of low dimensional projections from high dimensional data. The Annals of Statistics, 1993, 867-889.
[4] KUENTZ, Vanessa; SARACCO, Jérôme. Cluster-based sliced inverse regression. Journal of the Korean Statistical Society, 2010, 39.2: 251-267.
[5] CHAVENT, Marie, et al. A sliced inverse regression approach for data stream. Computational Statistics, 2013, 1-24.

Thank you
Weitere ähnliche Inhalte

Was ist angesagt?

Note on Character Theory-summer 2013
Note on Character Theory-summer 2013Note on Character Theory-summer 2013
Note on Character Theory-summer 2013
Fan Huang (Wright)
 
Ode powerpoint presentation1
Ode powerpoint presentation1Ode powerpoint presentation1
Ode powerpoint presentation1
Pokkarn Narkhede
 
Introduction to Calculus 1
Introduction to Calculus 1Introduction to Calculus 1
Introduction to Calculus 1
David Rogers
 
Equivalence relations
Equivalence relationsEquivalence relations
Equivalence relations
Tarun Gehlot
 

Was ist angesagt? (20)

lacl (1)
lacl (1)lacl (1)
lacl (1)
 
Note on Character Theory-summer 2013
Note on Character Theory-summer 2013Note on Character Theory-summer 2013
Note on Character Theory-summer 2013
 
eatonmuirheadsoaita
eatonmuirheadsoaitaeatonmuirheadsoaita
eatonmuirheadsoaita
 
Inverse variation
Inverse variationInverse variation
Inverse variation
 
Devaney Chaos Induced by Turbulent and Erratic Functions
Devaney Chaos Induced by Turbulent and Erratic FunctionsDevaney Chaos Induced by Turbulent and Erratic Functions
Devaney Chaos Induced by Turbulent and Erratic Functions
 
Formal Logic - Lesson 8 - Predicates and Quantifiers
Formal Logic - Lesson 8 - Predicates and QuantifiersFormal Logic - Lesson 8 - Predicates and Quantifiers
Formal Logic - Lesson 8 - Predicates and Quantifiers
 
Interpolation
InterpolationInterpolation
Interpolation
 
Tensor analysis EFE
Tensor analysis  EFETensor analysis  EFE
Tensor analysis EFE
 
On the gravitational_field_of_a_point_mass_according_to_einstein_theory
On the gravitational_field_of_a_point_mass_according_to_einstein_theoryOn the gravitational_field_of_a_point_mass_according_to_einstein_theory
On the gravitational_field_of_a_point_mass_according_to_einstein_theory
 
Ode powerpoint presentation1
Ode powerpoint presentation1Ode powerpoint presentation1
Ode powerpoint presentation1
 
Structured Regularization for conditional Gaussian graphical model
Structured Regularization for conditional Gaussian graphical modelStructured Regularization for conditional Gaussian graphical model
Structured Regularization for conditional Gaussian graphical model
 
Rigidity And Tensegrity By Connelly
Rigidity And Tensegrity By ConnellyRigidity And Tensegrity By Connelly
Rigidity And Tensegrity By Connelly
 
Polyadic integer numbers and finite (m,n) fields
Polyadic integer numbers and finite (m,n) fieldsPolyadic integer numbers and finite (m,n) fields
Polyadic integer numbers and finite (m,n) fields
 
Introduction to Calculus 1
Introduction to Calculus 1Introduction to Calculus 1
Introduction to Calculus 1
 
compressed-sensing
compressed-sensingcompressed-sensing
compressed-sensing
 
Pmath 351 note
Pmath 351 notePmath 351 note
Pmath 351 note
 
A Unifying theory for blockchain and AI
A Unifying theory for blockchain and AIA Unifying theory for blockchain and AI
A Unifying theory for blockchain and AI
 
magnt
magntmagnt
magnt
 
International Journal of Mathematics and Statistics Invention (IJMSI)
International Journal of Mathematics and Statistics Invention (IJMSI) International Journal of Mathematics and Statistics Invention (IJMSI)
International Journal of Mathematics and Statistics Invention (IJMSI)
 
Equivalence relations
Equivalence relationsEquivalence relations
Equivalence relations
 

Andere mochten auch (15)

Robert
RobertRobert
Robert
 
Blum
BlumBlum
Blum
 
Massol bio info2011
Massol bio info2011Massol bio info2011
Massol bio info2011
 
Ae abc general_grenoble_29_juin_25_30_min
Ae abc general_grenoble_29_juin_25_30_minAe abc general_grenoble_29_juin_25_30_min
Ae abc general_grenoble_29_juin_25_30_min
 
DoThisApp
DoThisAppDoThisApp
DoThisApp
 
Mobile Commerce- POS
Mobile Commerce- POSMobile Commerce- POS
Mobile Commerce- POS
 
Jobim2011 o gimenez
Jobim2011 o gimenezJobim2011 o gimenez
Jobim2011 o gimenez
 
Chikhi grenoble bioinfo_biodiv_juin_2011
Chikhi grenoble bioinfo_biodiv_juin_2011Chikhi grenoble bioinfo_biodiv_juin_2011
Chikhi grenoble bioinfo_biodiv_juin_2011
 
Chiara focacci shaw
Chiara focacci shawChiara focacci shaw
Chiara focacci shaw
 
Recovery a ctivities
Recovery a ctivitiesRecovery a ctivities
Recovery a ctivities
 
Front-end Tooling - Dicas de ferramentas para melhorar a produtividade
Front-end Tooling - Dicas de ferramentas para melhorar a produtividadeFront-end Tooling - Dicas de ferramentas para melhorar a produtividade
Front-end Tooling - Dicas de ferramentas para melhorar a produtividade
 
Aussem
AussemAussem
Aussem
 
Ibm worklight
Ibm worklightIbm worklight
Ibm worklight
 
IBM Worklight- Checkout Process Architecture
IBM Worklight- Checkout Process ArchitectureIBM Worklight- Checkout Process Architecture
IBM Worklight- Checkout Process Architecture
 
Daunizeau
DaunizeauDaunizeau
Daunizeau
 

Ähnlich wie Presentation 5 march persyval

Function an old french mathematician said[1]12
Function an old french mathematician said[1]12Function an old french mathematician said[1]12
Function an old french mathematician said[1]12
Mark Hilbert
 
this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...
BhojRajAdhikari5
 
Lewenz_McNairs-copy
Lewenz_McNairs-copyLewenz_McNairs-copy
Lewenz_McNairs-copy
Anna Lewenz
 

Ähnlich wie Presentation 5 march persyval (20)

CH6.pdf
CH6.pdfCH6.pdf
CH6.pdf
 
Ch6
Ch6Ch6
Ch6
 
Puy chosuantai2
Puy chosuantai2Puy chosuantai2
Puy chosuantai2
 
Prob review
Prob reviewProb review
Prob review
 
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
 
Finance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdfFinance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdf
 
Lesson 6 differentials parametric-curvature
Lesson 6 differentials parametric-curvatureLesson 6 differentials parametric-curvature
Lesson 6 differentials parametric-curvature
 
Congress
Congress Congress
Congress
 
Function an old french mathematician said[1]12
Function an old french mathematician said[1]12Function an old french mathematician said[1]12
Function an old french mathematician said[1]12
 
this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...
 
On elements of deterministic chaos and cross links in non- linear dynamical s...
On elements of deterministic chaos and cross links in non- linear dynamical s...On elements of deterministic chaos and cross links in non- linear dynamical s...
On elements of deterministic chaos and cross links in non- linear dynamical s...
 
04_AJMS_335_21.pdf
04_AJMS_335_21.pdf04_AJMS_335_21.pdf
04_AJMS_335_21.pdf
 
Ch5
Ch5Ch5
Ch5
 
Lewenz_McNairs-copy
Lewenz_McNairs-copyLewenz_McNairs-copy
Lewenz_McNairs-copy
 
Multivariate Gaussin, Rayleigh & Rician distributions
Multivariate Gaussin, Rayleigh & Rician distributionsMultivariate Gaussin, Rayleigh & Rician distributions
Multivariate Gaussin, Rayleigh & Rician distributions
 
Common Fixed Point Theorems in Compatible Mappings of Type (P*) of Generalize...
Common Fixed Point Theorems in Compatible Mappings of Type (P*) of Generalize...Common Fixed Point Theorems in Compatible Mappings of Type (P*) of Generalize...
Common Fixed Point Theorems in Compatible Mappings of Type (P*) of Generalize...
 
Fuzzy random variables and Kolomogrov’s important results
Fuzzy random variables and Kolomogrov’s important resultsFuzzy random variables and Kolomogrov’s important results
Fuzzy random variables and Kolomogrov’s important results
 
Differential equation and Laplace Transform
Differential equation and Laplace TransformDifferential equation and Laplace Transform
Differential equation and Laplace Transform
 
Differential equation and Laplace Transform
Differential equation and Laplace TransformDifferential equation and Laplace Transform
Differential equation and Laplace Transform
 
Differential equation
Differential equationDifferential equation
Differential equation
 

Presentation 5 march persyval

  • 1. SLICED INVERSE REGRESSION 5 th March 2015, Grenoble ALESSANDRO CHIANCONE STEPHANE GIRARD JOCELYN CHANUSSOT
  • 2. SLICED INVERSE REGRESSION ALESSANDRO CHIANCONE 1 2 3 ALESSANDRO CHIANCONE SEQUENTIAL DIMENSION REDUCTION STEPHANE GIRARD JOCELYN CHANUSSOT Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y . 1 Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y .
  • 3. SLICED INVERSE REGRESSION ALESSANDRO CHIANCONE 1 2 3 ALESSANDRO CHIANCONE SEQUENTIAL DIMENSION REDUCTION STEPHANE GIRARD JOCELYN CHANUSSOT Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y . 1 Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y .
  • 4. SLICED INVERSE REGRESSION ALESSANDRO CHIANCONE 1 2 3 ALESSANDRO CHIANCONE SEQUENTIAL DIMENSION REDUCTION STEPHANE GIRARD JOCELYN CHANUSSOT Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y . 1 Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y .
  • 5. SLICED INVERSE REGRESSION ALESSANDRO CHIANCONE 1 2 3 ALESSANDRO CHIANCONE SEQUENTIAL DIMENSION REDUCTION STEPHANE GIRARD JOCELYN CHANUSSOT Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y . 2 Suppose k = 1 for simplicity: Y = f( T X, ✏) We want to find the direction that best explains Y. In other words if Y is fixed then T X should not vary. Consequently our goal is to find a direction which minimizes the variations of T X given Y . To estimate T X|Y we arrange Y in h slices each with the same number of samples, our goal is to find a direction which minimizes the within-slice variance of T X under the constraint var( T X) = 1. Since ˆ⌃ = ˆB + ˆW, where ˆ⌃, ˆB, ˆW are respectively the sample covariance matrix, the between-slice covariance matrix and the within-slice covariance matrix an equivalent approach is to maximize the between-slice variance under the same constraint.
  • 6. SLICED INVERSE REGRESSION ALESSANDRO CHIANCONE 1 2 3 ALESSANDRO CHIANCONE SEQUENTIAL DIMENSION REDUCTION STEPHANE GIRARD JOCELYN CHANUSSOT Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y . Y T X 2
  • 7. SLICED INVERSE REGRESSION ALESSANDRO CHIANCONE 1 2 3 ALESSANDRO CHIANCONE SEQUENTIAL DIMENSION REDUCTION STEPHANE GIRARD JOCELYN CHANUSSOT Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y . S1 S2 Whithin-slice variance for slice S10 Y T X 2
  • 8. SLICED INVERSE REGRESSION ALESSANDRO CHIANCONE 1 2 3 ALESSANDRO CHIANCONE SEQUENTIAL DIMENSION REDUCTION STEPHANE GIRARD JOCELYN CHANUSSOT Given a response continuous variable Y 2 R and X 2 Rp explanatory variables we look for a functional relationship between the two: Y = f(X, ✏), ✏ independent of X It is reasonable to think that X contains more information that what we need for our task of predicting Y , in addition when the dimension p is large regression is a very hard task when we do not have information about f. Given that a possible solution is to make a strong assumption: Y = f( T 1 X, T 2 X, ..., T k X, ✏) The space S = Span{ 1, 2, ..., k} is called e↵ective dimension reduction (e.d.r.) space. Only few linear combinations of the predictors, k ⌧ p, are su cient to regress Y . 2 Suppose k = 1 for simplicity: Y = f( T X, ✏) We want to find the direction that best explains Y. In other words if Y is fixed then T X should not vary. Consequently our goal is to find a direction which minimizes the variations of T X given Y . To estimate T X|Y we arrange Y in h slices each with the same number of samples, our goal is to find a direction which minimizes the within-slice variance of T X under the constraint var( T X) = 1. Since ˆ⌃ = ˆB + ˆW, where ˆ⌃, ˆB, ˆW are respectively the sample covariance matrix, the between-slice covariance matrix and the within-slice covariance matrix an equivalent approach is to maximize the between-slice variance under the same constraint.
  • 10. SIR IN PRACTICE. SIR is quite simple to implement: (1) split the range of $Y$ into $h$ slices; (2) estimate $\hat B$, the between-slice covariance matrix, from the slices; (3) compute the eigendecomposition of the matrix $\hat\Sigma^{-1}\hat B$, where $\hat\Sigma$ is the empirical covariance matrix of $X$; (4) select the eigenvectors corresponding to the largest eigenvalues.
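As an illustration only, here is a minimal numpy sketch of these four steps; the function name sir_directions and the default h = 10 are our own choices, not from the talk.

```python
import numpy as np

def sir_directions(X, Y, h=10, n_directions=1):
    # (1) sort the samples by Y and split them into h slices of roughly equal size
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    order = np.argsort(Y)
    slices = np.array_split(order, h)
    # (2) between-slice covariance matrix B_hat built from the slice means
    B = np.zeros((p, p))
    for s in slices:
        m = Xc[s].mean(axis=0)
        B += (len(s) / n) * np.outer(m, m)
    # (3) eigendecomposition of Sigma_hat^{-1} B_hat (assumes Sigma_hat is well
    #     conditioned; see slide 13 for the regularized variants)
    Sigma = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eig(np.linalg.solve(Sigma, B))
    # (4) keep the eigenvectors associated with the largest eigenvalues
    idx = np.argsort(eigval.real)[::-1][:n_directions]
    return eigvec.real[:, idx], eigval.real[idx]
```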
  • 11. SIR IN THEORY. Back to the general case $Y = f(\beta_1^T X, \beta_2^T X, \dots, \beta_k^T X, \varepsilon)$. Since $f$ is unknown, it is not possible to retrieve $\beta_1, \dots, \beta_k$ directly, but only a basis of the e.d.r. space $S$. It can be shown that the SIR methodology above recovers such a basis provided the so-called Linearity Design Condition holds: (LDC) for all $b \in \mathbb{R}^p$, $E(b^T X \mid \beta_1^T X, \dots, \beta_k^T X) = c_0 + c_1 \beta_1^T X + \dots + c_k \beta_k^T X$ for some constants $c_0, \dots, c_k$.
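As an added sanity check (our addition, not on the slides), the LDC holds when $X$ is multivariate Gaussian: writing $B = (\beta_1, \dots, \beta_k)$, the vector $(b^T X, B^T X)$ is jointly Gaussian, so the conditional expectation is affine in $B^T X$.

```latex
% Standard Gaussian conditioning formula, with X ~ N(mu, Sigma) and B = (beta_1, ..., beta_k).
\[
  \mathbb{E}\!\left(b^T X \,\middle|\, B^T X\right)
  \;=\; b^T\mu \;+\; b^T \Sigma B \left(B^T \Sigma B\right)^{-1}\!\left(B^T X - B^T \mu\right)
  \;=\; c_0 + c_1\,\beta_1^T X + \dots + c_k\,\beta_k^T X .
\]
```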
  • 13. Many recent papers have pointed out limitations of SIR: the eigendecomposition can be challenging when the covariance matrix $\Sigma$ is ill-conditioned. Several solutions have been proposed, from preprocessing the data with PCA [1] to a more comprehensive regularization of SIR [2]. The weakest point of SIR is the (LDC), which cannot be checked in practice because it depends on the true directions $\beta_1, \dots, \beta_k$. The condition holds under elliptic symmetry and, more generally, it has been shown [3] that as the dimension $p$ tends to infinity the condition is approximately verified. [1] LI, Lexin; LI, Hongzhe. Dimension reduction methods for microarrays with application to censored survival data. Bioinformatics, 2004, 20.18: 3406-3412. [2] BERNARD-MICHEL, Caroline; GARDES, Laurent; GIRARD, Stéphane. Gaussian regularized sliced inverse regression. Statistics and Computing, 2009, 19.1: 85-98. [3] HALL, Peter; LI, Ker-Chau. On almost linearity of low dimensional projections from high dimensional data. The Annals of Statistics, 1993, 867-889.
  • 14. CLUSTER SIR. The LDC is verified when $X$ follows an elliptically symmetric distribution (e.g. multivariate normality of $X$). When $X$ follows a Gaussian mixture model the condition does not hold globally, but it is verified locally (i.e. within each mixture component). Kuentz & Saracco [1] clustered $X$ to force the condition to hold locally in each cluster, and then combined the per-cluster results to obtain the final e.d.r. directions. Our work builds on this intuition: we first cluster the predictor space $X$, then a greedy merging algorithm assigns each cluster to its e.d.r. space, taking into account the size of the cluster on which SIR is performed. [1] KUENTZ, Vanessa; SARACCO, Jérôme. Cluster-based sliced inverse regression. Journal of the Korean Statistical Society, 2010, 39.2: 251-267.
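A minimal sketch of the cluster-based idea, assuming scikit-learn's KMeans and the sir_directions function sketched after slide 10; the function name per_cluster_directions is ours.

```python
import numpy as np
from sklearn.cluster import KMeans

def per_cluster_directions(X, Y, n_clusters=5, h=10):
    # Cluster the predictor space, then run SIR independently in each cluster
    # (sir_directions is the sketch given after slide 10).
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    betas = []
    for c in range(n_clusters):
        mask = labels == c
        beta, _ = sir_directions(X[mask], Y[mask], h=h, n_directions=1)
        beta = beta[:, 0]
        betas.append(beta / np.linalg.norm(beta))   # unit-norm e.d.r. direction
    return labels, np.array(betas)                  # one estimated direction per cluster
```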
  • 16. SIR vs. CLUSTER SIR ($k = 1$). SIR: $Y = f(\beta^T X)$. Cluster SIR: $X = X_1 \cup X_2 \cup \dots \cup X_c$, where $c$ is the number of clusters, and $Y_i = f(\beta^T X_i)$, $i = 1, \dots, c$. The e.d.r. direction and the link function do not change from one cluster to another.
  • 17. SIR vs. CLUSTER SIR vs. COLLABORATIVE SIR ($k = 1$). Collaborative SIR: $X = X_1 \cup X_2 \cup \dots \cup X_c$ and $Y_i = f_i(\beta_i^T X_i)$, $i = 1, \dots, c$. The number $D$ ($D \le c$) of distinct e.d.r. directions is unknown; both the e.d.r. directions and the link functions may change from cluster to cluster. A merging algorithm is introduced to infer $D$ from the collinearity of the vectors $\beta_i$. (For SIR and Cluster SIR, see the previous slide.)
  • 18. The directions $\beta_1, \dots, \beta_c$ are obtained by applying SIR independently in each cluster. A hierarchical structure is then built to infer the number $D$ of e.d.r. directions using a proximity criterion. Given the proximity criterion $m(a, b) = \cos^2(a, b) = (a^T b)^2$, the vector most collinear to a set of vectors $A = \{\beta_1, \dots, \beta_c\}$ is the solution of $\lambda(A) = \max_{a \in \mathbb{R}^p} \sum_{t \in A} w_t\, m(\beta_t, a)$ s.t. $\|a\| = 1$, which equals the largest eigenvalue of $\sum_{t \in A} w_t\, \beta_t \beta_t^T$, where the weights $w_t$ sum to one. To build the hierarchy we consider the following iterative algorithm, initialized with $A = \{\{\beta_1\}, \dots, \{\beta_c\}\}$: while $\mathrm{card}(A) \ne 1$ (the loop is completed on the next slide).
  • 19. [Figure: hierarchy of merges of the cluster directions $\beta_1, \dots, \beta_{12}$, plotted against the number of merges.]
  • 20. $\lambda(A) = \max_{a \in \mathbb{R}^p} \sum_{t \in A} w_t\, m(\beta_t, a)$ s.t. $\|a\| = 1$, which equals the largest eigenvalue of $\sum_{t \in A} w_t\, \beta_t \beta_t^T$, where the weights $w_t$ sum to one. To build the hierarchy we run the following iterative algorithm, initialized with $A = \{\{\beta_1\}, \dots, \{\beta_c\}\}$: while $\mathrm{card}(A) \ne 1$, pick $a, b \in A$ such that $\lambda(a \cup b) \ge \lambda(c \cup d)$ for all $c, d \in A$, and set $A = (A \setminus \{a, b\}) \cup \{a \cup b\}$. At each step the cardinality of $A$ decreases by merging the most collinear sets of directions, so the number $D$ of underlying e.d.r. spaces can be inferred by analyzing the values of $\lambda$ along the hierarchy (a sketch follows below).
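A sketch of the greedy merge under these definitions (our own code, not the authors'): $\lambda$ is computed as the largest eigenvalue of the weighted sum of outer products, and per-cluster weights (e.g. cluster sizes) are renormalized within each candidate union.

```python
import numpy as np

def lam(betas, weights):
    # lambda(A): largest eigenvalue of the weighted sum of outer products
    M = sum(w * np.outer(b, b) for b, w in zip(betas, weights))
    return float(np.linalg.eigvalsh(M)[-1])

def build_hierarchy(betas, sizes):
    # betas: list of unit-norm direction vectors, one per cluster
    # sizes: per-cluster weights (e.g. cluster sizes), renormalized inside each union
    A = [[i] for i in range(len(betas))]
    history = []
    while len(A) > 1:
        best = None
        for i in range(len(A)):
            for j in range(i + 1, len(A)):
                idx = A[i] + A[j]
                w = np.array([sizes[t] for t in idx], dtype=float)
                score = lam([betas[t] for t in idx], w / w.sum())
                if best is None or score > best[0]:
                    best = (score, i, j)
        score, i, j = best
        A = [A[t] for t in range(len(A)) if t not in (i, j)] + [A[i] + A[j]]
        history.append((score, A[-1]))   # lambda value and the merged index set
    return history   # inspect the lambda values to choose the number D of e.d.r. spaces
```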
  • 21. REGULARIZATION. After merging, each cluster is assigned one of the $D$ e.d.r. directions $\hat\beta_1, \dots, \hat\beta_D$. For each $X_i$, $i = 1, \dots, c$, we consider the $D$ e.d.r. directions and analyze the two-dimensional distributions $(Y_i, \hat\beta_j^T X_i)$ for all $j = 1, \dots, D$. We then select the direction $\hat\beta_{j^*}$ with $j^* = \arg\min_{j = 1, \dots, D} \lambda_{2,j}$, where $\lambda_{2,j}$ is the second eigenvalue of the covariance matrix of $(Y_i, \hat\beta_j^T X_i)$. This step reconsiders the data to select, for each cluster, the best direction from the pool of $D$ estimated directions.
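The selection step can be sketched as follows (our code; eigvalsh returns eigenvalues in ascending order, so the first entry of the 2x2 covariance spectrum is the second, smaller eigenvalue).

```python
import numpy as np

def select_direction(Xi, Yi, candidate_betas):
    # For each candidate direction, look at the 2-D cloud (Y_i, beta_j^T X_i)
    # and keep the direction whose covariance matrix has the smallest second eigenvalue.
    second_eigs = []
    for beta in candidate_betas:
        proj = Xi @ beta                              # beta_j^T X_i for every sample
        C = np.cov(np.vstack([Yi, proj]))             # 2x2 covariance matrix
        second_eigs.append(np.linalg.eigvalsh(C)[0])  # ascending order: [0] is the second eigenvalue
    return int(np.argmin(second_eigs))                # index j* of the selected direction
```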
  • 22. EXPERIMENTAL RESULTS. We simulated 100 different datasets following a Gaussian mixture model: 10 mixture components, uniform mixing proportions, 2500 samples, dimension $p = 240$. The response variable is generated with the hyperbolic sine link function: $Y_i = \sinh(\beta_i^T X_i) + \varepsilon$, $\varepsilon$ independent of $X$. We show the case where $\beta_i \in \{\beta_1, \beta_2\}$, i.e. the number of e.d.r. directions is $D = 2$. The proximity criterion $m(\hat\beta, \beta) = \cos^2(\hat\beta, \beta) = (\hat\beta^T \beta)^2$ is used to assess the quality of the estimation [1]. [1] CHAVENT, Marie, et al. A sliced inverse regression approach for data stream. Computational Statistics, 2013, 1-24.
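A data-generation sketch of this general setting; the exact mixture parameters, the split of components between the two directions and the noise level below are our assumptions, not those of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, n_comp = 240, 2500, 10

# two true e.d.r. directions, unit norm (illustrative values, not the paper's)
beta1 = rng.normal(size=p); beta1 /= np.linalg.norm(beta1)
beta2 = rng.normal(size=p); beta2 /= np.linalg.norm(beta2)
B = np.vstack([beta1, beta2])

comp = rng.integers(n_comp, size=n)               # uniform mixing proportions
means = rng.normal(scale=5.0, size=(n_comp, p))   # assumed component means
X = means[comp] + rng.normal(size=(n, p))         # assumed unit-variance components
which_beta = np.where(comp < n_comp // 2, 0, 1)   # assumed assignment of components to the two directions
Y = np.sinh(np.einsum('ij,ij->i', X, B[which_beta])) + 0.1 * rng.normal(size=n)

def proximity(beta_hat, beta):
    # cos^2 between the estimated and the true direction
    return float((beta_hat @ beta) ** 2 / ((beta_hat @ beta_hat) * (beta @ beta)))
```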
  • 23. [Figure: the simulated data projected on the first two principal components.]
  • 24. The proximity criterion is computed using the estimates obtained in each cluster independently (PC) and after Collaborative SIR (PCM), for each run of K-means. The average PC is 0.4499 with a standard deviation of 0.0690; the average PCM is 0.7939 with a standard deviation of 0.0960. [Figures: histogram of the estimated number of directions $\hat D$; histograms of the PC and PCM values.]
  • 25. [Figure: $Y$ plotted against the estimated projection.]
  • 26. REAL DATASET: HORSE MUSSELS. We consider the horse mussel data example used in cluster SIR. The observations correspond to $n = 192$ horse mussels captured in the Marlborough Sounds, in the northeast of New Zealand's South Island. The response variable $Y$ is the muscle mass, the edible portion of the mussel, in grams. The predictor $X$ is of dimension $p = 4$ and measures numerical characteristics of the shell: length, width and height (in mm) and mass (in grams). We repeated the following procedure 100 times (a sketch of the evaluation follows below): (1) randomly select 80% of the samples for training and 20% for testing; (2) apply SIR, cluster SIR and Collaborative SIR on the training set; (3) regress the link functions on the training samples; (4) compute the Mean Absolute Relative Error (MARE) on the test set.
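The error measure and the 80/20 split can be written down directly (a sketch; the data loading and the per-method regressors are omitted, and the standard definition of MARE is assumed).

```python
import numpy as np

def mare(y_true, y_pred):
    # Mean Absolute Relative Error on the test set
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))

def train_test_split_80_20(n_samples, rng=None):
    # Random 80% / 20% split of the sample indices
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.permutation(n_samples)
    cut = int(0.8 * n_samples)
    return idx[:cut], idx[cut:]
```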
  • 27. [Figure: MARE on the test set for the compared methods.]
  • 29. REAL DATASET: GALAXIES. The observations correspond to $n = 292766$ different galaxies. The response variable $Y$ is the stellar formation rate. The predictor $X$ is of dimension $p = 46$ and is composed of spectral characteristics of the galaxies. We applied Collaborative SIR on the whole dataset to investigate the presence of subgroups and of different directions. Over different runs and numbers of clusters, we consistently observe two subgroups and hence two directions $\beta_1, \beta_2$.
  • 30. [Figure: stellar formation rate against $\beta_i^T X$, $i = 1, 2$.]
  • 31. [Figure: the differences between $\beta_1$ and $\beta_2$.]
  • 32. CONCLUSIONS. Sliced Inverse Regression is a definitely underestimated method: give it a try! Since data nowadays are complex and rich, our Collaborative SIR can capture subgroups and reduce dimensionality in an easy and fast way. FUTURE WORK: Robust SIR. Publications: Chini, Marco, Alessandro Chiancone, and Salvatore Stramondo. "Scale Object Selection (SOS) through a hierarchical segmentation by a multi-spectral per-pixel classification." Pattern Recognition Letters 49 (2014): 214-223.
 Papers in preparation: Alessandro Chiancone, Stephane Girard and Jocelyn Chanussot. "Collaborative Sliced Inverse Regression."
  • 33. REFERENCES. [1] LI, Lexin; LI, Hongzhe. Dimension reduction methods for microarrays with application to censored survival data. Bioinformatics, 2004, 20.18: 3406-3412. [2] BERNARD-MICHEL, Caroline; GARDES, Laurent; GIRARD, Stéphane. Gaussian regularized sliced inverse regression. Statistics and Computing, 2009, 19.1: 85-98. [3] HALL, Peter; LI, Ker-Chau. On almost linearity of low dimensional projections from high dimensional data. The Annals of Statistics, 1993, 867-889. [4] KUENTZ, Vanessa; SARACCO, Jérôme. Cluster-based sliced inverse regression. Journal of the Korean Statistical Society, 2010, 39.2: 251-267. CHAVENT, Marie, et al. A sliced inverse regression approach for data stream. Computational Statistics, 2013, 1-24. CLASSIFICATION / SEGMENTATION / REGRESSION. Thank you.