The document discusses eigen decomposition and singular value decomposition (SVD). It defines eigenvalues and eigenvectors and shows how a square matrix can be factored into a matrix of its eigenvectors, a diagonal matrix of its eigenvalues, and the inverse of the eigenvector matrix. SVD similarly factors a (possibly rectangular) matrix into the product of three matrices: orthogonal matrices containing the left and right singular vectors, and a diagonal matrix containing the singular values. SVD is useful for applications such as matrix inversion, solving linear systems of equations, and dimensionality reduction.
2. Introduction
◼ Eigenvalue decomposition
❑ Spectral decomposition theorem
◼ Physical interpretation of eigenvalues/eigenvectors
◼ Singular Value Decomposition
◼ Importance of SVD
❑ Matrix inversion
❑ Solution to linear system of equations
❑ Solution to a homogeneous system of equations
◼ SVD application
3. What are eigenvalues?
◼ Given a matrix A, x is an eigenvector and λ is the
corresponding eigenvalue if Ax = λx
❑ A must be square, and the determinant of A − λI must
equal zero
Ax − λx = 0 ⇒ (A − λI) x = 0
◼ The trivial solution is x = 0
◼ A non-trivial solution exists only when det(A − λI) = 0
◼ Are eigenvectors unique?
❑ No: if x is an eigenvector, then αx is also an eigenvector,
with the same eigenvalue λ:
A(αx) = α(Ax) = α(λx) = λ(αx)
4. Calculating the Eigenvectors/values
◼ Expand det(A − λI) = 0 for a 2 × 2 matrix
◼ For a 2 × 2 matrix, this is a simple quadratic equation with two solutions
(which may be complex)
◼ This “characteristic equation” is solved for the eigenvalues λ
det(A − λI) = det [ a11 − λ    a12    ]
                  [ a21        a22 − λ ]
            = (a11 − λ)(a22 − λ) − a12 a21
            = λ² − (a11 + a22) λ + (a11 a22 − a12 a21) = 0

with solutions

λ = ( (a11 + a22) ± √[ (a11 + a22)² − 4 (a11 a22 − a12 a21) ] ) / 2
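As a sanity check, here is a minimal NumPy sketch (the 2 × 2 entries are arbitrary illustrative values) that solves this quadratic directly and compares the roots against a library eigenvalue routine:

import numpy as np

# Arbitrary illustrative 2x2 entries
a11, a12, a21, a22 = 1.0, 2.0, 2.0, 4.0

# Roots of lambda^2 - (a11 + a22)*lambda + (a11*a22 - a12*a21) = 0
tr = a11 + a22
det = a11 * a22 - a12 * a21
disc = np.sqrt(tr**2 - 4 * det + 0j)      # +0j so complex roots are allowed
print((tr + disc) / 2, (tr - disc) / 2)   # characteristic-equation roots
print(np.linalg.eigvals(np.array([[a11, a12], [a21, a22]])))  # same values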
5. Eigenvalue example
◼ Consider A = [ 1  2 ; 2  4 ]. The eigenvalues follow from

det(A − λI) = det [ 1−λ   2   ]
                  [ 2     4−λ ]
            = (1 − λ)(4 − λ) − 2·2 = λ² − 5λ = λ(λ − 5) = 0  ⇒  λ = 0, 5

◼ The corresponding eigenvectors can be computed as
❑ For λ = 0: [ 1  2 ; 2  4 ] [x ; y] = [0 ; 0]  ⇒  x + 2y = 0;
one possible solution is x = (2, −1)
❑ For λ = 5: [ 1−5  2 ; 2  4−5 ] [x ; y] = [ −4  2 ; 2  −1 ] [x ; y] = [0 ; 0]  ⇒  2x − y = 0;
one possible solution is x = (1, 2)
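A short NumPy check of this example (a sketch; it just confirms Ax = λx for the two claimed eigenpairs):

import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])

# Claimed eigenpairs from the example above
for lam, x in [(0.0, np.array([2.0, -1.0])), (5.0, np.array([1.0, 2.0]))]:
    print(lam, np.allclose(A @ x, lam * x))   # True, True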
6. Physical interpretation
◼ Consider a covariance matrix A, i.e., A = (1/n) S Sᵀ for some data matrix S
◼ The error ellipse has its major axis along the eigenvector with the
larger eigenvalue and its minor axis along the eigenvector with the
smaller eigenvalue
A = [ 1     0.75 ]      λ1 = 1.75,   λ2 = 0.25
    [ 0.75  1    ]
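The ellipse axes can be confirmed numerically; a minimal sketch with the matrix above:

import numpy as np

A = np.array([[1.0, 0.75], [0.75, 1.0]])   # covariance matrix from the slide

# eigh is the symmetric-matrix solver; eigenvalues come back in ascending order
vals, vecs = np.linalg.eigh(A)
print(vals)    # [0.25 1.75]: minor- and major-axis eigenvalues of the error ellipse
print(vecs)    # columns are the orthonormal axis directions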
7. Physical interpretation
◼ Orthogonal directions of greatest variance in data
◼ Projections along PC 1 (the first Principal Component) discriminate the data
most along any single axis
[Figure: data scattered against Original Variable A and Original Variable B, with the orthogonal directions PC 1 and PC 2 overlaid]
8. Physical interpretation
◼ First principal component is the direction of
greatest variability (covariance) in the data
◼ Second is the next orthogonal (uncorrelated)
direction of greatest variability
❑ So first remove all the variability along the first
component, and then find the next direction of greatest
variability
◼ And so on …
◼ Thus the eigenvectors give the directions of data
variance, in decreasing order of their eigenvalues
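To make this ordering concrete, here is a minimal PCA sketch on synthetic data (the data and its shape are made up purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 1.0], [0.0, 0.5]])  # synthetic data

Xc = X - X.mean(axis=0)                 # center the data
C = Xc.T @ Xc / len(Xc)                 # sample covariance matrix
vals, vecs = np.linalg.eigh(C)          # eigenvalues in ascending order

order = np.argsort(vals)[::-1]          # re-sort descending: PC1 first
vals, vecs = vals[order], vecs[:, order]
print(vals)                             # variance along PC1 >= variance along PC2
print((Xc @ vecs).var(axis=0))          # projection variances match the eigenvalues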
12. Eigen/diagonal Decomposition
◼ Let S be a square matrix with m linearly independent
eigenvectors (a “non-defective” matrix)
◼ Theorem: there exists an eigen decomposition
S = U Λ U⁻¹, with Λ diagonal
❑ (cf. the matrix diagonalization theorem)
❑ Unique for distinct eigenvalues
◼ Columns of U are eigenvectors of S
◼ Diagonal elements of Λ are the eigenvalues of S
13. Diagonal decomposition: why/how
Let U have the eigenvectors as columns:

U = [ v1  v2  …  vn ]

Then SU can be written

SU = S [ v1 … vn ] = [ λ1 v1 … λn vn ]
   = [ v1 … vn ] · diag(λ1, …, λn) = U Λ

Thus SU = UΛ, i.e., U⁻¹SU = Λ, and S = U Λ U⁻¹.
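A minimal numerical check of these identities (the matrix entries are arbitrary, chosen to give distinct eigenvalues):

import numpy as np

S = np.array([[2.0, 1.0], [1.0, 3.0]])             # any non-defective matrix works

vals, U = np.linalg.eig(S)                          # columns of U are eigenvectors
Lam = np.diag(vals)                                 # diagonal eigenvalue matrix

print(np.allclose(S @ U, U @ Lam))                  # SU = U Lambda
print(np.allclose(S, U @ Lam @ np.linalg.inv(U)))   # S = U Lambda U^-1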
16. Symmetric Eigen Decomposition
◼ If S is a symmetric matrix:
◼ Theorem: there exists a (unique) eigen
decomposition S = Q Λ Qᵀ
◼ where Q is orthogonal:
❑ Q⁻¹ = Qᵀ
❑ Columns of Q are normalized eigenvectors
❑ Columns are orthogonal
❑ (everything is real)
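A short sketch of the symmetric case (again with arbitrary entries), showing that the eigenvector matrix is orthogonal:

import numpy as np

S = np.array([[2.0, 1.0], [1.0, 2.0]])             # symmetric example

vals, Q = np.linalg.eigh(S)                         # symmetric/Hermitian solver
print(np.allclose(Q.T @ Q, np.eye(2)))              # Q^-1 = Q^T (orthogonal)
print(np.allclose(S, Q @ np.diag(vals) @ Q.T))      # S = Q Lambda Q^T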
17. Spectral Decomposition theorem
◼ If A is a symmetric and positive definite k × k matrix (xᵀAx > 0)
with eigenvalue-eigenvector pairs (λi, ei), λi > 0, i = 1, …, k,
then A can be written as shown below
❑ This is also called the eigen decomposition theorem
◼ Any symmetric matrix can be reconstructed from its
eigenvalues and eigenvectors
A = λ1 e1 e1ᵀ + λ2 e2 e2ᵀ + … + λk ek ekᵀ
  = Σᵢ λi ei eiᵀ (sum over i = 1, …, k)
  = P Λ Pᵀ

where P = [ e1  e2  …  ek ] and Λ = diag(λ1, λ2, …, λk)
19. Singular Value Decomposition
◼ If A is a rectangular m × k matrix of real numbers, then there exists an
m × m orthogonal matrix U and a k × k orthogonal matrix V such that
A = U Σ Vᵀ (first equation below)
❑ Σ is an m × k matrix whose (i, i)th entry is σi ≥ 0, i = 1, …, min(m, k);
the other entries are zero
◼ The positive constants σi are the singular values of A
◼ If A has rank r, then there exist r positive constants σ1, σ2, …, σr,
r orthonormal m × 1 unit vectors u1, u2, …, ur, and r orthonormal k × 1 unit
vectors v1, v2, …, vr such that (second equation below)
❑ Similar to the spectral decomposition theorem
A (m×k) = U (m×m) · Σ (m×k) · Vᵀ (k×k),   with UUᵀ = VVᵀ = I

A = Σᵢ σi ui viᵀ (sum over i = 1, …, r)
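Both forms are easy to confirm numerically; a minimal sketch with an arbitrary rectangular matrix:

import numpy as np

A = np.arange(6.0).reshape(2, 3)           # arbitrary 2x3 example, rank 2

U, s, Vt = np.linalg.svd(A)                # full SVD: U is 2x2, Vt is 3x3
Sigma = np.zeros_like(A)
Sigma[:len(s), :len(s)] = np.diag(s)       # embed the singular values in m x k
print(np.allclose(A, U @ Sigma @ Vt))      # A = U Sigma V^T

r = np.linalg.matrix_rank(A)
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(r))
print(np.allclose(A, A_sum))               # A = sum_i sigma_i u_i v_i^T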
20. Singular Value Decomposition (contd.)
◼ If A is symmetric and positive definite,
then
❑ SVD = eigen decomposition
◼ In general, the eigenvalues of AAᵀ are the squared
singular values: AAᵀ has the eigenvalue-eigenvector
pairs (σi², ui)
◼ Alternatively, the vi are the eigenvectors of
AᵀA with the same non-zero eigenvalues σi²
AᵀA = V Σ² Vᵀ

AAᵀ = (U Σ Vᵀ)(U Σ Vᵀ)ᵀ = U Σ Vᵀ V Σ Uᵀ = U Σ² Uᵀ
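This relationship can be checked in a few lines (a sketch, using the matrix from the example on the next slide):

import numpy as np

A = np.array([[3.0, 1.0, 1.0], [-1.0, 3.0, 1.0]])

s = np.linalg.svd(A, compute_uv=False)      # singular values sigma_i
evals = np.linalg.eigvalsh(A @ A.T)[::-1]   # eigenvalues of AA^T, descending
print(s**2)                                 # [12. 10.]
print(np.allclose(s**2, evals))             # True: sigma_i^2 are eigenvalues of AA^T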
21. Example for SVD
◼ Consider the rectangular matrix

A = [  3  1  1 ]
    [ −1  3  1 ]

❑ U can be computed from the eigenvectors of AAᵀ:

AAᵀ = [ 11  1 ; 1  11 ],   det(AAᵀ − λI) = 0  ⇒  λ = 12, 10

u1 = (1/√2, 1/√2)ᵀ,   u2 = (1/√2, −1/√2)ᵀ

❑ V can be computed from the eigenvectors of AᵀA:

AᵀA = [ 10  0  2 ; 0  10  4 ; 2  4  2 ],   det(AᵀA − λI) = 0  ⇒  λ = 12, 10, 0

v1 = (1/√6, 2/√6, 1/√6)ᵀ,   v2 = (2/√5, −1/√5, 0)ᵀ,   v3 = (1/√30, 2/√30, −5/√30)ᵀ
22. Example for SVD
◼ Taking σ1² = 12 and σ2² = 10, the singular value
decomposition of A is

A = [  3  1  1 ]
    [ −1  3  1 ]
  = √12 · [ 1/√2 ; 1/√2 ] · ( 1/√6  2/√6  1/√6 )
  + √10 · [ 1/√2 ; −1/√2 ] · ( 2/√5  −1/√5  0 )

◼ Thus U, V, and Σ are computed by performing eigen
decompositions of AAᵀ and AᵀA
◼ Any matrix has a singular value decomposition, but only
certain square matrices (e.g., symmetric positive definite
ones) have an eigen decomposition
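A numerical reproduction of this worked example (a sketch; the solver may flip the signs of individual uᵢ, vᵢ pairs, which leaves each rank-1 term unchanged):

import numpy as np

A = np.array([[3.0, 1.0, 1.0], [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
print(s**2)                                 # [12. 10.] as on the slide

# A = sqrt(12) u1 v1^T + sqrt(10) u2 v2^T
A_rebuilt = s[0] * np.outer(U[:, 0], Vt[0]) + s[1] * np.outer(U[:, 1], Vt[1])
print(np.allclose(A, A_rebuilt))            # True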
23. Applications of SVD in Linear Algebra
◼ Inverse of an n × n square matrix A
❑ If A is non-singular, then A⁻¹ = (U Σ Vᵀ)⁻¹ = V Σ⁻¹ Uᵀ, where
Σ⁻¹ = diag(1/σ1, 1/σ2, …, 1/σn)
❑ If A is singular, an approximate (pseudo-)inverse is A⁺ = V Σ0⁻¹ Uᵀ,
where Σ0⁻¹ = diag(1/σ1, 1/σ2, …, 1/σi, 0, 0, …, 0)
◼ Least-squares solution of an m × n system
❑ Ax = b (A is m × n, m ≥ n)  ⇒  (AᵀA)x = Aᵀb  ⇒  x = (AᵀA)⁻¹Aᵀb = A⁺b
❑ If AᵀA is singular, x = A⁺b ≈ (V Σ0⁻¹ Uᵀ) b, where
Σ0⁻¹ = diag(1/σ1, 1/σ2, …, 1/σi, 0, 0, …, 0)
◼ Condition of a matrix
❑ The condition number measures the degree of singularity of A
◼ The larger σ1/σn is, the closer A is to being singular
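A minimal sketch of the SVD pseudo-inverse and its use in least squares (the matrix A and vector b are arbitrary illustrative values):

import numpy as np

A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])   # m x n with m >= n
b = np.array([1.0, 2.0, 3.0])

U, s, Vt = np.linalg.svd(A, full_matrices=False)     # thin SVD
s_inv = np.where(s > 1e-12, 1.0 / s, 0.0)            # invert only nonzero sigma_i
A_pinv = Vt.T @ np.diag(s_inv) @ U.T                 # A+ = V Sigma0^-1 U^T

x = A_pinv @ b                                        # least-squares solution
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
print(s[0] / s[-1])                                   # condition number sigma_1/sigma_n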
24. Applications of SVD in Linear Algebra
◼ Homogeneous equations, Ax = 0
❑ The minimum-norm solution is x = 0 (trivial solution)
❑ Impose a constraint ‖x‖ = 1
❑ “Constrained” optimization problem:

min ‖Ax‖  subject to  ‖x‖ = 1

❑ Special case
◼ If rank(A) = n − 1 (m ≥ n − 1, σn = 0),
then x = α vn (α is a constant)
❑ General case
◼ If rank(A) = n − k (m ≥ n − k, σ_{n−k+1} = … = σn = 0),
then x = α1 v_{n−k+1} + … + αk vn with α1² + … + αk² = 1
For proof: Johnson and Wichern, “Applied Multivariate Statistical Analysis”, pg. 79
◼ Has appeared before:
❑ Homogeneous solution of a linear
system of equations
❑ Computation of homography
using DLT
❑ Estimation of the fundamental matrix
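A minimal sketch of the special case: the unit-norm minimizer of ‖Ax‖ is the right singular vector belonging to the smallest singular value (the rank-deficient matrix below is constructed for illustration):

import numpy as np

# Rank-2 3x3 matrix: the second row is twice the first
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
x = Vt[-1]                        # right singular vector for the smallest sigma
print(s[-1])                      # ~ 0, so rank(A) = n - 1
print(np.linalg.norm(A @ x))      # ~ 0: x solves Ax = 0 with ||x|| = 1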
25. What is the use of SVD?
◼ SVD can be used to compute
optimal low-rank approximations
of arbitrary matrices.
◼ Face recognition
❑ Represent the face images as
eigenfaces and compute distance
between the query face image in the
principal component space
◼ Data mining
❑ Latent Semantic Indexing for
document extraction
◼ Image compression
❑ The Karhunen-Loève (KL) transform
gives the best image compression
(optimal energy compaction)
◼ In MPEG, the Discrete Cosine
Transform (DCT) is used as the closest
approximation to the KL transform
in terms of PSNR
28. Low-rank Approximation
◼ SVD can be used to compute optimal low-
rank approximations of arbitrary matrices
◼ Approximation problem: find Ak of rank k
such that

Ak = argmin_{X : rank(X) = k} ‖A − X‖_F    (Frobenius norm)

◼ Ak and X are both m × n matrices.
Typically, we want k << r.
29. Low-rank Approximation
◼ Solution via SVD: set the smallest r − k
singular values to zero:

Ak = U · diag(σ1, …, σk, 0, …, 0) · Vᵀ

◼ In column notation, this is a sum of rank-1 matrices:

Ak = Σᵢ σi ui viᵀ (sum over i = 1, …, k)
30. Approximation error
◼ How good (bad) is this approximation?
◼ It’s the best possible, as measured by the
Frobenius norm of the error:

min_{X : rank(X) = k} ‖A − X‖_F = ‖A − Ak‖_F = √(σ_{k+1}² + … + σ_r²)

where the σi are ordered such that σi ≥ σ_{i+1}.
This suggests why the Frobenius error drops as k is
increased.
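The error formula can be confirmed numerically; a short sketch reusing the truncation from the previous slide:

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]        # rank-k truncation
print(np.linalg.norm(A - Ak, 'fro'))           # Frobenius error
print(np.sqrt((s[k:]**2).sum()))               # identical value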