The document discusses eigen decomposition and singular value decomposition (SVD). It defines eigenvalues and eigenvectors and shows how a square matrix can be factored into a matrix of its eigenvectors, a diagonal matrix of its eigenvalues, and the inverse of the eigenvector matrix. SVD similarly factors a (possibly rectangular) matrix into the product of three matrices: orthogonal matrices containing the left and right singular vectors, and a diagonal matrix containing the singular values. SVD is useful for applications such as matrix inversion, solving linear systems of equations, and dimensionality reduction.
2. Introduction
◼ Eigenvalue decomposition
❑ Spectral decomposition theorem
◼ Physical interpretation of eigenvalues/eigenvectors
◼ Singular Value Decomposition
◼ Importance of SVD
❑ Matrix inversion
❑ Solution to linear system of equations
❑ Solution to a homogeneous system of equations
◼ SVD application
3. What are eigenvalues?
◼ Given a matrix A, x is an eigenvector and λ is the
corresponding eigenvalue if Ax = λx
❑ A must be square, and the determinant of A − λI must
equal zero
Ax − λx = 0 ⇒ (A − λI) x = 0
◼ The trivial solution is x = 0
◼ A non-trivial solution exists only when det(A − λI) = 0
◼ Are eigenvectors unique?
❑ No: if x is an eigenvector, then αx is also an eigenvector,
with the same eigenvalue λ:
A(αx) = α(Ax) = α(λx) = λ(αx)
4. Calculating the Eigenvectors/values
◼ Expand det(A − λI) = 0 for a 2 × 2 matrix
◼ For a 2 × 2 matrix, this is a simple quadratic equation with two solutions
(which may be complex)
◼ This “characteristic equation” is solved for the eigenvalues λ
det(A − λI) = det [ a11 − λ    a12    ]
                  [ a21        a22 − λ ]
            = (a11 − λ)(a22 − λ) − a12 a21
            = λ² − (a11 + a22) λ + (a11 a22 − a12 a21) = 0

with solutions

λ = ( (a11 + a22) ± √[ (a11 + a22)² − 4 (a11 a22 − a12 a21) ] ) / 2
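As a sanity check, here is a minimal NumPy sketch (the 2 × 2 entries are arbitrary illustrative values) that solves this quadratic directly and compares the roots against a library eigenvalue routine:

import numpy as np

# Arbitrary illustrative 2x2 entries
a11, a12, a21, a22 = 1.0, 2.0, 2.0, 4.0

# Roots of lambda^2 - (a11 + a22)*lambda + (a11*a22 - a12*a21) = 0
tr = a11 + a22
det = a11 * a22 - a12 * a21
disc = np.sqrt(tr**2 - 4 * det + 0j)      # +0j so complex roots are allowed
print((tr + disc) / 2, (tr - disc) / 2)   # characteristic-equation roots
print(np.linalg.eigvals(np.array([[a11, a12], [a21, a22]])))  # same values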
5. Eigenvalue example
◼ Consider A = [ 1  2 ; 2  4 ]. The eigenvalues follow from

det(A − λI) = det [ 1−λ   2   ]
                  [ 2     4−λ ]
            = (1 − λ)(4 − λ) − 2·2 = λ² − 5λ = λ(λ − 5) = 0  ⇒  λ = 0, 5

◼ The corresponding eigenvectors can be computed as
❑ For λ = 0: [ 1  2 ; 2  4 ] [x ; y] = [0 ; 0]  ⇒  x + 2y = 0;
one possible solution is x = (2, −1)
❑ For λ = 5: [ 1−5  2 ; 2  4−5 ] [x ; y] = [ −4  2 ; 2  −1 ] [x ; y] = [0 ; 0]  ⇒  2x − y = 0;
one possible solution is x = (1, 2)
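A short NumPy check of this example (a sketch; it just confirms Ax = λx for the two claimed eigenpairs):

import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])

# Claimed eigenpairs from the example above
for lam, x in [(0.0, np.array([2.0, -1.0])), (5.0, np.array([1.0, 2.0]))]:
    print(lam, np.allclose(A @ x, lam * x))   # True, True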
6. Physical interpretation
◼ Consider a covariance matrix A, i.e., A = (1/n) S Sᵀ for some data matrix S
◼ The error ellipse has its major axis along the eigenvector with the
larger eigenvalue and its minor axis along the eigenvector with the
smaller eigenvalue
A = [ 1     0.75 ]      λ1 = 1.75,   λ2 = 0.25
    [ 0.75  1    ]
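The ellipse axes can be confirmed numerically; a minimal sketch with the matrix above:

import numpy as np

A = np.array([[1.0, 0.75], [0.75, 1.0]])   # covariance matrix from the slide

# eigh is the symmetric-matrix solver; eigenvalues come back in ascending order
vals, vecs = np.linalg.eigh(A)
print(vals)    # [0.25 1.75]: minor- and major-axis eigenvalues of the error ellipse
print(vecs)    # columns are the orthonormal axis directions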
7. Physical interpretation
◼ Orthogonal directions of greatest variance in data
◼ Projections along PC 1 (the first Principal Component) discriminate the data
most along any single axis
[Figure: data scattered against Original Variable A and Original Variable B, with the orthogonal directions PC 1 and PC 2 overlaid]
8. Physical interpretation
◼ First principal component is the direction of
greatest variability (covariance) in the data
◼ Second is the next orthogonal (uncorrelated)
direction of greatest variability
❑ So first remove all the variability along the first
component, and then find the next direction of greatest
variability
◼ And so on …
◼ Thus the eigenvectors give the directions of data
variance, in decreasing order of their eigenvalues
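To make this ordering concrete, here is a minimal PCA sketch on synthetic data (the data and its shape are made up purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 1.0], [0.0, 0.5]])  # synthetic data

Xc = X - X.mean(axis=0)                 # center the data
C = Xc.T @ Xc / len(Xc)                 # sample covariance matrix
vals, vecs = np.linalg.eigh(C)          # eigenvalues in ascending order

order = np.argsort(vals)[::-1]          # re-sort descending: PC1 first
vals, vecs = vals[order], vecs[:, order]
print(vals)                             # variance along PC1 >= variance along PC2
print((Xc @ vecs).var(axis=0))          # projection variances match the eigenvalues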
12. Eigen/diagonal Decomposition
◼ Let S be a square matrix with m linearly independent
eigenvectors (a “non-defective” matrix)
◼ Theorem: there exists an eigen decomposition
S = U Λ U⁻¹, with Λ diagonal
❑ (cf. the matrix diagonalization theorem)
❑ Unique for distinct eigenvalues
◼ Columns of U are eigenvectors of S
◼ Diagonal elements of Λ are the eigenvalues of S
13. Diagonal decomposition: why/how
Let U have the eigenvectors as columns:

U = [ v1  v2  …  vn ]

Then SU can be written

SU = S [ v1 … vn ] = [ λ1 v1 … λn vn ]
   = [ v1 … vn ] · diag(λ1, …, λn) = U Λ

Thus SU = UΛ, i.e., U⁻¹SU = Λ, and S = U Λ U⁻¹.
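A minimal numerical check of these identities (the matrix entries are arbitrary, chosen to give distinct eigenvalues):

import numpy as np

S = np.array([[2.0, 1.0], [1.0, 3.0]])             # any non-defective matrix works

vals, U = np.linalg.eig(S)                          # columns of U are eigenvectors
Lam = np.diag(vals)                                 # diagonal eigenvalue matrix

print(np.allclose(S @ U, U @ Lam))                  # SU = U Lambda
print(np.allclose(S, U @ Lam @ np.linalg.inv(U)))   # S = U Lambda U^-1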
16. Symmetric Eigen Decomposition
◼ If S is a symmetric matrix:
◼ Theorem: there exists a (unique) eigen
decomposition S = Q Λ Qᵀ
◼ where Q is orthogonal:
❑ Q⁻¹ = Qᵀ
❑ Columns of Q are normalized eigenvectors
❑ Columns are orthogonal
❑ (everything is real)
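A short sketch of the symmetric case (again with arbitrary entries), showing that the eigenvector matrix is orthogonal:

import numpy as np

S = np.array([[2.0, 1.0], [1.0, 2.0]])             # symmetric example

vals, Q = np.linalg.eigh(S)                         # symmetric/Hermitian solver
print(np.allclose(Q.T @ Q, np.eye(2)))              # Q^-1 = Q^T (orthogonal)
print(np.allclose(S, Q @ np.diag(vals) @ Q.T))      # S = Q Lambda Q^T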
17. Spectral Decomposition theorem
◼ If A is a symmetric and positive definite k × k matrix (xᵀAx > 0)
with eigenvalue-eigenvector pairs (λi, ei), λi > 0, i = 1, …, k,
then A can be written as shown below
❑ This is also called the eigen decomposition theorem
◼ Any symmetric matrix can be reconstructed from its
eigenvalues and eigenvectors
A = λ1 e1 e1ᵀ + λ2 e2 e2ᵀ + … + λk ek ekᵀ
  = Σᵢ λi ei eiᵀ (sum over i = 1, …, k)
  = P Λ Pᵀ

where P = [ e1  e2  …  ek ] and Λ = diag(λ1, λ2, …, λk)
19. Singular Value Decomposition
◼ If A is a rectangular m × k matrix of real numbers, then there exists an
m × m orthogonal matrix U and a k × k orthogonal matrix V such that
A = U Σ Vᵀ (first equation below)
❑ Σ is an m × k matrix whose (i, i)th entry is σi ≥ 0, i = 1, …, min(m, k);
the other entries are zero
◼ The positive constants σi are the singular values of A
◼ If A has rank r, then there exist r positive constants σ1, σ2, …, σr,
r orthonormal m × 1 unit vectors u1, u2, …, ur, and r orthonormal k × 1 unit
vectors v1, v2, …, vr such that (second equation below)
❑ Similar to the spectral decomposition theorem
A (m×k) = U (m×m) · Σ (m×k) · Vᵀ (k×k),   with UUᵀ = VVᵀ = I

A = Σᵢ σi ui viᵀ (sum over i = 1, …, r)
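Both forms are easy to confirm numerically; a minimal sketch with an arbitrary rectangular matrix:

import numpy as np

A = np.arange(6.0).reshape(2, 3)           # arbitrary 2x3 example, rank 2

U, s, Vt = np.linalg.svd(A)                # full SVD: U is 2x2, Vt is 3x3
Sigma = np.zeros_like(A)
Sigma[:len(s), :len(s)] = np.diag(s)       # embed the singular values in m x k
print(np.allclose(A, U @ Sigma @ Vt))      # A = U Sigma V^T

r = np.linalg.matrix_rank(A)
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(r))
print(np.allclose(A, A_sum))               # A = sum_i sigma_i u_i v_i^T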
20. Singular Value Decomposition (contd.)
◼ If A is symmetric and positive definite,
then
❑ SVD = eigen decomposition
◼ In general, the eigenvalues of AAᵀ are the squared
singular values: AAᵀ has the eigenvalue-eigenvector
pairs (σi², ui)
◼ Alternatively, the vi are the eigenvectors of
AᵀA with the same non-zero eigenvalues σi²
AᵀA = V Σ² Vᵀ

AAᵀ = (U Σ Vᵀ)(U Σ Vᵀ)ᵀ = U Σ Vᵀ V Σ Uᵀ = U Σ² Uᵀ
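This relationship can be checked in a few lines (a sketch, using the matrix from the example on the next slide):

import numpy as np

A = np.array([[3.0, 1.0, 1.0], [-1.0, 3.0, 1.0]])

s = np.linalg.svd(A, compute_uv=False)      # singular values sigma_i
evals = np.linalg.eigvalsh(A @ A.T)[::-1]   # eigenvalues of AA^T, descending
print(s**2)                                 # [12. 10.]
print(np.allclose(s**2, evals))             # True: sigma_i^2 are eigenvalues of AA^T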
21. Example for SVD
◼ Consider the rectangular matrix

A = [  3  1  1 ]
    [ −1  3  1 ]

❑ U can be computed from the eigenvectors of AAᵀ:

AAᵀ = [ 11  1 ; 1  11 ],   det(AAᵀ − λI) = 0  ⇒  λ = 12, 10

u1 = (1/√2, 1/√2)ᵀ,   u2 = (1/√2, −1/√2)ᵀ

❑ V can be computed from the eigenvectors of AᵀA:

AᵀA = [ 10  0  2 ; 0  10  4 ; 2  4  2 ],   det(AᵀA − λI) = 0  ⇒  λ = 12, 10, 0

v1 = (1/√6, 2/√6, 1/√6)ᵀ,   v2 = (2/√5, −1/√5, 0)ᵀ,   v3 = (1/√30, 2/√30, −5/√30)ᵀ
22. Example for SVD
◼ Taking σ1² = 12 and σ2² = 10, the singular value
decomposition of A is

A = [  3  1  1 ]
    [ −1  3  1 ]
  = √12 · [ 1/√2 ; 1/√2 ] · ( 1/√6  2/√6  1/√6 )
  + √10 · [ 1/√2 ; −1/√2 ] · ( 2/√5  −1/√5  0 )

◼ Thus U, V, and Σ are computed by performing eigen
decompositions of AAᵀ and AᵀA
◼ Any matrix has a singular value decomposition, but only
certain square matrices (e.g., symmetric positive definite
ones) have an eigen decomposition
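A numerical reproduction of this worked example (a sketch; the solver may flip the signs of individual uᵢ, vᵢ pairs, which leaves each rank-1 term unchanged):

import numpy as np

A = np.array([[3.0, 1.0, 1.0], [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
print(s**2)                                 # [12. 10.] as on the slide

# A = sqrt(12) u1 v1^T + sqrt(10) u2 v2^T
A_rebuilt = s[0] * np.outer(U[:, 0], Vt[0]) + s[1] * np.outer(U[:, 1], Vt[1])
print(np.allclose(A, A_rebuilt))            # True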
23. Applications of SVD in Linear Algebra
◼ Inverse of an n × n square matrix A
❑ If A is non-singular, then A⁻¹ = (U Σ Vᵀ)⁻¹ = V Σ⁻¹ Uᵀ, where
Σ⁻¹ = diag(1/σ1, 1/σ2, …, 1/σn)
❑ If A is singular, an approximate (pseudo-)inverse is A⁺ = V Σ0⁻¹ Uᵀ,
where Σ0⁻¹ = diag(1/σ1, 1/σ2, …, 1/σi, 0, 0, …, 0)
◼ Least-squares solution of an m × n system
❑ Ax = b (A is m × n, m ≥ n)  ⇒  (AᵀA)x = Aᵀb  ⇒  x = (AᵀA)⁻¹Aᵀb = A⁺b
❑ If AᵀA is singular, x = A⁺b ≈ (V Σ0⁻¹ Uᵀ) b, where
Σ0⁻¹ = diag(1/σ1, 1/σ2, …, 1/σi, 0, 0, …, 0)
◼ Condition of a matrix
❑ The condition number measures the degree of singularity of A
◼ The larger σ1/σn is, the closer A is to being singular
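A minimal sketch of the SVD pseudo-inverse and its use in least squares (the matrix A and vector b are arbitrary illustrative values):

import numpy as np

A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])   # m x n with m >= n
b = np.array([1.0, 2.0, 3.0])

U, s, Vt = np.linalg.svd(A, full_matrices=False)     # thin SVD
s_inv = np.where(s > 1e-12, 1.0 / s, 0.0)            # invert only nonzero sigma_i
A_pinv = Vt.T @ np.diag(s_inv) @ U.T                 # A+ = V Sigma0^-1 U^T

x = A_pinv @ b                                        # least-squares solution
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
print(s[0] / s[-1])                                   # condition number sigma_1/sigma_n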
24. Applications of SVD in Linear Algebra
◼ Homogeneous equations, Ax = 0
❑ The minimum-norm solution is x = 0 (trivial solution)
❑ Impose a constraint ‖x‖ = 1
❑ “Constrained” optimization problem:

min ‖Ax‖  subject to  ‖x‖ = 1

❑ Special case
◼ If rank(A) = n − 1 (m ≥ n − 1, σn = 0),
then x = α vn (α is a constant)
❑ General case
◼ If rank(A) = n − k (m ≥ n − k, σ_{n−k+1} = … = σn = 0),
then x = α1 v_{n−k+1} + … + αk vn with α1² + … + αk² = 1
For proof: Johnson and Wichern, “Applied Multivariate Statistical Analysis”, pg. 79
◼ Has appeared before:
❑ Homogeneous solution of a linear
system of equations
❑ Computation of homography
using DLT
❑ Estimation of the fundamental matrix
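A minimal sketch of the special case: the unit-norm minimizer of ‖Ax‖ is the right singular vector belonging to the smallest singular value (the rank-deficient matrix below is constructed for illustration):

import numpy as np

# Rank-2 3x3 matrix: the second row is twice the first
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
x = Vt[-1]                        # right singular vector for the smallest sigma
print(s[-1])                      # ~ 0, so rank(A) = n - 1
print(np.linalg.norm(A @ x))      # ~ 0: x solves Ax = 0 with ||x|| = 1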
25. What is the use of SVD?
◼ SVD can be used to compute
optimal low-rank approximations
of arbitrary matrices.
◼ Face recognition
❑ Represent the face images as
eigenfaces and compute distance
between the query face image in the
principal component space
◼ Data mining
❑ Latent Semantic Indexing for
document extraction
◼ Image compression
❑ The Karhunen-Loève (KL) transform
gives the best image compression
(optimal energy compaction)
◼ In MPEG, the Discrete Cosine
Transform (DCT) is used as the closest
approximation to the KL transform
in terms of PSNR
28. Low-rank Approximation
◼ SVD can be used to compute optimal low-
rank approximations of arbitrary matrices
◼ Approximation problem: find Ak of rank k
such that

Ak = argmin_{X : rank(X) = k} ‖A − X‖_F    (Frobenius norm)

◼ Ak and X are both m × n matrices.
Typically, we want k << r.
29. Low-rank Approximation
◼ Solution via SVD: set the smallest r − k
singular values to zero:

Ak = U · diag(σ1, …, σk, 0, …, 0) · Vᵀ

◼ In column notation, this is a sum of rank-1 matrices:

Ak = Σᵢ σi ui viᵀ (sum over i = 1, …, k)
30. Approximation error
◼ How good (bad) is this approximation?
◼ It’s the best possible, as measured by the
Frobenius norm of the error:

min_{X : rank(X) = k} ‖A − X‖_F = ‖A − Ak‖_F = √(σ_{k+1}² + … + σ_r²)

where the σi are ordered such that σi ≥ σ_{i+1}.
This suggests why the Frobenius error drops as k is
increased.
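The error formula can be confirmed numerically; a short sketch reusing the truncation from the previous slide:

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]        # rank-k truncation
print(np.linalg.norm(A - Ak, 'fro'))           # Frobenius error
print(np.sqrt((s[k:]**2).sum()))               # identical value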