Theory and Toolkits of PCA
2009 5/4 IRLab Study Group
Presenter: Chin-Hui Chen
Agenda
 Theory :
◦ 1. Scenario
◦ 2. What is PCA?
◦ 3. How to minimize Squared-Error ?
◦ 4. Dimensionality Reduction
 Toolkit :
◦ A list of PCA toolkits
◦ Demo
Scenario (Point? Line?)
 Consider a 2-dimensional space
 Criterion : least squared error
What is PCA ? (1)
 Principal component analysis (PCA) involves a mathematical procedure that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called “principal components”.
What is PCA ? (2)
 What can PCA do ?
◦ Dimensionality Reduction
 For example :
◦ Assume N points in a D-dimensional space
◦ e.g. {x1, x2, x3, x4} ; xi = (v1, v2)
◦ Choose a set of M basis vectors for projection
◦ e.g. {u1}
 They form an orthonormal basis (each has length 1, and every pair has inner product 0)
 M << D (represent the features in M dimensions)
◦ e.g. xi = (p1)
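The example above can be sketched in NumPy. This is only an illustration: the four points and the direction u1 are made-up values, and u1 is an arbitrary unit vector rather than the PCA direction derived later.

```python
import numpy as np

# Four hypothetical 2-D points x1..x4, one per row (values are made up).
X = np.array([[2.0, 1.0],
              [3.0, 2.0],
              [4.0, 2.5],
              [5.0, 4.0]])

# A single unit-length basis vector u1 (arbitrary choice, not yet the PCA direction).
u1 = np.array([1.0, 1.0]) / np.sqrt(2.0)

# Each point xi = (v1, v2) is reduced to one coordinate p1 = u1 . xi
p = X @ u1
print(p.shape)  # (4,)
```

Here D = 2 and M = 1, so each 2-D point is represented by a single number.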
How to minimize Squared-Error ?
 Consider a D-dimensional space
◦ Given N points : {x1, x2, …, xN}
◦ Each xi is a D-dimensional vector
 How to
◦ 1. Find a point that minimizes the squared error
◦ 2. Find a line that minimizes the squared error
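How to ? - Point
◦ Goal : Find x0 s.t. the squared error $J_0(x_0) = \sum_{k=1}^{N} \lVert x_0 - x_k \rVert^2$ is minimized.
◦ Let $m = \frac{1}{N} \sum_{k=1}^{N} x_k$ (the sample mean).
◦ Then $J_0(x_0) = N \lVert x_0 - m \rVert^2 + \sum_{k=1}^{N} \lVert x_k - m \rVert^2$, and the second term does not depend on x0.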
How to ? – Point - Line
 ∴ x0 = m
◦ 1. Find a point that minimizes the squared error
◦ 2. Find a line that minimizes the squared error
 L : xk’ - x0 = ake (e is a unit vector along the line, ak is a scalar)
 xk’ = x0 + ake
 = m + ake
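A quick numerical check of the point case, as a sketch on randomly generated data (the data and the helper name squared_error are assumptions for illustration): the sample mean should give a smaller squared error than any perturbed point.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))      # 50 made-up points in 3-D, one per row

def squared_error(x0, X):
    """Sum over k of ||x0 - xk||^2 (hypothetical helper)."""
    return float(np.sum((X - x0) ** 2))

m = X.mean(axis=0)                # candidate minimizer: the sample mean
best = squared_error(m, X)

# Any perturbed point should give a larger error than the mean.
for _ in range(5):
    x0 = m + rng.normal(scale=0.5, size=3)
    assert squared_error(x0, X) > best
```

The gap between the two errors is N times the squared distance from x0 to m, which is why the mean is the unique minimizer.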
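How to ? – Line
 L : xk’ = m + ake
 Goal :
 Find a1…aN (and e) that minimize
 $J_1(a_1, \dots, a_N, e) = \sum_{k=1}^{N} \lVert (m + a_k e) - x_k \rVert^2 = \sum_{k=1}^{N} a_k^2 \lVert e \rVert^2 - 2 \sum_{k=1}^{N} a_k e^t (x_k - m) + \sum_{k=1}^{N} \lVert x_k - m \rVert^2$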
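How to ? – Line
 Differentiating each term with respect to ak gives [2ak – 2et(xk-m)] (using |e| = 1)
 Setting this to zero gives ak = et(xk-m)
 What does it mean ?
xk’ = m + ake, where ak is the signed coordinate of the projection of (xk-m) onto e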
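The coefficient formula can be tried out directly. The sketch below uses made-up 2-D data and an arbitrary unit direction e (both assumptions), computes ak = et(xk-m) for every point, and checks that the leftover residual is orthogonal to e.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))           # made-up 2-D points, one per row
m = X.mean(axis=0)

e = np.array([3.0, 4.0]) / 5.0         # an arbitrary unit direction (assumed, not optimal)

a = (X - m) @ e                        # ak = e^t (xk - m) for every k
X_proj = m + np.outer(a, e)            # xk' = m + ak e

# The residual xk - xk' is orthogonal to e, which is what makes ak optimal.
residual = X - X_proj
print(np.allclose(residual @ e, 0.0))  # True
```

Shifting every ak by any nonzero amount only moves xk’ along the line and increases the squared error.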
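How to ? – Line
 Then, what about e ?
How to ? – Line
 Substituting ak = et(xk-m) gives $J_1(e) = -e^t S e + \sum_{k=1}^{N} \lVert x_k - m \rVert^2$
 Let $S = \sum_{k=1}^{N} (x_k - m)(x_k - m)^t$
 The term $\sum_{k=1}^{N} \lVert x_k - m \rVert^2$ is independent of e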
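How to ? – Line
f(x,y) -> find the extremum by setting the derivatives to zero
But if x,y are constrained by g(x,y)=0 -> use a Lagrange multiplier
 J’1(e) = -etSe (the part of J1 that depends on e), so minimizing J1 means maximizing etSe
 Use a Lagrange multiplier : optimize f subject to g = 0 through u = f – λg
 Because |e| = 1 , u = etSe – λ(ete-1)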
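How to ? – Line
 Setting $\partial u / \partial e = 2Se - 2\lambda e = 0$
◦ What is S ?
 The covariance matrix of the data (up to a constant factor, which does not change its eigenvectors)
◦ Assume D-dim : S is a D×D matrix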
How to ? – Line
 Se = λe, and we know S.
 Then, what is e ? An eigenvector of S.
Ax = λx has the same form as Se = λe, the eigenvalue equation.
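As a rough numerical illustration of this result (made-up data; the use of NumPy's eigh routine is a choice made here, not something the slides specify), the eigenvector of S with the largest eigenvalue satisfies Se = λe and gives the largest value of etSe among unit vectors.

```python
import numpy as np

rng = np.random.default_rng(2)
# Made-up correlated 2-D data, one point per row.
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.5, 0.5]])
m = X.mean(axis=0)
Xc = X - m

S = Xc.T @ Xc                       # sum over k of (xk - m)(xk - m)^t

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
e = eigvecs[:, -1]                  # eigenvector with the largest eigenvalue

print(np.allclose(S @ e, eigvals[-1] * e))   # True: Se = λe

# No other unit vector should give a larger e^t S e.
for _ in range(100):
    v = rng.normal(size=2)
    v /= np.linalg.norm(v)
    assert v @ S @ v <= (e @ S @ e) * (1 + 1e-9)
```

Since etSe = λ for a unit eigenvector, picking the largest eigenvalue minimizes the squared error of the line fit.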
How to ? – conclusion
 Summary :
◦ Find a line : xk’= m + ake
 ak = et(xk-m)
 Se = λe ; e is an eigenvector of the covariance matrix (the one with the largest eigenvalue λ, since etSe = λ)
◦ In a D-dimensional space, S has D eigenvectors.
Dimensionality Reduction
Dimensionality Reduction
 Consider a 2-dim space …
Original coordinates : X1 = (a,b), X2 = (c,d)
Coordinates on the new axes : X1 = (a’,b’), X2 = (c’,d’)
We are going to keep only the first new coordinate :
X1 = (a’)
X2 = (c’)
Dimensionality Reduction
 We want to prove :
◦ The new axes of the data are uncorrelated.
 Consider N m-dim vectors
◦ {x1, x2, … ,xN}
◦ Let X=[x1-m x2-m … xN-m]T, where the m being subtracted is the mean vector (not the dimension m)
◦ Let E = [e1 e2 … em]
Eigen decomposition Se = λe gives
Eigenvectors {e1,…,em}
Eigenvalues {λ1,…, λm}
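Dimensionality Reduction
 $SE = [Se_1 \; Se_2 \; \dots \; Se_m]$
 $= [\lambda_1 e_1 \; \lambda_2 e_2 \; \dots \; \lambda_m e_m]$
 $= [e_1 \; e_2 \; \dots \; e_m] \, \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_m)$
 $= ED$, where $D = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$
 $S = EDE^{-1}$
$E = [e_1 \; e_2 \; \dots \; e_m]$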
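The decomposition can be checked numerically. This is a sketch on made-up 3-D data; computing S as the sample covariance and using NumPy's eigh are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3)) * np.array([3.0, 1.0, 0.2])   # made-up 3-D data
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix

eigvals, E = np.linalg.eigh(S)        # columns of E are e1..em
D = np.diag(eigvals)

print(np.allclose(S @ E, E @ D))                  # True: SE = ED
print(np.allclose(S, E @ D @ np.linalg.inv(E)))   # True: S = E D E^-1
print(np.allclose(np.linalg.inv(E), E.T))         # True: E is orthogonal, so E^-1 = E^t
```

Because S is symmetric, its eigenvectors can be chosen orthonormal, so the inverse of E is simply its transpose.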
Dimensionality Reduction
 We want to know the new covariance matrix of the projected vectors.
 Let Y = [y1 y2 … yN]T
 E = [e1 e2 … em]
 $y_k = E^T (x_k - m)$, i.e. $Y = XE$
 $S_Y = E^T S E = E^T (E D E^{-1}) E$
Dimensionality Reduction
 SY = D (because ETE = I)
 1. The covariance between any two different axes is 0.
 2. The better an axis represents the data, the larger the variance along that axis
 -> the larger its λ
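To see this on numbers, the following sketch (made-up correlated data; the mixing matrix A is purely illustrative) projects the centered data onto the eigenvectors and checks that the covariance of the new coordinates is the diagonal matrix of eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(4)
# Made-up correlated data: 500 points in 3-D, one per row.
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.3, 0.2]])
X = rng.normal(size=(500, 3)) @ A
Xc = X - X.mean(axis=0)

S = np.cov(Xc, rowvar=False)          # covariance of the original axes
eigvals, E = np.linalg.eigh(S)

Y = Xc @ E                            # row k is yk = E^t (xk - m)
S_Y = np.cov(Y, rowvar=False)         # covariance of the new axes

print(np.allclose(S_Y, np.diag(eigvals)))   # True: S_Y = D
```

The off-diagonal entries of S_Y vanish, and each diagonal entry is the variance of the data along one eigenvector.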
Dimensionality Reduction
 Conclusion :
 If we want to reduce dimension D to M (M<<D) :
 1. Find the covariance matrix S
 2. Compute its eigenvalues and eigenvectors
 3. Select the M eigenvectors with the largest eigenvalues
 4. Project the data onto them
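The four steps can be put together in a few lines of NumPy. This is a minimal sketch under stated assumptions, not a reference implementation: pca_reduce is a hypothetical helper name, the data is randomly generated, and points are stored one per row.

```python
import numpy as np

def pca_reduce(X, M):
    """Reduce the rows of X from D dimensions to M (hypothetical helper)."""
    m = X.mean(axis=0)
    Xc = X - m
    S = np.cov(Xc, rowvar=False)              # 1. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)      # 2. eigenvalues / eigenvectors (ascending)
    top = np.argsort(eigvals)[::-1][:M]       # 3. indices of the top-M eigenvalues
    E_M = eigvecs[:, top]                     #    D x M projection basis
    return Xc @ E_M, eigvals[top]             # 4. projected data, kept eigenvalues

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
Y, kept = pca_reduce(X, 2)
print(Y.shape)   # (300, 2)
```

The returned eigenvalues are the variances of the kept coordinates, which gives a simple way to judge how much of the data each retained axis represents.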
Toolkits
A List of PCA Toolkits
 C & Java
◦ Fionn Murtagh's Multivariate Data Analysis Software and Resources
◦ http://astro.u-strasbg.fr/~fmurtagh/mda-sw/
 Perl
◦ PDL::PCA
 Matlab
◦ Statistics Toolbox™ : princomp
 Weka
◦ weka.attributeSelection.PrincipalComponents (http://www.laps.ufpa.br/aldebaro/weka/feature_selection.html)
A List of PCA Toolkits
 C & Java
◦ Fionn Murtagh's Multivariate Data Analysis Software and Resources
◦ http://astro.u-strasbg.fr/~fmurtagh/mda-sw/
C :
Download: pca.c
Compile: cc pca.c -lm -o pcac
Run: ./pcac spectr.dat 36 8 R > pcaout.c.txt
Java :
Download: JAMA, PCAcorr.java
Compile: javac -classpath Jama-1.0.2.jar PCAcorr.java
Run: java PCAcorr iris.dat > pcaout.java.txt

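Editor's notes

  1. This means that once e is known, projecting any point xk in the space onto the line L only requires shifting its original coordinates by the mean and taking the inner product with e; the result is the point's new coordinate after the change of space.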