
Introduction to Linear Discriminant Analysis


  1. Introduction to Linear Discriminant Analysis Jaclyn Kokx
  2. 1. Who am I? 2. What is LDA? 3. Why use LDA? 4. How do I use LDA? https://memegenerator.net/instance/55587306/winter-is-coming-brace-yourself-powerpoint-slides-are-coming
  3. Who Am I?
  4. Jackie Kokx MS Mechanical Engineering BS Materials Science & Engineering Biotech Industry Algebra Instructor Self-Taught Data Enthusiast Ultimate Frisbee, Running, Mom www.jaclynkokx.com
  5. What is LDA?
  6. Linear Discriminant Analysis A supervised dimensionality reduction technique to be used with continuous independent variables and a categorical dependent variable A linear combination of features separates two or more classes Because it works with numbers and sounds science-y
  7. More LDA Ronald Fisher Fisher’s Linear Discriminant (2-Class Method) - 1936 C. R. Rao Multiple Discriminant Analysis - 1948 Also used as a data classification technique http://www.swlearning.com/quant/kohler/stat/biographical_sketches/Fisher_3.jpeg http://www.buffalo.edu/ubreporter/archive/2011_07_21/rao_guy_medal.html
  8. Why Would I Want to Use LDA?
  9. Because real-life data is ugly. Iris Dataset Your (My) Dataset
  10. The curse of dimensionality http://www.visiondummy.com/2014/04/curse-dimensionality-affect-classification/ Reducing the number of your features might improve your classifier
  11. How much can I reduce my number of features? To one less than the number of labeled classes you have. [Figure: my waterpoint dataset before LDA, and after LDA with axes LDA1/LDA2; classes Functional, Functional Repair, Non-Functional] 3 Classes - 1 = 2 features (dimensions)
  12. How Do I Use LDA?
  13. Due credit to Sebastian Raschka http://sebastianraschka.com/Articles/2014_python_lda.html
  14. https://www.researchgate.net/publication/316994943_Linear_discriminant_analysis_A_detailed_tutorial
  15. What you need to start: ● Labeled data ● Numeric (continuous) features (the number kind, not the category kind) ● Some idea about matrix algebra ● (Python) https://www.memecenter.com/fun/330902/matrix--expectation-vs-reality
  16. The Big Picture Create a subspace that separates our classes really well, then project our data onto that subspace. To find that optimal subspace, a linear discriminant will: ● Minimize the within-class variance (we want tight groups) ● Maximize the between-class distance (we want our groups as far apart as possible)
  17. Let’s visualize it Iris Dataset
  18. Another View
  19. Let’s start the calculations
  20. 5 Steps to LDA 1) Means 2) Scatter Matrices 3) Finding Linear Discriminants 4) Subspace 5) Project Data Iris Dataset
  21. Step 1: Means 1. Find each class mean 2. Find the overall mean (central point)
  22. [Table: Sepal Width / Petal Width value pairs (3.504, 0.248), (3.408, 0.213), (2.975, 0.327), (3.225, 0.289), (2.648, 0.222), (3.346, 0.203), (3.198, 0.255), (2.899, 0.301), (3.045, 0.297), with the class means labeled µ0, µ1, µ2]
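A minimal NumPy sketch of Step 1 on the iris data (the names X, y, and class_means are illustrative, not from the slides):

```python
import numpy as np
from sklearn.datasets import load_iris

# X: (150, 4) feature matrix, y: integer class labels 0/1/2
X, y = load_iris(return_X_y=True)

# 1. Find each class mean (the slide's µ0, µ1, µ2)
class_means = {c: X[y == c].mean(axis=0) for c in np.unique(y)}

# 2. Find the overall mean, the "central point" CP
overall_mean = X.mean(axis=0)
```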
  23. 5 Steps to LDA 1) Means 2) Scatter Matrices 3) Finding Linear Discriminants 4) Subspace 5) Project Data Iris Dataset
  24. Step 2: Scatter Matrices 2a) Between-Class Scatter Matrix (Sb) 2b) Within-Class Scatter Matrix (Sw)
  25. 2a) Between-Class Scatter Matrix: Sb = Σi Ni (µi - CP)(µi - CP)^T, where Ni = sample size of class i, CP = central point (overall mean), and each squared-difference term (µi - CP)(µi - CP)^T is an M × M matrix, M = # of features
  26. Between-Class Scatter Matrix [worked example: accumulating Ni (µi - CP)(µi - CP)^T over the three iris classes]
  27. 2b) Within-Class Scatter Matrix: Sw = Σi Σx∈class i (x - µi)(x - µi)^T, where µi is the mean of class i
  28. Within-Class Scatter Matrix [worked example: accumulating (x - µi)(x - µi)^T over the samples of each class]
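Continuing the sketch, Step 2 computes both scatter matrices with the formulas above (again an illustration under the same assumed names, not the slides' own code):

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
n_features = X.shape[1]
overall_mean = X.mean(axis=0)  # the central point CP

S_B = np.zeros((n_features, n_features))  # between-class scatter Sb
S_W = np.zeros((n_features, n_features))  # within-class scatter Sw
for c in np.unique(y):
    X_c = X[y == c]
    mu_c = X_c.mean(axis=0)
    # 2a) Sb += Ni * (µi - CP)(µi - CP)^T
    d = (mu_c - overall_mean).reshape(-1, 1)
    S_B += X_c.shape[0] * (d @ d.T)
    # 2b) Sw += Σ (x - µi)(x - µi)^T over the samples of class i
    centered = X_c - mu_c
    S_W += centered.T @ centered
```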
  29. 5 Steps to LDA 1) Means 2) Scatter Matrices 3) Finding Linear Discriminants 4) Subspace 5) Project Data Iris Dataset
  30. Step 3: Finding Linear Discriminants Finding W using the eigenvectors and eigenvalues of Sw^-1 Sb
  31. Now, remember… Iris Dataset
  32. The Math of Finding Linear Discriminants http://research.cs.tamu.edu/prism/lectures/pr/pr_l10.pdf Working towards: the projection W that best separates the classes. We have so far: the projected means (w^T µi) and the scatter of the projection (Σ (w^T x - w^T µi)² within each class)
  33. The Math of Linear Discriminants Fisher’s Criterion J(W) = |W^T Sb W| / |W^T Sw W|; maximizing it leads to an eigenvalue problem on the matrix Sw^-1 Sb, whose eigenvectors are the discriminant directions and whose eigenvalues measure the separation each direction achieves
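For reference, slide 33's criterion and the eigenproblem it leads to, typeset cleanly (a reconstruction consistent with the Raschka and Tharwat tutorials cited earlier):

```latex
J(W) = \frac{\lvert W^{T} S_B W \rvert}{\lvert W^{T} S_W W \rvert}
\quad\Rightarrow\quad
S_W^{-1} S_B \, w_i = \lambda_i \, w_i
```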
  34. Eigenvectors and Eigenvalues using NumPy (YouTube: PatrickJMT) [slide works the characteristic quadratic equation of a 2 × 2 matrix A by hand]
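A short sketch of Step 3 with np.linalg.eig, assuming S_W and S_B were computed as in the Step 2 sketch (the explicit inverse mirrors the slides; a generalized eigensolver would be more numerically robust):

```python
import numpy as np

# Step 3: eigen-decompose Sw^-1 Sb (S_W and S_B from the Step 2 sketch)
eig_vals, eig_vecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)

# Column eig_vecs[:, i] is a candidate discriminant direction w_i;
# eig_vals[i] is how much class separation that direction captures.
```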
  35. 5 Steps to LDA 1) Means 2) Scatter Matrices 3) Finding Linear Discriminants 4) Subspace 5) Project Data Iris Dataset
  36. Step 4: Subspace ● Sort your eigenvectors by decreasing eigenvalue ● Choose the top eigenvectors to build the transformation matrix W used to project your data: keep the top (Classes - 1) eigenvalues (see the sketch below)
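Step 4 as code, continuing from the eigenpairs in the Step 3 sketch (k and the name order are illustrative):

```python
import numpy as np

k = 2  # 3 iris classes - 1
# Sort eigenvectors by decreasing eigenvalue, keep the top k as columns of W
order = np.argsort(eig_vals.real)[::-1]
W = eig_vecs.real[:, order[:k]]  # (n_features, k) transformation matrix
```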
  37. 5 Steps to LDA 1) Means 2) Scatter Matrices 3) Finding Linear Discriminants 4) Subspace 5) Project Data Iris Dataset
  38. Step 5: Project Data X_lda = X.dot(W) https://memegenerator.net/instance/48422270/willy-wonka-well-now-that-was-anticlimactic
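That one line really is the whole step; with the iris shapes from the sketches above it reads:

```python
# X: (150, 4), W: (4, 2)  ->  X_lda: (150, 2)
X_lda = X.dot(W)
print(X_lda.shape)  # (150, 2): each flower expressed in the 2 discriminant directions
```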
  39. Results and scikit-learn
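For comparison, scikit-learn wraps all five steps in one estimator (a standard usage example, not code taken from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

lda = LinearDiscriminantAnalysis(n_components=2)  # classes - 1 = 2
X_lda = lda.fit_transform(X, y)

print(X_lda.shape)                    # (150, 2)
print(lda.explained_variance_ratio_)  # separation captured by each discriminant
```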
  40. Disclaimer Your (My) Dataset
  41. Thanks! www.jaclynkokx.com jaclynkokx@gmail.com @JackieKokx
