Probability Density Functions
1.
Probability Densities in Data Mining
Andrew W. Moore, Professor
School of Computer Science, Carnegie Mellon University
www.cs.cmu.edu/~awm · awm@cs.cmu.edu · 412-268-7599

Note to other teachers and users of these slides: Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.

Copyright © Andrew W. Moore

Probability Densities in Data Mining
• Why we should care
• Notation and fundamentals of continuous PDFs
• Multivariate continuous PDFs
• Combining continuous and discrete random variables
2.
Why we should care
• Real numbers occur in at least 50% of database records
• We can't always quantize them
• So we need to understand how to describe where they come from
• A great way of saying what's a reasonable range of values
• A great way of saying how multiple attributes should reasonably co-occur

Why we should care (continued)
• Can immediately get us Bayes classifiers that are sensible with real-valued data
• You'll need to intimately understand PDFs in order to do kernel methods, clustering with mixture models, analysis of variance, time series, and many other things
• Will introduce us to linear and non-linear regression
3.
A PDF of American Ages in 2000
Let X be a continuous random variable. If p(x) is a probability density function for X, then

P(a < X \le b) = \int_{x=a}^{b} p(x)\,dx

For the age density:

P(30 < \mathrm{Age} \le 50) = \int_{\mathrm{age}=30}^{50} p(\mathrm{age})\,d\mathrm{age} = 0.36
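The integral above can be checked numerically. The slide's empirical age density isn't reproduced here, so this sketch substitutes an assumed normal density using the moments quoted later in the deck (mean 35.897, sd 22.32) and applies a trapezoid rule; real ages aren't normal, which is why this gives roughly 0.34 rather than the slide's 0.36.

```python
import numpy as np

# Stand-in density: a normal with the deck's later moments.  This is an
# assumption for illustration, not the slide's actual age curve.
MU, SIGMA = 35.897, 22.32

def p(x):
    return np.exp(-0.5 * ((x - MU) / SIGMA) ** 2) / (SIGMA * np.sqrt(2 * np.pi))

# P(30 < Age <= 50) = integral of p(x) dx from 30 to 50, via trapezoids.
xs = np.linspace(30.0, 50.0, 2001)
prob = np.sum((p(xs[:-1]) + p(xs[1:])) / 2 * np.diff(xs))
print(round(prob, 3))
```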
4.
Properties of PDFs

P(a < X \le b) = \int_{x=a}^{b} p(x)\,dx

That means

p(x) = \lim_{h \to 0} \frac{P\!\left(x - \frac{h}{2} < X \le x + \frac{h}{2}\right)}{h}

\frac{\partial}{\partial x} P(X \le x) = p(x)

Therefore

\int_{x=-\infty}^{\infty} p(x)\,dx = 1

\forall x : p(x) \ge 0
5.
Talking to your stomach
• What's the gut-feel meaning of p(x)?

If p(5.31) = 0.06 and p(5.92) = 0.03, then when a value X is sampled from the distribution, you are 2 times as likely to find that X is "very close to" 5.31 than that X is "very close to" 5.92.
6.
Talking to your stomach (continued)
• What's the gut-feel meaning of p(x)?

More generally, if p(a) = \alpha\,p(b), then when a value X is sampled from the distribution, you are \alpha times as likely to find that X is "very close to" a than that X is "very close to" b.
7.
Talking to your stomach (continued)
• What's the gut-feel meaning of p(x)?

If p(a) = \alpha\,p(b), then

\lim_{h \to 0} \frac{P(a - h < X < a + h)}{P(b - h < X < b + h)} = \alpha
8.
Yet another way to view a PDF
A recipe for sampling a random age:
1. Generate a random dot from the rectangle surrounding the PDF curve. Call the dot (age, d).
2. If d < p(age), stop and return age.
3. Else try again: go to Step 1.

Test your understanding
• True or false: \forall x : p(x) \le 1
• True or false: \forall x : P(X = x) = 0
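The three-step recipe above is rejection sampling, and it is easy to implement directly. The sketch below assumes a made-up triangular age density on [0, 100] (peaking at 50) in place of the slide's curve; `p_max` is the height of the bounding rectangle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed stand-in PDF: triangular on [0, 100], peak height 0.02 at age 50.
def p(age):
    return np.where(age < 50.0, age, 100.0 - age) / 2500.0

def sample_age(p, lo=0.0, hi=100.0, p_max=0.02):
    while True:
        age = rng.uniform(lo, hi)     # Step 1: the dot's horizontal position
        d = rng.uniform(0.0, p_max)   # Step 1: the dot's height
        if d < p(age):                # Step 2: dot under the curve -> accept
            return age
        # Step 3: otherwise loop and try again

samples = np.array([sample_age(p) for _ in range(20000)])
print(samples.mean())  # the triangular density has mean 50
```

Accepted dots land under the curve uniformly, so accepted ages are distributed exactly as p; the acceptance rate here is 1 / (100 × 0.02) = 0.5.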
9.
Expectations
E[X] = the expected value of random variable X
= the average value we'd see if we took a very large number of random samples of X:

E[X] = \int_{x=-\infty}^{\infty} x\,p(x)\,dx

E[age] = 35.897
= the first moment of the shape formed by the axes and the curve
= the best value to choose if you must guess an unknown person's age and you'll be fined the square of your error
10.
Expectation of a function
\mu = E[f(X)] = the expected value of f(x) where x is drawn from X's distribution
= the average value we'd see if we took a very large number of random samples of f(X):

E[f(X)] = \int_{x=-\infty}^{\infty} f(x)\,p(x)\,dx

E[age^2] = 1786.64, while (E[age])^2 = 1288.62. Note that in general E[f(X)] \ne f(E[X]).

Variance
\sigma^2 = Var[X] = the expected squared difference between x and E[X]:

\sigma^2 = \int_{x=-\infty}^{\infty} (x - \mu)^2\,p(x)\,dx

= the amount you'd expect to lose if you must guess an unknown person's age, you'll be fined the square of your error, and you play optimally. Var[age] = 498.02.
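The warning that E[f(X)] ≠ f(E[X]) can be verified on any density you can integrate by hand. The sketch below uses an assumed Uniform(0, 1) density with f(x) = x², where E[f(X)] = 1/3 but f(E[X]) = 1/4, and the gap between them is exactly Var[X] = 1/12.

```python
import numpy as np

# Assumed density for illustration: p(x) = 1 on [0, 1] (Uniform(0, 1)).
xs = np.linspace(0.0, 1.0, 100001)
pdf = np.ones_like(xs)
dx = xs[1] - xs[0]

mean = np.sum(xs * pdf) * dx          # E[X]   -> 1/2
mean_sq = np.sum(xs**2 * pdf) * dx    # E[X^2] -> 1/3, not (1/2)^2
print(mean, mean_sq, mean**2)
```

The difference mean_sq − mean² reproduces Var[X] = 1/12, matching the variance formula on this slide.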
11.
Standard Deviation
\sigma = \sqrt{Var[X]} = the "typical" deviation of X from its mean. For the age example, Var[age] = 498.02 and \sigma = 22.32.

In 2 dimensions
p(x, y) = probability density of random variables (X, Y) at location (x, y)
12.
In 2 dimensions
Let X, Y be a pair of continuous random variables, and let R be some region of (X, Y) space. Then

P((X, Y) \in R) = \iint_{(x,y) \in R} p(x, y)\,dy\,dx

Example: P(20 < mpg < 30 and 2500 < weight < 3000) = the volume under the 2-d surface over the red rectangle.
13.
In 2 dimensions (continued)
P\big([(mpg - 25)/10]^2 + [(weight - 3300)/1500]^2 < 1\big) = the volume under the 2-d surface over the red oval.

Take the special case of region R = "everywhere". Remember that with probability 1, (X, Y) will be drawn from "somewhere", so

\int_{x=-\infty}^{\infty} \int_{y=-\infty}^{\infty} p(x, y)\,dy\,dx = 1
14.
In 2 dimensions (continued)

p(x, y) = \lim_{h \to 0} \frac{P\!\left(x - \frac{h}{2} < X \le x + \frac{h}{2} \;\wedge\; y - \frac{h}{2} < Y \le y + \frac{h}{2}\right)}{h^2}

In m dimensions
Let (X_1, X_2, \ldots, X_m) be an m-tuple of continuous random variables, and let R be some region of \mathbb{R}^m. Then

P((X_1, X_2, \ldots, X_m) \in R) = \iint \cdots \int_{(x_1, \ldots, x_m) \in R} p(x_1, x_2, \ldots, x_m)\,dx_m \cdots dx_2\,dx_1
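The region integral can be approximated numerically just as in 1-d, by summing p over a grid covering R. This sketch assumes a product of standard normal densities and R = the square [−1, 1]², where the true answer is (Φ(1) − Φ(−1))² ≈ 0.466.

```python
import numpy as np

# Assumed joint density: independent standard normals.
def p(x, y):
    return np.exp(-0.5 * (x**2 + y**2)) / (2 * np.pi)

# Riemann sum of p(x, y) dy dx over the square R = [-1, 1] x [-1, 1].
xs = np.linspace(-1.0, 1.0, 801)
X, Y = np.meshgrid(xs, xs)
dx = xs[1] - xs[0]
prob = np.sum(p(X, Y)) * dx * dx
print(round(prob, 3))  # ~0.466
```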
15.
Independence
X \perp Y iff \forall x, y : p(x, y) = p(x)\,p(y)

If X and Y are independent, then knowing the value of X does not help predict the value of Y.
• mpg and weight: NOT independent.
• In the second scatterplot, the contours say that acceleration and weight are independent.
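The factorisation test p(x, y) = p(x) p(y) can be checked directly on a grid: compute both marginals from the joint, form their outer product, and compare. The sketch below assumes a correlated bivariate normal (ρ = 0.7), which should fail the test by a visible margin.

```python
import numpy as np

# Assumed joint: standard bivariate normal with correlation rho = 0.7.
rho = 0.7
xs = np.linspace(-4.0, 4.0, 201)
X, Y = np.meshgrid(xs, xs)
joint = np.exp(-(X**2 - 2*rho*X*Y + Y**2) / (2*(1 - rho**2))) \
        / (2 * np.pi * np.sqrt(1 - rho**2))

dx = xs[1] - xs[0]
px = joint.sum(axis=0) * dx        # marginal p(x): integrate out y
py = joint.sum(axis=1) * dx        # marginal p(y): integrate out x
product = np.outer(py, px)         # p(x) p(y) on the same grid

gap = np.max(np.abs(joint - product))
print(gap)  # clearly nonzero -> X and Y are dependent
```

With ρ = 0 the same gap would be near zero, matching the iff.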
16.
Multivariate Expectation

\mu_{\mathbf{X}} = E[\mathbf{X}] = \int \mathbf{x}\,p(\mathbf{x})\,d\mathbf{x}

E[mpg, weight] = (24.5, 2600) — the centroid of the cloud.

E[f(\mathbf{X})] = \int f(\mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}
17.
Test your understanding
Question: when (if ever) does E[X + Y] = E[X] + E[Y]?
• All the time?
• Only when X and Y are independent?
• It can fail even if X and Y are independent?

Bivariate Expectation
E[f(X, Y)] = \int f(x, y)\,p(x, y)\,dy\,dx

If f(x, y) = x, then E[f(X, Y)] = \int x\,p(x, y)\,dy\,dx.
If f(x, y) = y, then E[f(X, Y)] = \int y\,p(x, y)\,dy\,dx.
If f(x, y) = x + y, then E[f(X, Y)] = \int (x + y)\,p(x, y)\,dy\,dx, so

E[X + Y] = E[X] + E[Y]
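A quick simulation confirms the answer "all the time": linearity of expectation needs no independence. In this sketch Y is deliberately an (assumed, illustrative) function of X plus noise, so X and Y are strongly dependent, yet the means still add.

```python
import numpy as np

rng = np.random.default_rng(3)

# X ~ N(2, 1); Y = 3X + noise, so E[X] = 2, E[Y] = 6, and X, Y are dependent.
x = rng.normal(2.0, 1.0, size=200000)
y = 3.0 * x + rng.normal(0.0, 1.0, size=200000)

print((x + y).mean())  # close to E[X] + E[Y] = 8 despite the dependence
```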
18.
Bivariate Covariance

\sigma_{xy} = Cov[X, Y] = E[(X - \mu_x)(Y - \mu_y)]
\sigma_{xx} = \sigma_x^2 = Cov[X, X] = Var[X] = E[(X - \mu_x)^2]
\sigma_{yy} = \sigma_y^2 = Cov[Y, Y] = Var[Y] = E[(Y - \mu_y)^2]

Write \mathbf{X} = (X, Y)^T; then

Cov[\mathbf{X}] = E[(\mathbf{X} - \mu_{\mathbf{X}})(\mathbf{X} - \mu_{\mathbf{X}})^T] = \Sigma = \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{pmatrix}
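The 2 × 2 matrix Σ is what `np.cov` estimates from two rows of samples. The (mpg, weight)-style numbers below are made up for illustration, not the slide's real auto data; the negative off-diagonal entry plays the role of σ_xy.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy cloud: heavier cars get worse mileage (invented coefficients).
weight = rng.normal(2600.0, 700.0, size=50000)
mpg = 50.0 - 0.01 * weight + rng.normal(0.0, 3.0, size=50000)

Sigma = np.cov(np.stack([mpg, weight]))  # rows = variables, as np.cov expects
print(Sigma[0, 1], Sigma[1, 0])          # sigma_xy appears twice: symmetric
```

Here Cov[mpg, weight] ≈ −0.01 · Var[weight] ≈ −4900, so the estimate should come out strongly negative.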
19.
Covariance Intuition
E[mpg, weight] = (24.5, 2600), with \sigma_{weight} = 700 and \sigma_{mpg} = 8. The principal eigenvector of \Sigma points along the long axis of the data cloud.
20.
Covariance Fun Facts

Cov[\mathbf{X}] = E[(\mathbf{X} - \mu_{\mathbf{X}})(\mathbf{X} - \mu_{\mathbf{X}})^T] = \Sigma = \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{pmatrix}

• True or false: if \sigma_{xy} = 0 then X and Y are independent.
• True or false: if X and Y are independent then \sigma_{xy} = 0.
• True or false: if \sigma_{xy} = \sigma_x \sigma_y then X and Y are deterministically related.
• True or false: if X and Y are deterministically related then \sigma_{xy} = \sigma_x \sigma_y.
How could you prove or disprove these?

General Covariance
Let \mathbf{X} = (X_1, X_2, \ldots, X_k) be a vector of k continuous random variables. Then

Cov[\mathbf{X}] = E[(\mathbf{X} - \mu_{\mathbf{X}})(\mathbf{X} - \mu_{\mathbf{X}})^T] = \Sigma, \qquad \Sigma_{ij} = Cov[X_i, X_j] = \sigma_{x_i x_j}

\Sigma is a k \times k symmetric non-negative definite matrix. If all distributions are linearly independent it is positive definite; if the distributions are linearly dependent it has determinant zero.
21.
Test your understanding
Question: when (if ever) does Var[X + Y] = Var[X] + Var[Y]?
• All the time?
• Only when X and Y are independent?
• It can fail even if X and Y are independent?

Marginal Distributions

p(x) = \int_{y=-\infty}^{\infty} p(x, y)\,dy
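Marginalisation is one numeric integral per x value. The sketch assumes independent standard normals as the joint, so the marginal evaluated at x = 0 should come out near the standard normal height 1/√(2π) ≈ 0.399.

```python
import numpy as np

# Assumed joint density: independent standard normals.
def joint(x, y):
    return np.exp(-0.5 * (x**2 + y**2)) / (2 * np.pi)

# p(x) = integral of p(x, y) dy, approximated on a wide y-grid.
ys = np.linspace(-8.0, 8.0, 4001)
dy = ys[1] - ys[0]

x0 = 0.0
marginal_at_x0 = np.sum(joint(x0, ys)) * dy
print(marginal_at_x0)  # ~ 1/sqrt(2*pi)
```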
22.
Conditional Distributions
p(x \mid y) = the p.d.f. of X when Y = y. For example: p(mpg \mid weight = 4600), p(mpg \mid weight = 3200), p(mpg \mid weight = 2000).

p(x \mid y) = \frac{p(x, y)}{p(y)}

Why?
23.
Independence Revisited
X \perp Y iff \forall x, y : p(x, y) = p(x)\,p(y)

It's easy to prove that these statements are equivalent:

\forall x, y : p(x, y) = p(x)\,p(y)
\Leftrightarrow \forall x, y : p(x \mid y) = p(x)
\Leftrightarrow \forall x, y : p(y \mid x) = p(y)

More useful stuff (these can all be proved from the definitions on previous slides):

\int_{x=-\infty}^{\infty} p(x \mid y)\,dx = 1

p(x \mid y, z) = \frac{p(x, y \mid z)}{p(y \mid z)}

p(x \mid y) = \frac{p(y \mid x)\,p(x)}{p(y)} \quad \text{(Bayes rule)}
24.
Mixing discrete and continuous variables

p(x, A = v) = \lim_{h \to 0} \frac{P\!\left(x - \frac{h}{2} < X \le x + \frac{h}{2} \;\wedge\; A = v\right)}{h}

\sum_{v=1}^{n_A} \int_{x=-\infty}^{\infty} p(x, A = v)\,dx = 1

Bayes rule: p(x \mid A) = \frac{P(A \mid x)\,p(x)}{P(A)}

Bayes rule: P(A \mid x) = \frac{p(x \mid A)\,P(A)}{p(x)}

Example figure: P(EduYears, Wealthy).
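These mixed Bayes-rule formulas are all a simple Bayes classifier needs: a prior P(A = v) per class and a continuous class-conditional density p(x | A = v). The class names, priors, and Gaussian parameters below are invented for illustration (loosely echoing the EduYears/Wealthy example), not taken from the slides.

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Assumed toy model: two classes with Gaussian class-conditionals on EduYears.
priors = {"wealthy": 0.2, "not_wealthy": 0.8}
likelihood = {
    "wealthy":     lambda x: gauss(x, 16.0, 2.0),
    "not_wealthy": lambda x: gauss(x, 12.0, 3.0),
}

def posterior(v, x):
    # P(A = v | x) = p(x | A = v) P(A = v) / p(x),
    # where p(x) = sum over classes of p(x | A = u) P(A = u).
    px = sum(likelihood[u](x) * priors[u] for u in priors)
    return likelihood[v](x) * priors[v] / px

print(posterior("wealthy", 17.0))
```

At x = 17 the small prior is overcome by the likelihood ratio, so the posterior tips past one half; the two posteriors always sum to 1.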
25.
Mixing discrete and continuous variables (continued)
Figures (not reproduced): P(EduYears, Wealthy); P(Wealthy \mid EduYears); P(EduYears \mid Wealthy), with renormalized axes.
26.
What you should know
• You should be able to play with discrete, continuous and mixed joint distributions.
• You should be happy with the difference between p(x) and P(A).
• You should be intimate with expectations of continuous and discrete random variables.
• You should smile when you meet a covariance matrix.
• Independence and its consequences should be second nature.

Discussion
• Are PDFs the only sensible way to handle analysis of real-valued variables?
• Why is covariance an important concept?
• Suppose X and Y are independent real-valued random variables distributed between 0 and 1:
  • What is p[min(X, Y)]?
  • What is E[min(X, Y)]?
• Prove that E[X] is the value u that minimizes E[(X − u)^2].
• What is the value u that minimizes E[|X − u|]?
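The min(X, Y) discussion questions have closed forms worth checking: for independent X, Y ~ Uniform(0, 1), min(X, Y) has density p(m) = 2(1 − m) on [0, 1], so E[min(X, Y)] = 1/3. A quick Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(7)

# Independent uniforms on [0, 1]; their minimum has mean 1/3.
x = rng.uniform(size=1_000_000)
y = rng.uniform(size=1_000_000)
m = np.minimum(x, y)
print(m.mean())  # ~ 1/3
```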