Intro probability 3

Probability Theory

Expectation
Phong VO
vdphong@fit.hcmus.edu.vn

September 11, 2010

– Typeset by FoilTEX –

Expectation of a Random Variable

Definition 1. The expected value, or mean, or first moment, of X is
defined to be

$$E(X) = \int x \, dF(x) = \begin{cases} \sum_x x f(x) & \text{if } X \text{ is discrete} \\ \int x f(x) \, dx & \text{if } X \text{ is continuous} \end{cases}$$

assuming that the sum (or integral) is well-defined. We use the following
notation to denote the expected value of X:

$$E(X) = EX = \int x \, dF(x) = \mu = \mu_X$$


• Think of E(X) as the average value you would obtain if you computed
the numerical average $n^{-1} \sum_{i=1}^{n} X_i$ of a large number of IID draws
$X_1, \dots, X_n$.

Example 1. Find E[X] where X is the outcome when we roll a fair die.
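
As a quick check of Definition 1 on Example 1, the sketch below computes E(X) for a fair six-sided die exactly from its probability mass function and compares it with the average of simulated rolls. The helper name `expectation`, the seed and the sample size are illustrative choices, not part of the slides.

```python
import random
from fractions import Fraction

def expectation(pmf):
    """E(X) = sum over x of x * f(x), for a discrete pmf given as {x: f(x)}."""
    return sum(x * p for x, p in pmf.items())

# Fair die: each face 1..6 has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
exact_mean = expectation(die_pmf)

# Average of many IID rolls (seed and sample size are arbitrary choices).
rng = random.Random(0)
n = 100_000
sample_mean = sum(rng.randint(1, 6) for _ in range(n)) / n

print("exact:", exact_mean, "simulated:", sample_mean)
```

The exact value is 7/2, and the simulated average should land close to it, which is the "average of IID draws" reading of E(X).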

Example 2. Calculate E(X) when X is a Bernoulli random variable with
parameter p.

Example 3. Calculate E(X) when X is binomially distributed with
parameters n and p.

Example 4. Calculate the expectation of a geometric random variable
having parameter p.

Example 5. Calculate E(X) if X is a Poisson random variable with
parameter λ.


Example 6. Calculate the expectation of a random variable uniformly
distributed over (α, β).

Example 7. Let X be exponentially distributed with parameter λ.
Calculate E(X).

Example 8. Calculate E(X) when X is normally distributed with
parameters µ and $\sigma^2$.
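
The continuous cases in Examples 6 to 8 can be sanity-checked numerically. The sketch below approximates $\int x f(x) \, dx$ with a midpoint rule. It assumes the exponential density is $\lambda e^{-\lambda x}$ (rate parameterization), truncates infinite ranges to finite intervals, and uses arbitrary parameter values; the helper names are hypothetical.

```python
import math

def integrate(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

def mean_of_density(pdf, lo, hi):
    """Approximate E(X) = integral of x * f(x) dx over [lo, hi]."""
    return integrate(lambda x: x * pdf(x), lo, hi)

# Example 6: uniform on (alpha, beta); values chosen for illustration.
alpha, beta = 2.0, 5.0
uniform_mean = mean_of_density(lambda x: 1.0 / (beta - alpha), alpha, beta)

# Example 7: exponential with rate lam (assumed parameterization),
# integrated over [0, 40] because the tail beyond that is negligible.
lam = 2.0
exponential_mean = mean_of_density(lambda x: lam * math.exp(-lam * x), 0.0, 40.0)

# Example 8: normal with mean mu and variance sigma**2, over mu +/- 10 sigma.
mu, sigma = 1.0, 2.0

def normal_pdf(x):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

normal_mean = mean_of_density(normal_pdf, mu - 10 * sigma, mu + 10 * sigma)

print(uniform_mean, exponential_mean, normal_mean)
```

The three results should agree with the closed forms (α + β)/2, 1/λ and µ up to numerical error.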


Expectation of a Function of a Random Variable

• If X is a discrete random variable with probability mass function p(x),
then for any real-valued function g,

$$E[g(X)] = \sum_{x : p(x) > 0} g(x) p(x)$$

• If X is a continuous random variable with probability density function
f (x), then for any real-valued function g,

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx$$
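
A small illustration of the discrete formula, assuming a fair die and the function g(x) = x² (both chosen only for this sketch): it computes E[g(X)] directly from the pmf and shows that it differs from g(E[X]).

```python
from fractions import Fraction

def expect_g(pmf, g):
    """E[g(X)] = sum of g(x) * p(x) over all x with p(x) > 0."""
    return sum(g(x) * p for x, p in pmf.items() if p > 0)

die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean_of_square = expect_g(die_pmf, lambda x: x * x)      # E[X^2]
square_of_mean = expect_g(die_pmf, lambda x: x) ** 2     # (E[X])^2

print(mean_of_square, square_of_mean)
```

Here E[X²] = 91/6 while (E[X])² = 49/4, so in general E[g(X)] is not g(E[X]).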


Example 9. Let (X, Y) have a jointly uniform distribution on the unit
square. Let $Z = r(X, Y) = X^2 + Y^2$. Then,

\begin{align}
E(Z) &= \iint r(x, y) \, dF(x, y) \\
     &= \int_0^1 \int_0^1 (x^2 + y^2) \, dx \, dy \\
     &= \int_0^1 x^2 \, dx + \int_0^1 y^2 \, dy \\
     &= \frac{2}{3}
\end{align}
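
The value 2/3 in Example 9 can be checked by simulation. The sketch below draws points uniformly on the unit square and averages X² + Y²; the seed and the number of draws are arbitrary choices.

```python
import random

rng = random.Random(1)
n = 200_000

# Each draw is an independent point (x, y) uniform on the unit square.
total = 0.0
for _ in range(n):
    x = rng.random()
    y = rng.random()
    total += x * x + y * y

estimate = total / n
print("Monte Carlo estimate of E(Z):", estimate)
```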


Properties of Expectations

Theorem 1. If $X_1, \dots, X_n$ are r.vs and $a_1, \dots, a_n$ are constants, then

$$E\left(\sum_i a_i X_i\right) = \sum_i a_i E(X_i).$$

Theorem 2. Let $X_1, \dots, X_n$ be independent r.vs. Then,

$$E\left(\prod_{i=1}^{n} X_i\right) = \prod_i E(X_i).$$


Example 10. Let X ∼ Binomial(n, p). What is the mean of X? We
could try to appeal to the definition:

$$E(X) = \int x \, dF_X(x) = \sum_x x f_X(x) = \sum_{x=0}^{n} x \binom{n}{x} p^x (1 - p)^{n - x}$$

but this is not an easy sum to evaluate. Instead, note that $X = \sum_{i=1}^{n} X_i$
where $X_i = 1$ if the ith toss is heads and $X_i = 0$ otherwise. Then
$E(X_i) = (p \times 1) + ((1 - p) \times 0) = p$ and $E(X) = E(\sum_i X_i) = \sum_i E(X_i) = np$.
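
Both routes in Example 10 can be compared numerically. The sketch below evaluates the sum from the definition and the linearity shortcut for one illustrative choice of n and p (the values are assumptions made for the demonstration).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3

# Route 1: the definition, summing x * f_X(x) over x = 0..n.
mean_by_definition = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))

# Route 2: linearity, adding E(X_i) = p over the n indicator variables.
mean_by_linearity = sum(p for _ in range(n))

print(mean_by_definition, mean_by_linearity)
```

Up to floating-point rounding, both routes give np = 3.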


Variance and Covariance

Definition 2. Let X be a r.v with mean µ. The variance of X, denoted
by $\sigma^2$ or $\sigma_X^2$ or V(X) or VX, is defined by

$$\sigma^2 = E(X - \mu)^2 = \int (x - \mu)^2 \, dF(x)$$

assuming this expectation exists. The standard deviation is
$sd(X) = \sqrt{V(X)}$ and is also denoted by σ and $\sigma_X$.


Theorem 3. Assuming the variance is well defined, it has the following
properties:

1. $V(X) = E(X^2) - \mu^2$.

2. If a and b are constants then $V(aX + b) = a^2 V(X)$.

3. If $X_1, \dots, X_n$ are independent and $a_1, \dots, a_n$ are constants, then

$$V\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2 V(X_i).$$
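
The three properties of Theorem 3 can be verified exactly on a small case. The sketch below uses a fair die and two independent dice; the constants a, b, a1, a2 are arbitrary, and the helper names are hypothetical.

```python
from fractions import Fraction

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = E[(X - mu)^2], computed straight from the definition."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}
var_die = variance(die)

# Property 1: V(X) = E(X^2) - mu^2.
shortcut = sum(x * x * p for x, p in die.items()) - mean(die) ** 2

# Property 2: V(aX + b) = a^2 V(X).
a, b = 3, 7
var_affine = variance({a * x + b: p for x, p in die.items()})

# Property 3: for independent X1, X2, V(a1 X1 + a2 X2) = a1^2 V(X1) + a2^2 V(X2).
a1, a2 = 2, -1
combo = {}
for x1, p1 in die.items():
    for x2, p2 in die.items():
        value = a1 * x1 + a2 * x2
        combo[value] = combo.get(value, 0) + p1 * p2  # independence: joint = product
var_combo = variance(combo)

print(var_die, shortcut, var_affine, var_combo)
```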


• If $X_1, \dots, X_n$ are r.vs then we define the sample mean to be

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

and the sample variance to be

$$S_n^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2.$$


Theorem 4. Let $X_1, \dots, X_n$ be IID and let $\mu = E(X_i)$, $\sigma^2 = V(X_i)$.
Then

$$E(\bar{X}_n) = \mu, \quad V(\bar{X}_n) = \frac{\sigma^2}{n} \quad \text{and} \quad E(S_n^2) = \sigma^2.$$
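
Theorem 4 can be confirmed exactly for a tiny case by enumerating every possible sample. The sketch below assumes n = 3 IID rolls of a fair die (so µ = 7/2 and σ² = 35/12); the setup is illustrative only.

```python
from fractions import Fraction
from itertools import product

n = 3
outcomes = list(product(range(1, 7), repeat=n))  # all 6**3 equally likely samples
weight = Fraction(1, len(outcomes))

def sample_mean(xs):
    return Fraction(sum(xs), len(xs))

def sample_variance(xs):
    """S_n^2 with the 1/(n - 1) factor."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

mean_of_xbar = sum(weight * sample_mean(xs) for xs in outcomes)
var_of_xbar = sum(weight * (sample_mean(xs) - mean_of_xbar) ** 2 for xs in outcomes)
mean_of_s2 = sum(weight * sample_variance(xs) for xs in outcomes)

print(mean_of_xbar, var_of_xbar, mean_of_s2)
```

The enumeration gives E(X̄) = 7/2, V(X̄) = (35/12)/3 and E(S²) = 35/12, matching the theorem. Dividing by n instead of n − 1 in `sample_variance` would make the last equality fail.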


Definition 3. Let X and Y be r.vs with means $\mu_X$ and $\mu_Y$ and standard
deviations $\sigma_X$ and $\sigma_Y$. Define the covariance between X and Y by

$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$

and the correlation by

$$\rho = \rho_{X,Y} = \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$


Theorem 5. The covariance satisfies:

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y).$$

The correlation satisfies:

$$-1 \le \rho(X, Y) \le 1.$$

Theorem 6. $V(X + Y) = V(X) + V(Y) + 2\,\mathrm{Cov}(X, Y)$ and
$V(X - Y) = V(X) + V(Y) - 2\,\mathrm{Cov}(X, Y)$. More generally, for r.vs $X_1, \dots, X_n$,

$$V\left(\sum_i a_i X_i\right) = \sum_i a_i^2 V(X_i) + 2 \sum_{i<j} a_i a_j \mathrm{Cov}(X_i, X_j).$$
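
Theorems 5 and 6 can be checked exactly on a small joint distribution. The joint pmf below is a hypothetical example invented for this sketch (two dependent binary variables); nothing about it comes from the slides.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint pmf f(x, y) for two dependent binary r.vs.
joint = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10),
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)

# Theorem 5: definition versus the shortcut E(XY) - E(X)E(Y).
cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y
rho = float(cov_definition) / sqrt(float(var_x * var_y))

# Theorem 6: variances of X + Y and X - Y computed directly.
var_sum = expect(lambda x, y: (x + y - (mu_x + mu_y)) ** 2)
var_diff = expect(lambda x, y: (x - y - (mu_x - mu_y)) ** 2)

print(cov_definition, rho, var_sum, var_diff)
```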


Variance-Covariance Matrix Σ

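If the random vector X and the mean µ are defined by

$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_k \end{pmatrix}, \qquad \mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_k \end{pmatrix} = \begin{pmatrix} E(X_1) \\ \vdots \\ E(X_k) \end{pmatrix}$$

then the variance-covariance matrix Σ is defined to be

$$V(X) = \begin{pmatrix} V(X_1) & \mathrm{Cov}(X_1, X_2) & \cdots & \mathrm{Cov}(X_1, X_k) \\ \mathrm{Cov}(X_2, X_1) & V(X_2) & \cdots & \mathrm{Cov}(X_2, X_k) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(X_k, X_1) & \mathrm{Cov}(X_k, X_2) & \cdots & V(X_k) \end{pmatrix}$$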


Theorem 7. If a is a vector and X is a random vector with mean µ and
variance Σ then $E(a^T X) = a^T \mu$ and $V(a^T X) = a^T \Sigma a$. If A is a matrix
then $E(AX) = A\mu$ and $V(AX) = A \Sigma A^T$.
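
Theorem 7 can be checked exactly with plain lists standing in for vectors and matrices. The joint pmf, the vector a and the matrix A below are hypothetical values chosen for this sketch, and the small linear-algebra helpers are written out by hand to keep it dependency-free.

```python
from fractions import Fraction

# Hypothetical joint pmf of the random vector X = (X1, X2).
joint = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10),
}

def expect(g):
    return sum(g(x) * p for x, p in joint.items())

def mean_vector(transform):
    """Mean of the random vector transform(X)."""
    dim = len(transform(next(iter(joint))))
    return [expect(lambda x, i=i: transform(x)[i]) for i in range(dim)]

def covariance_matrix(transform):
    """Variance-covariance matrix of the random vector transform(X)."""
    m = mean_vector(transform)
    dim = len(m)
    return [[expect(lambda x, i=i, j=j:
                    (transform(x)[i] - m[i]) * (transform(x)[j] - m[j]))
             for j in range(dim)] for i in range(dim)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

mu = mean_vector(lambda x: list(x))
sigma = covariance_matrix(lambda x: list(x))

# V(a^T X) computed directly and via a^T Sigma a.
a = [2, -3]
var_direct = covariance_matrix(lambda x: [a[0] * x[0] + a[1] * x[1]])[0][0]
var_formula = sum(a[i] * sigma[i][j] * a[j] for i in range(2) for j in range(2))

# E(AX) and V(AX) computed directly and via A mu and A Sigma A^T.
A = [[1, 1], [1, -1]]
mean_direct = mean_vector(lambda x: matvec(A, x))
mean_formula = matvec(A, mu)
cov_direct = covariance_matrix(lambda x: matvec(A, x))
cov_formula = matmul(matmul(A, sigma), transpose(A))

print(var_direct, var_formula)
```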


• E(X) is a number but E(X|Y = y) is a function of y.

• Before observing Y, we don't know the value of E(X|Y = y) so it is a
r.v which we denote E(X|Y).

• E(X|Y) is the r.v whose value is E(X|Y = y) when Y = y.

Theorem 8. For r.vs X and Y, assuming the expectations exist, we have
that

E[E(Y |X)] = E(Y ) and E[E(X|Y )] = E(X)

More generally, for any function r(x, y) we have

E[E(r(X, Y )|X)] = E(r(X, Y )) and E[E(r(X, Y )|Y )] = E(r(X, Y ))
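
The rule of iterated expectations in Theorem 8 can be verified exactly on a small discrete case. The joint pmf below is a hypothetical example; the helpers compute E(Y|X = x) for each x and then average over the distribution of X.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y).
joint = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10),
}

def marginal_x(x):
    """f_X(x), obtained by summing the joint pmf over y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def cond_mean_y(x):
    """E(Y | X = x) = sum over y of y * f(x, y) / f_X(x)."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / marginal_x(x)

x_values = sorted({x for x, _ in joint})

# E[E(Y|X)]: weight each conditional mean by the probability of X = x.
iterated = sum(marginal_x(x) * cond_mean_y(x) for x in x_values)

# E(Y) computed directly from the joint pmf.
direct = sum(y * p for (_, y), p in joint.items())

print(iterated, direct)
```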


Definition 5. The conditional variance is defined as

$$V(Y \mid X = x) = \int (y - \mu(x))^2 f(y \mid x) \, dy$$

where $\mu(x) = E(Y \mid X = x)$.

Theorem 9. For r.vs X and Y ,

$$V(Y) = E[V(Y \mid X)] + V[E(Y \mid X)].$$
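
The decomposition in Theorem 9 can be checked exactly in the discrete case, where the integral in Definition 5 becomes a sum. The joint pmf below is a hypothetical example made up for this sketch.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y) with X in {0, 1} and Y in {0, 1, 2}.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(4, 10),
}

def marginal_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)

def cond_mean(x):
    """mu(x) = E(Y | X = x)."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / marginal_x(x)

def cond_var(x):
    """V(Y | X = x), the discrete analogue of Definition 5."""
    m = cond_mean(x)
    return sum((y - m) ** 2 * p
               for (xi, y), p in joint.items() if xi == x) / marginal_x(x)

x_values = sorted({x for x, _ in joint})
mean_y = sum(y * p for (_, y), p in joint.items())
var_y = sum((y - mean_y) ** 2 * p for (_, y), p in joint.items())

# E[V(Y|X)]: average of the conditional variances.
expected_cond_var = sum(marginal_x(x) * cond_var(x) for x in x_values)

# V[E(Y|X)]: variance of the conditional means around E(Y).
var_of_cond_mean = sum(marginal_x(x) * (cond_mean(x) - mean_y) ** 2
                       for x in x_values)

print(var_y, expected_cond_var, var_of_cond_mean)
```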

Example 11. Compute the variance of a binomial random variable X
with parameters n and p.

Example 12. If X and Y are independent random variables both uniformly
distributed on (0, 1), then calculate the probability density of X + Y .
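
One way to approach Example 12 is through the convolution formula $f_{X+Y}(a) = \int f_X(a - y) f_Y(y) \, dy$. The sketch below evaluates that integral with a midpoint rule and compares it with the triangular density (a on [0, 1], 2 − a on (1, 2)), which is the known closed form for this sum. The helper names, step count and checkpoints are arbitrary.

```python
def uniform_pdf(t):
    """Density of a Uniform(0, 1) random variable."""
    return 1.0 if 0.0 < t < 1.0 else 0.0

def density_of_sum(a, steps=20_000):
    """Midpoint-rule value of the convolution integral over y in (0, 1).

    f_Y(y) = 1 on (0, 1), so the integrand reduces to f_X(a - y).
    """
    width = 1.0 / steps
    return width * sum(uniform_pdf(a - (k + 0.5) * width) for k in range(steps))

def triangular_pdf(a):
    if 0.0 <= a <= 1.0:
        return a
    if 1.0 < a < 2.0:
        return 2.0 - a
    return 0.0

checkpoints = [0.25, 0.5, 1.0, 1.5, 1.75]
max_gap = max(abs(density_of_sum(a) - triangular_pdf(a)) for a in checkpoints)
print("largest gap at the checkpoints:", max_gap)
```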


Example 13. Let X and Y be independent Poisson random variables with
respective means $\lambda_1$ and $\lambda_2$. Calculate the distribution of X + Y.
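
For Example 13, the pmf of X + Y is the discrete convolution of the two Poisson pmfs. The sketch below computes that convolution numerically and compares it with a Poisson pmf of mean λ1 + λ2, the standard result for this sum. The values of the two means and the range of n checked are arbitrary.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam1, lam2 = 1.5, 2.5

def sum_pmf(n):
    """P(X + Y = n) = sum over k of P(X = k) * P(Y = n - k), by independence."""
    return sum(poisson_pmf(k, lam1) * poisson_pmf(n - k, lam2)
               for k in range(n + 1))

max_gap = max(abs(sum_pmf(n) - poisson_pmf(n, lam1 + lam2)) for n in range(30))
print("largest gap for n < 30:", max_gap)
```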

