1. Probability Theory
Expectation
Phong VO
vdphong@fit.hcmus.edu.vn
September 11, 2010
– Typeset by FoilTEX –
2. Expectation of a Random Variable
Definition 1. The expected value, or mean, or first moment, of $X$ is defined to be
\[
E(X) = \int x \, dF(x) =
\begin{cases}
\sum_x x f(x) & \text{if } X \text{ is discrete} \\
\int x f(x) \, dx & \text{if } X \text{ is continuous}
\end{cases}
\]
assuming that the sum (or integral) is well-defined. We use the following notation to denote the expected value of $X$:
\[
E(X) = EX = \int x \, dF(x) = \mu = \mu_X
\]
3. • Think of $E(X)$ as the average value you would obtain if you computed the numerical average $n^{-1} \sum_{i=1}^{n} X_i$ of a large number of IID draws $X_1, \dots, X_n$.
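The averaging interpretation can be tried out numerically. The sketch below is only an illustration: it assumes the draws come from a fair six-sided die, and the function name `average_of_draws` and the fixed seed are choices made here, not part of the slides.

```python
import random

def average_of_draws(n, seed=0):
    """Numerical average of n simulated IID rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# Expectation from the definition: sum of x * f(x) with f(x) = 1/6.
exact_mean = sum(x * (1 / 6) for x in range(1, 7))

# The running average should settle near exact_mean as n grows.
for n in (10, 1_000, 100_000):
    print(n, average_of_draws(n), exact_mean)
```

For small n the average can be far from the expectation; the agreement is a large-sample effect.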
Example 1. Find E[X] where X is the outcome when we roll a fair die.
Example 2. Calculate E(X) when X is a Bernoulli random variable with
parameter p.
Example 3. Calculate E(X) when X is binomially distributed with
parameters n and p.
Example 4. Calculate the expectation of a geometric random variable
having parameter p.
Example 5. Calculate E(X) if X is a Poisson random variable with
parameter λ.
4. Example 6. Calculate the expectation of a random variable uniformly
distributed over (α, β).
Example 7. Let X be exponentially distributed with parameter λ.
Calculate E(X).
Example 8. Calculate E(X) when X is normally distributed with
parameters $\mu$ and $\sigma^2$.
5. Expectation of a Function of a Random Variable
• If $X$ is a discrete random variable with probability mass function $p(x)$, then for any real-valued function $g$,
\[
E[g(X)] = \sum_{x : p(x) > 0} g(x) p(x)
\]
• If $X$ is a continuous random variable with probability density function $f(x)$, then for any real-valued function $g$,
\[
E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx
\]
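Both formulas can be evaluated directly. The following is a minimal sketch under stated assumptions: the discrete case uses a fair die, the continuous case uses the Uniform(0, 1) density with a midpoint-rule approximation of the integral, and the helper names `expect_discrete` and `expect_continuous` are hypothetical.

```python
from fractions import Fraction

def expect_discrete(g, pmf):
    """E[g(X)] as the sum of g(x) p(x) over all x with p(x) > 0."""
    return sum(g(x) * p for x, p in pmf.items() if p > 0)

def expect_continuous(g, f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g(x) f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += g(x) * f(x)
    return total * h

def square(x):
    return x * x

# Discrete case: fair die, g(x) = x^2 (exact rational arithmetic).
die = {x: Fraction(1, 6) for x in range(1, 7)}
e_square_die = expect_discrete(square, die)

# Continuous case: Uniform(0, 1) density, g(x) = x^2 (numerical approximation).
e_square_unif = expect_continuous(square, lambda x: 1.0, 0.0, 1.0)

print(e_square_die, e_square_unif)
```

Note that neither computation needs the distribution of $g(X)$ itself; only the distribution of $X$ is used.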
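6. Example 9. Let $(X, Y)$ have a jointly uniform distribution on the unit square. Let $Z = r(X, Y) = X^2 + Y^2$. Then,
\begin{align}
E(Z) &= \iint r(x, y) \, dF(x, y) \tag{1} \\
&= \int_0^1 \int_0^1 (x^2 + y^2) \, dx \, dy \tag{2} \\
&= \int_0^1 x^2 \, dx + \int_0^1 y^2 \, dy \tag{3} \\
&= \frac{2}{3} \tag{4}
\end{align}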
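The value 2/3 can be cross-checked numerically. This is a rough sketch, assuming a midpoint-rule grid over the unit square is an acceptable stand-in for the double integral; the function name `expected_r_on_unit_square` and the grid size are illustrative choices.

```python
def expected_r_on_unit_square(r, steps=400):
    """Midpoint-rule approximation of E[r(X, Y)] for (X, Y) uniform on [0, 1]^2."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        for j in range(steps):
            y = (j + 0.5) * h
            total += r(x, y)  # the joint density equals 1 on the unit square
    return total * h * h

e_z = expected_r_on_unit_square(lambda x, y: x**2 + y**2)
print(e_z)  # close to 2/3
```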
7. Properties of Expectations
Theorem 1. If $X_1, \dots, X_n$ are r.vs and $a_1, \dots, a_n$ are constants, then
\[
E\left(\sum_i a_i X_i\right) = \sum_i a_i E(X_i).
\]
Theorem 2. Let $X_1, \dots, X_n$ be independent r.vs. Then,
\[
E\left(\prod_{i=1}^{n} X_i\right) = \prod_i E(X_i).
\]
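Both properties can be verified exactly on a small case. The sketch below assumes two independent fair dice and arbitrary constants `a1`, `a2`; it checks linearity of expectation and the rule that, for independent variables, the expectation of a product equals the product of the expectations.

```python
from fractions import Fraction
from itertools import product

die = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pmf):
    """E(X) for a discrete pmf given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Independence means the joint pmf factorises: p(x, y) = p(x) p(y).
joint = {(x, y): die[x] * die[y] for x, y in product(die, die)}

a1, a2 = 2, -3  # arbitrary constants chosen for illustration

# Linearity: E(a1 X + a2 Y) = a1 E(X) + a2 E(Y).
lhs_linear = sum((a1 * x + a2 * y) * p for (x, y), p in joint.items())
rhs_linear = a1 * mean(die) + a2 * mean(die)

# Product rule for independent variables: E(XY) = E(X) E(Y).
lhs_product = sum(x * y * p for (x, y), p in joint.items())
rhs_product = mean(die) * mean(die)

print(lhs_linear == rhs_linear, lhs_product == rhs_product)
```

Linearity holds without independence; the product rule in general does not.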
8. Example 10. Let $X \sim \text{Binomial}(n, p)$. What is the mean of $X$? We could try to appeal to the definition:
\[
E(X) = \int x \, dF_X(x) = \sum_x x f_X(x) = \sum_{x=0}^{n} x \binom{n}{x} p^x (1 - p)^{n - x}
\]
but this is not an easy sum to evaluate. Instead, note that $X = \sum_{i=1}^{n} X_i$ where $X_i = 1$ if the $i$th toss is heads and $X_i = 0$ otherwise. Then $E(X_i) = (p \times 1) + ((1 - p) \times 0) = p$ and $E(X) = E\left(\sum_i X_i\right) = \sum_i E(X_i) = np$.
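The two routes to the binomial mean can be compared on a concrete case. This is a small sketch, assuming n = 10 and p = 3/10 purely for illustration: it evaluates the sum from the definition term by term and compares it with np.

```python
from fractions import Fraction
from math import comb

def binomial_mean_by_definition(n, p):
    """Sum of x * C(n, x) * p^x * (1 - p)^(n - x) for x = 0, ..., n."""
    return sum(x * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1))

n, p = 10, Fraction(3, 10)  # illustrative parameters
direct = binomial_mean_by_definition(n, p)
shortcut = n * p  # result of the indicator-variable argument

print(direct, shortcut)
```

Using `Fraction` keeps the arithmetic exact, so the two values can be compared with `==` rather than a tolerance.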
9. Variance and Covariance
Definition 2. Let $X$ be a r.v with mean $\mu$. The variance of $X$, denoted by $\sigma^2$ or $\sigma_X^2$ or $V(X)$ or $VX$, is defined by
\[
\sigma^2 = E(X - \mu)^2 = \int (x - \mu)^2 \, dF(x)
\]
assuming this expectation exists. The standard deviation is $sd(X) = \sqrt{V(X)}$ and is also denoted by $\sigma$ and $\sigma_X$.
10. Theorem 3. Assuming the variance is well defined, it has the following properties:
1. $V(X) = E(X^2) - \mu^2$.
2. If $a$ and $b$ are constants then $V(aX + b) = a^2 V(X)$.
3. If $X_1, \dots, X_n$ are independent and $a_1, \dots, a_n$ are constants, then
\[
V\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2 V(X_i).
\]
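The three variance properties can be checked exactly on small distributions. The sketch below assumes a fair die and a fair coin that are independent of each other; the helper names `mean`, `variance` and `transform` and the constants are illustrative.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = E(X - mu)^2 for a discrete pmf {value: probability}."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def transform(pmf, func):
    """pmf of func(X), merging values that map to the same point."""
    out = {}
    for x, p in pmf.items():
        out[func(x)] = out.get(func(x), 0) + p
    return out

die = {x: Fraction(1, 6) for x in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Property 1: V(X) = E(X^2) - mu^2.
second_moment = sum(x * x * p for x, p in die.items())
prop1 = variance(die) == second_moment - mean(die) ** 2

# Property 2: V(aX + b) = a^2 V(X).
a, b = 3, 5
prop2 = variance(transform(die, lambda x: a * x + b)) == a**2 * variance(die)

# Property 3: V(a1 X1 + a2 X2) = a1^2 V(X1) + a2^2 V(X2) for independent X1, X2.
a1, a2 = 2, -4
combo = {}
for (x, px), (y, py) in product(die.items(), coin.items()):
    value = a1 * x + a2 * y
    combo[value] = combo.get(value, 0) + px * py  # joint pmf factorises
prop3 = variance(combo) == a1**2 * variance(die) + a2**2 * variance(coin)

print(prop1, prop2, prop3)
```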
11. • If $X_1, \dots, X_n$ are r.vs then we define the sample mean to be
\[
\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i
\]
and the sample variance to be
\[
S_n^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2.
\]
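The two definitions translate directly into code. The following sketch uses a made-up data set and compares the hand-written sample variance with `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import statistics

def sample_mean(xs):
    """Sample mean: (1/n) * sum of the observations."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance with the 1/(n - 1) factor."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative observations

print(sample_mean(data), sample_variance(data), statistics.variance(data))
```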
12. Theorem 4. Let $X_1, \dots, X_n$ be IID and let $\mu = E(X_i)$, $\sigma^2 = V(X_i)$. Then
\[
E(\bar{X}_n) = \mu, \quad V(\bar{X}_n) = \frac{\sigma^2}{n} \quad \text{and} \quad E(S_n^2) = \sigma^2.
\]
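For a small discrete distribution the three identities can be confirmed by brute force. This is a sketch under assumptions: the pmf on {0, 1, 3} and the sample size n = 3 are arbitrary choices, and the expectations are computed by enumerating every possible IID sample with its probability.

```python
from fractions import Fraction
from itertools import product

pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 3: Fraction(1, 6)}  # illustrative pmf
n = 3

mu = sum(x * p for x, p in pmf.items())
sigma2 = sum((x - mu) ** 2 * p for x, p in pmf.items())

e_mean = e_mean_sq = e_s2 = Fraction(0)
for sample in product(pmf, repeat=n):
    prob = Fraction(1)
    for x in sample:
        prob *= pmf[x]  # IID draws: probabilities multiply
    xbar = Fraction(sum(sample), n)
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    e_mean += prob * xbar
    e_mean_sq += prob * xbar**2
    e_s2 += prob * s2

var_mean = e_mean_sq - e_mean**2  # V(sample mean) via E(X^2) - (EX)^2

print(e_mean == mu, var_mean == sigma2 / n, e_s2 == sigma2)
```

The last identity is the reason the sample variance divides by n - 1 rather than n.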
13. Definition 3. Let $X$ and $Y$ be r.vs with means $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$. Define the covariance between $X$ and $Y$ by
\[
\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]
\]
and the correlation by
\[
\rho = \rho_{X,Y} = \rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}
\]
14. Theorem 5. The covariance satisfies:
Cov(X, Y ) = E(XY ) − E(X)E(Y ).
The correlation satisfies:
−1 ≤ ρ(X, Y ) ≤ 1.
Theorem 6. V (X + Y ) = V (X) + V (Y ) + 2Cov(X, Y ) and V (X − Y ) =
V (X) + V (Y ) − 2Cov(X, Y ). More generally, for r.vs X1, . . . , Xn,
\[
V\left(\sum_i a_i X_i\right) = \sum_i a_i^2 V(X_i) + 2 \sum\sum_{i<j} a_i a_j \text{Cov}(X_i, X_j).
\]
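The covariance identities can be checked on a pair of dependent variables. The sketch below assumes a hypothetical joint pmf on {0, 1} x {0, 1}; it compares the covariance from the definition with E(XY) - E(X)E(Y), computes the correlation, and checks the formulas for V(X + Y) and V(X - Y).

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint pmf p(x, y) of two dependent r.vs.
joint = {(0, 0): Fraction(3, 10), (0, 1): Fraction(1, 10),
         (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10)}

def expect(g, joint):
    """E[g(X, Y)] for a discrete joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = expect(lambda x, y: x, joint)
ey = expect(lambda x, y: y, joint)
vx = expect(lambda x, y: (x - ex) ** 2, joint)
vy = expect(lambda x, y: (y - ey) ** 2, joint)

cov = expect(lambda x, y: (x - ex) * (y - ey), joint)       # definition
cov_shortcut = expect(lambda x, y: x * y, joint) - ex * ey  # E(XY) - E(X)E(Y)
rho = float(cov) / sqrt(float(vx) * float(vy))

v_sum = expect(lambda x, y: (x + y - (ex + ey)) ** 2, joint)
v_diff = expect(lambda x, y: (x - y - (ex - ey)) ** 2, joint)

print(cov, rho, v_sum == vx + vy + 2 * cov, v_diff == vx + vy - 2 * cov)
```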
15. Variance-Covariance Matrix Σ
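If the random vector $X$ and the mean $\mu$ are defined by
\[
X = \begin{pmatrix} X_1 \\ \vdots \\ X_k \end{pmatrix}, \qquad
\mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_k \end{pmatrix}
= \begin{pmatrix} E(X_1) \\ \vdots \\ E(X_k) \end{pmatrix}
\]
then the variance-covariance matrix $\Sigma$ is defined to be
\[
V(X) = \begin{pmatrix}
V(X_1) & \text{Cov}(X_1, X_2) & \cdots & \text{Cov}(X_1, X_k) \\
\text{Cov}(X_2, X_1) & V(X_2) & \cdots & \text{Cov}(X_2, X_k) \\
\vdots & \vdots & \ddots & \vdots \\
\text{Cov}(X_k, X_1) & \text{Cov}(X_k, X_2) & \cdots & V(X_k)
\end{pmatrix}
\]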
16. Theorem 7. If $a$ is a vector and $X$ is a random vector with mean $\mu$ and variance $\Sigma$ then $E(a^T X) = a^T \mu$ and $V(a^T X) = a^T \Sigma a$. If $A$ is a matrix then $E(AX) = A\mu$ and $V(AX) = A \Sigma A^T$.
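The vector case can be illustrated without any linear-algebra library. This is a minimal sketch, assuming a hypothetical joint pmf for a two-dimensional random vector and an arbitrary vector `a`: it builds the mean vector and the variance-covariance matrix from the pmf, then compares the moments of the scalar a^T X with the formulas a^T mu and a^T Sigma a.

```python
from fractions import Fraction

# Hypothetical joint pmf of a random vector X = (X1, X2).
joint = {(0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
         (1, 2): Fraction(1, 4), (2, 2): Fraction(1, 4)}
k = 2
a = [3, -1]  # arbitrary vector chosen for illustration

def expect(g):
    """E[g(X)] where g takes the whole vector x as a tuple."""
    return sum(g(x) * p for x, p in joint.items())

def a_dot(x):
    """The scalar a^T x."""
    return sum(a[i] * x[i] for i in range(k))

# Mean vector and variance-covariance matrix, entry by entry.
mu = [expect(lambda x, i=i: x[i]) for i in range(k)]
sigma = [[expect(lambda x, i=i, j=j: (x[i] - mu[i]) * (x[j] - mu[j]))
          for j in range(k)] for i in range(k)]

# Moments of a^T X computed directly from the pmf.
mean_direct = expect(a_dot)
var_direct = expect(lambda x: (a_dot(x) - mean_direct) ** 2)

# The same quantities from the formulas a^T mu and a^T Sigma a.
mean_formula = sum(a[i] * mu[i] for i in range(k))
var_formula = sum(a[i] * sigma[i][j] * a[j] for i in range(k) for j in range(k))

print(mean_direct == mean_formula, var_direct == var_formula)
```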
17. Conditional Expectation
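Definition 4. The conditional expectation of $X$ given $Y = y$ is
\[
E(X \mid Y = y) =
\begin{cases}
\sum_x x f_{X|Y}(x \mid y) & \text{discrete case} \\
\int x f_{X|Y}(x \mid y) \, dx & \text{continuous case}
\end{cases}
\]
If $r(x, y)$ is a function of $x$ and $y$ then
\[
E(r(X, Y) \mid Y = y) =
\begin{cases}
\sum_x r(x, y) f_{X|Y}(x \mid y) & \text{discrete case} \\
\int r(x, y) f_{X|Y}(x \mid y) \, dx & \text{continuous case}
\end{cases}
\]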
18. • E(X) is a number but E(X|Y = y) is a function of y.
• Before observing Y , we don’t know the value of E(X|Y = y) so it is a
r.v which we denote E(X|Y ).
• E(X|Y ) is the r.v whose value is E(X|Y = y) when Y = y.
Theorem 8. For r.vs X and Y , assuming the expectations exist, we have
that
E[E(Y |X)] = E(Y ) and E[E(X|Y )] = E(X)
More generally, for any function r(x, y) we have
E[E(r(X, Y )|X)] = E(r(X, Y )) and E[E(r(X, Y )|Y )] = E(r(X, Y ))
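The rule of iterated expectations can be traced step by step on a discrete example. The sketch below assumes a hypothetical joint pmf; `conditional_mean_of_x` returns E(X|Y = y) as a function of y, and `iterated_expectation` averages those values over the distribution of Y.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y).
joint = {(1, 0): Fraction(1, 8), (2, 0): Fraction(3, 8),
         (1, 1): Fraction(1, 4), (3, 1): Fraction(1, 4)}

def conditional_mean_of_x(joint, y0):
    """E(X | Y = y0): sum of x * p(x, y0) divided by P(Y = y0)."""
    p_y = sum(p for (_, y), p in joint.items() if y == y0)
    return sum(x * p for (x, y), p in joint.items() if y == y0) / p_y

def iterated_expectation(joint):
    """E[E(X | Y)]: average the conditional means over the pmf of Y."""
    total = Fraction(0)
    for y0 in {y for _, y in joint}:
        p_y = sum(p for (_, y), p in joint.items() if y == y0)
        total += conditional_mean_of_x(joint, y0) * p_y
    return total

e_x = sum(x * p for (x, _), p in joint.items())  # E(X) from the joint pmf

print(conditional_mean_of_x(joint, 0), conditional_mean_of_x(joint, 1))
print(iterated_expectation(joint), e_x)
```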
19. Definition 5. The conditional variance is defined as
\[
V(Y \mid X = x) = \int (y - \mu(x))^2 f(y \mid x) \, dy
\]
where µ(x) = E(Y |X = x).
Theorem 9. For r.vs X and Y ,
V (Y ) = EV (Y |X) + V E(Y |X).
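The decomposition of the variance can be checked exactly in the discrete case. This is a sketch under assumptions: the joint pmf is made up for illustration, and the integral in the definition of the conditional variance is replaced by the corresponding sum.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y).
joint = {(0, 0): Fraction(1, 6), (0, 2): Fraction(1, 6),
         (1, 1): Fraction(1, 3), (1, 5): Fraction(1, 3)}

def total_variance_parts(joint):
    """Return (E[V(Y|X)], V[E(Y|X)]) for a discrete joint pmf."""
    e_cond_var = e_m = e_m2 = Fraction(0)
    for x0 in {x for x, _ in joint}:
        p_x = sum(p for (x, _), p in joint.items() if x == x0)
        cond = {y: p / p_x for (x, y), p in joint.items() if x == x0}  # f(y|x0)
        m = sum(y * q for y, q in cond.items())             # mu(x0) = E(Y|X = x0)
        v = sum((y - m) ** 2 * q for y, q in cond.items())  # V(Y|X = x0)
        e_cond_var += p_x * v
        e_m += p_x * m
        e_m2 += p_x * m * m
    return e_cond_var, e_m2 - e_m**2

e_y = sum(y * p for (_, y), p in joint.items())
v_y = sum((y - e_y) ** 2 * p for (_, y), p in joint.items())
within, between = total_variance_parts(joint)

print(v_y, within, between)
```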
Example 11. Compute the variance of a binomial random variable X
with parameters n and p.
Example 12. If X and Y are independent random variables both uniformly
distributed on (0, 1), then calculate the probability density of X + Y .
20. Example 13. Let X and Y be independent Poisson random variables with
respective means λ1 and λ2. Calculate the distribution of X + Y.
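A numerical experiment can guide the calculation. The sketch below assumes illustrative means 1.5 and 2.5: it computes P(X + Y = k) by convolving the two Poisson pmfs and compares the result with a Poisson pmf of mean λ1 + λ2. It is a check of one candidate answer on specific numbers, not a derivation.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson r.v with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def pmf_of_sum(k, lam1, lam2):
    """P(X + Y = k) for independent X, Y, as the sum over j of P(X = j) P(Y = k - j)."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

lam1, lam2 = 1.5, 2.5  # illustrative means

# Largest discrepancy between the convolution and Poisson(lam1 + lam2) on k = 0..19.
max_gap = max(abs(pmf_of_sum(k, lam1, lam2) - poisson_pmf(k, lam1 + lam2))
              for k in range(20))
print(max_gap)
```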