Game Theory Lesson 34

Lesson 34 (KH, Section 11.4)
Introduction to Game Theory

Math 20

December 12, 2007

Announcements
Pset 12 due December 17 (last day of class)
next OH today 1–3 (SC 323)

Outline
Games and payoffs
Matching dice
Vaccination

The theorem of the day

Strictly determined games
Example: Network programming
Characteristics of an Equilibrium

Two-by-two strictly-determined games

Two-by-two non-strictly-determined games
Calculation
Example: Vaccination

Other

A Game of Chance

You and I each have a six-sided die
We roll and the loser pays the winner the difference in the numbers shown
If we play this a number of times, who's going to win?

The Payoff Matrix

Lists each player's outcomes versus the other's
Each $a_{ij}$ represents the payoff from C to R if outcomes i for R and j for C occur (a zero-sum game).

Rows are R's outcomes, columns are C's outcomes:

         1    2    3    4    5    6
    1    0   -1   -2   -3   -4   -5
    2    1    0   -1   -2   -3   -4
    3    2    1    0   -1   -2   -3
    4    3    2    1    0   -1   -2
    5    4    3    2    1    0   -1
    6    5    4    3    2    1    0

Expected Value
Let the probabilities of R's outcomes and C's outcomes be given by probability vectors

$$p = \begin{pmatrix} p_1 & p_2 & \cdots & p_n \end{pmatrix}, \qquad q = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{pmatrix}$$

The probability of R having outcome i and C having outcome j is therefore $p_i q_j$.
The expected value of R's payoff is

$$E(p, q) = \sum_{i,j=1}^{n} p_i a_{ij} q_j = pAq$$

This is a "fair game" if the dice are fair.

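Expected value of this game

$$pAq = \begin{pmatrix} 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{pmatrix}
\begin{pmatrix}
0 & -1 & -2 & -3 & -4 & -5 \\
1 & 0 & -1 & -2 & -3 & -4 \\
2 & 1 & 0 & -1 & -2 & -3 \\
3 & 2 & 1 & 0 & -1 & -2 \\
4 & 3 & 2 & 1 & 0 & -1 \\
5 & 4 & 3 & 2 & 1 & 0
\end{pmatrix}
\begin{pmatrix} 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \end{pmatrix}$$

$$= \begin{pmatrix} 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{pmatrix}
\begin{pmatrix} -15/6 \\ -9/6 \\ -3/6 \\ 3/6 \\ 9/6 \\ 15/6 \end{pmatrix} = 0$$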

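Expected value with an unfair die
Suppose $p = \begin{pmatrix} 1/10 & 1/10 & 1/5 & 1/5 & 1/5 & 1/5 \end{pmatrix}$. Then

$$pAq = \begin{pmatrix} 1/10 & 1/10 & 1/5 & 1/5 & 1/5 & 1/5 \end{pmatrix}
\begin{pmatrix}
0 & -1 & -2 & -3 & -4 & -5 \\
1 & 0 & -1 & -2 & -3 & -4 \\
2 & 1 & 0 & -1 & -2 & -3 \\
3 & 2 & 1 & 0 & -1 & -2 \\
4 & 3 & 2 & 1 & 0 & -1 \\
5 & 4 & 3 & 2 & 1 & 0
\end{pmatrix}
\begin{pmatrix} 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \\ 1/6 \end{pmatrix}$$

$$= \frac{1}{10} \cdot \frac{1}{6} \begin{pmatrix} 1 & 1 & 2 & 2 & 2 & 2 \end{pmatrix}
\begin{pmatrix} -15 \\ -9 \\ -3 \\ 3 \\ 9 \\ 15 \end{pmatrix} = \frac{24}{60} = \frac{2}{5}$$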
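
The two expected values can be checked numerically. The following is a minimal sketch, assuming the dice payoff matrix $a_{ij} = i - j$ from the slides; the function name `expected_value` and the variable names are illustrative, not part of the lecture. Exact fractions are used so the results match the hand computation.

```python
from fractions import Fraction as F

def expected_value(p, A, q):
    """Return E(p, q) = sum over i, j of p[i] * A[i][j] * q[j]."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))

# Dice game: R's payoff is R's roll minus C's roll.
A = [[i - j for j in range(1, 7)] for i in range(1, 7)]

fair = [F(1, 6)] * 6
loaded = [F(1, 10), F(1, 10), F(1, 5), F(1, 5), F(1, 5), F(1, 5)]

print(expected_value(fair, A, fair))    # 0
print(expected_value(loaded, A, fair))  # 2/5
```

With the loaded die, R gains 2/5 of a unit per game on average against a fair die.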

Strategies

What if we could choose a die to be as biased as we wanted?
In other words, what if we could choose a strategy p for this game?
Clearly, we'd want to get a 6 all the time!

Flu Vaccination

Suppose there are two flu strains, and we have two flu vaccines to combat them.
We don't know the distribution of strains
Neither pure strategy is the clear favorite
Is there a combination of vaccines (a mixed strategy) that maximizes total immunity of the population?

Payoff matrix (rows: vaccine, columns: strain):

              Strain 1   Strain 2
    Vacc 1      0.85       0.70
    Vacc 2      0.60       0.90

Theorem (Fundamental Theorem of Zero-Sum Games)
There exist optimal strategies $p^*$ for R and $q^*$ for C such that for all strategies p and q:

$$E(p^*, q) \ge E(p^*, q^*) \ge E(p, q^*)$$

$E(p^*, q^*)$ is called the value v of the game.

Reflect on the inequality

$$E(p^*, q) \ge E(p^*, q^*) \ge E(p, q^*)$$

In other words,
$E(p^*, q) \ge E(p^*, q^*)$: R can guarantee a lower bound on his/her payoff
$E(p^*, q^*) \ge E(p, q^*)$: C can guarantee an upper bound on how much he/she loses
This value could be negative, in which case C has the advantage

Fundamental problem of zero-sum games

Find the $p^*$ and $q^*$!
The general case we'll look at next time (hard-ish)
There are some games in which we can find optimal strategies now:
Strictly-determined games
2 × 2 non-strictly-determined games

Example: Network programming

Suppose we have two networks, NBC and CBS
Each chooses which program to show in a certain time slot
Viewer share varies depending on these combinations
How can NBC get the most viewers?

The payoff matrix and strategies

Entries are NBC's viewer share; rows are NBC's programs, columns are CBS's programs.

                       60 Minutes   Survivor   CSI   Yes, Dear
    My Name is Earl        60          20       30       55
    Dateline               50          75       45       60
    Law & Order            70          45       35       30


What is NBC’s strategy?

NBC wants to maximize NBC’s minimum share

In airing Dateline, NBC’s share is at least 45
This is a good strategy for NBC

What is CBS’s strategy?

CBS wants to minimize NBC’s maximum share

In airing CSI, CBS keeps NBC’s share no bigger than 45
This is a good strategy for CBS

Equilibrium
(Dateline, CSI) is an equilibrium pair of strategies
Assuming NBC airs Dateline, CBS's best choice is to air CSI, and vice versa
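
The maximin and minimax reasoning for the two networks can be sketched in a few lines. This is an illustrative sketch, not part of the lecture: the variable names are hypothetical, and the only column label it relies on is that CSI is the third column (index 2), which is what the slide statements about a share of 45 imply.

```python
# NBC's viewer share; rows are NBC's programs, columns are CBS's programs.
nbc_programs = ["My Name is Earl", "Dateline", "Law & Order"]
A = [[60, 20, 30, 55],
     [50, 75, 45, 60],
     [70, 45, 35, 30]]

# NBC: for each row, the worst share it could get; then pick the best of those.
row_minima = [min(row) for row in A]
maximin = max(row_minima)
nbc_choice = row_minima.index(maximin)

# CBS: for each column, the best share NBC could get; then pick the smallest.
col_maxima = [max(col) for col in zip(*A)]
minimax = min(col_maxima)
cbs_choice = col_maxima.index(minimax)

print(nbc_programs[nbc_choice], maximin)  # Dateline 45
print(cbs_choice, minimax)                # 2 45  (column index 2 is CSI)
```

Because the maximin and the minimax agree at 45, neither network can do better by switching unilaterally.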

Characteristics of an Equilibrium

Let A be a payoff matrix. A saddle point is an entry $a_{rs}$ which is the minimum entry in its row and the maximum entry in its column.
A game whose payoff matrix has a saddle point is called strictly determined.
Payoff matrices can have multiple saddle points
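
The definition translates directly into a search over the entries of A. The sketch below is one straightforward way to do it; the function name `saddle_points` is hypothetical, the indices it returns are 0-based, and the small test matrices are made up for illustration.

```python
def saddle_points(A):
    """Return all 0-based (row, column) pairs whose entry is both the
    minimum of its row and the maximum of its column."""
    col_max = [max(col) for col in zip(*A)]
    return [(r, s)
            for r, row in enumerate(A)
            for s, entry in enumerate(row)
            if entry == min(row) and entry == col_max[s]]

print(saddle_points([[1, 3], [2, 4]]))  # [(1, 0)]
print(saddle_points([[2, 3], [4, 1]]))  # []
print(saddle_points([[3, 3], [1, 2]]))  # [(0, 0), (0, 1)]
```

The last matrix shows a game with more than one saddle point; the middle one is not strictly determined.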

Pure Strategies are optimal in Strictly-Determined Games
Theorem
Let A be a payoff matrix. If $a_{rs}$ is a saddle point, then $e_r$ is an optimal strategy for R and $e_s$ is an optimal strategy for C.
Proof.
If q is a strategy for C, then, since $a_{rs}$ is the minimum entry in row r,

$$E(e_r, q) = e_r A q = \sum_{j=1}^{n} a_{rj} q_j \ge \sum_{j=1}^{n} a_{rs} q_j = a_{rs} = E(e_r, e_s)$$

If p is a strategy for R, then, since $a_{rs}$ is the maximum entry in column s,

$$E(p, e_s) = p A e_s = \sum_{i=1}^{m} p_i a_{is} \le \sum_{i=1}^{m} p_i a_{rs} = a_{rs} = E(e_r, e_s)$$

So for any p and q, we have

$$E(e_r, q) \ge E(e_r, e_s) \ge E(p, e_s)$$

Finding equilibria by gravity

$$A = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}$$

If C chose strategy 2, and R knew it, R would definitely choose 2
This would make C choose strategy 1
but (2, 1) is an equilibrium, a saddle point.

$$A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}$$

Here (1, 1) is an equilibrium position; starting from there neither player would want to deviate from this.

$$A = \begin{pmatrix} 2 & 3 \\ 4 & 1 \end{pmatrix}$$

What about this one?
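
The "gravity" argument amounts to letting the players take turns choosing best replies. A rough sketch of that process follows; the function name `follow_best_responses` and the `max_rounds` cutoff are assumptions made for illustration, and indices are 0-based, so the slides' position (2, 1) appears as `(1, 0)`.

```python
def follow_best_responses(A, col, max_rounds=10):
    """Alternate best replies, starting from C playing column `col`.

    Returns ((row, col), True) if play settles at an equilibrium pair,
    or (None, False) if nothing is stable within max_rounds.
    """
    for _ in range(max_rounds):
        # R's best reply: the row with the largest payoff in this column.
        row = max(range(len(A)), key=lambda i: A[i][col])
        # C's best reply: the column with the smallest payoff in that row.
        new_col = min(range(len(A[0])), key=lambda j: A[row][j])
        if new_col == col:
            return (row, col), True
        col = new_col
    return None, False

print(follow_best_responses([[1, 3], [2, 4]], col=1))  # ((1, 0), True)
print(follow_best_responses([[2, 3], [1, 4]], col=0))  # ((0, 0), True)
print(follow_best_responses([[2, 3], [4, 1]], col=0))  # (None, False)
```

In the third matrix the best replies chase each other in a cycle, which is why no pure strategy pair is an equilibrium there.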

Calculation

For a 2 × 2 game, write $p = \begin{pmatrix} p_1 & 1 - p_1 \end{pmatrix}$ and $q = \begin{pmatrix} q_1 \\ 1 - q_1 \end{pmatrix}$. In this case we can compute E(p, q) by hand in terms of $p_1$ and $q_1$:

$$E(p, q) = p_1 a_{11} q_1 + p_1 a_{12} (1 - q_1) + (1 - p_1) a_{21} q_1 + (1 - p_1) a_{22} (1 - q_1)$$

The critical points are when

$$0 = \frac{\partial E}{\partial p_1} = a_{11} q_1 + a_{12} (1 - q_1) - a_{21} q_1 - a_{22} (1 - q_1)$$

$$0 = \frac{\partial E}{\partial q_1} = p_1 a_{11} - p_1 a_{12} + (1 - p_1) a_{21} - (1 - p_1) a_{22}$$

So

$$q_1 = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{21} - a_{12}} \qquad p_1 = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{21} - a_{12}}$$

These are in between 0 and 1 if there are no saddle points in the matrix.

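Examples

If $A = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}$, then $p_1 = \frac{2}{0}$? Doesn't work because A has a saddle point.
If $A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}$, $p_1 = \frac{3}{2}$? Again, doesn't work.
If $A = \begin{pmatrix} 2 & 3 \\ 4 & 1 \end{pmatrix}$, $p_1 = \frac{-3}{-4} = 3/4$, while $q_1 = \frac{-2}{-4} = 1/2$. So R should pick 1 3/4 of the time and 2 the rest, while C should pick 1 half the time and 2 the other half.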

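Further Calculations

Also

$$\frac{\partial^2 E}{\partial p_1^2} = 0 \qquad \frac{\partial^2 E}{\partial q_1^2} = 0$$

while the mixed partial $\frac{\partial^2 E}{\partial p_1 \partial q_1} = a_{11} + a_{22} - a_{21} - a_{12}$ is nonzero.
So this is a saddle point!
Finally,

$$E(p, q) = \frac{a_{11} a_{22} - a_{12} a_{21}}{a_{11} + a_{22} - a_{21} - a_{12}}$$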
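
The closed-form expressions for $p_1$, $q_1$ and the value can be collected into one small routine. This is a sketch under the slides' assumption that the 2 × 2 matrix has no saddle point; the function name `solve_2x2` and its error handling are illustrative choices, not something the lecture prescribes.

```python
from fractions import Fraction

def solve_2x2(A):
    """Return (p1, q1, v) for a 2x2 zero-sum game with no saddle point."""
    (a11, a12), (a21, a22) = A
    d = a11 + a22 - a21 - a12
    if d == 0:
        raise ValueError("denominator is zero; the formulas do not apply")
    p1 = Fraction(a22 - a21, d)   # probability that R plays row 1
    q1 = Fraction(a22 - a12, d)   # probability that C plays column 1
    if not (0 < p1 < 1 and 0 < q1 < 1):
        raise ValueError("p1 or q1 is outside (0, 1); look for a saddle point")
    v = Fraction(a11 * a22 - a12 * a21, d)
    return p1, q1, v

p1, q1, v = solve_2x2([[2, 3], [4, 1]])
print(p1, q1, v)  # 3/4 1/2 5/2
```

For the two matrices with saddle points, `[[1, 3], [2, 4]]` and `[[2, 3], [1, 4]]`, the routine raises `ValueError`, mirroring the "doesn't work" cases.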

Example: Vaccination

              Strain 1   Strain 2
    Vacc 1      0.85       0.70
    Vacc 2      0.60       0.90

We have

$$p_1 = \frac{0.9 - 0.6}{0.85 + 0.9 - 0.6 - 0.7} = \frac{2}{3}$$

$$q_1 = \frac{0.9 - 0.7}{0.85 + 0.9 - 0.6 - 0.7} = \frac{4}{9}$$

$$v = \frac{(0.85)(0.9) - (0.6)(0.7)}{0.85 + 0.9 - 0.6 - 0.7} \approx 0.767$$

We should give 2/3 of the population vaccine 1 and the rest vaccine 2
The worst case scenario is a 4 : 5 distribution of strains
We'll still cover 76.7% of the population
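
One way to sanity-check the vaccination answer is to confirm that the 2/3 to 1/3 mix gives the same coverage whichever strain appears. The sketch below does this with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction as F

# Rows: vaccine 1, vaccine 2. Columns: strain 1, strain 2.
A = [[F(85, 100), F(70, 100)],
     [F(60, 100), F(90, 100)]]
p = [F(2, 3), F(1, 3)]   # share of the population given each vaccine

# Coverage of the mixed strategy against each pure strain.
coverage = [sum(p[i] * A[i][j] for i in range(2)) for j in range(2)]

# What each pure strategy can guarantee in the worst case.
pure_guarantee = [min(row) for row in A]

print(coverage)                        # [Fraction(23, 30), Fraction(23, 30)]
print(float(coverage[0]))              # 0.7666666666666667
print([float(g) for g in pure_guarantee])  # [0.7, 0.6]
```

Both strains are covered at 23/30, about 76.7%, which beats the 70% that vaccine 1 alone can guarantee.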

Other Applications of GT

War: the Battle of the Bismarck Sea
Business: product introduction, pricing
Dating
