1. The Minority Game:
Individual and Social Learning
Stathis Grigoropoulos
under the supervision of Dr. Matthijs Ruijgrok
MSc Mathematics, Scientific Computing
March 27, 2014
2. Overview
1 Introduction
2 Evolutionary Game Theory (EGT)
Rationality
Learning
EGT - Nash Equilibrium
3 Revision Protocol
4 Minority Game
Stage Game
Congestion Game
Individual Learning
Social Learning
5 Conclusion
3. Introduction
• Learning has received much attention in the Artificial Intelligence (AI) and Game Theory (GT) disciplines, as it is the key to intelligent and rational behavior.
• What types of learning processes can we identify in a context with many interacting agents?
• What mathematics can model such learning?
• What do the agents learn to play, i.e. what are the learning outcomes?
4. EGT - Rationality
• Classic Game Theory (GT) assumes perfect rationality.
• The players know all the details of the game, including each other's preferences over all possible outcomes.
• The richness and depth of the analysis is tremendous.
• However, it can handle only a small number of heterogeneous agents.
• Humans do not always play rationally:
• they may lack details of the game and information about the other players,
• they follow "rules of thumb".
5. EGT - Rationality
• How do you choose the route back home to avoid
congestion?
6. EGT - Learning
• Evolutionary Game Theory (EGT), motivated by biology, imagines the game to be repeatedly played by boundedly rational players randomly drawn from a population.
• The players make no assumptions about the other players' strategies.
• Each agent has a predefined strategy for the game (like a "rule of thumb").
• Agents have the opportunity to adapt through an evolutionary process over time and change their behavior.
• This can happen through reproduction or imitation ⇒ Learning!
7. EGT - Learning
• Individual Learning
• In individual learning, an agent's own successes and failures directly influence its choices and behavior.
• From an AI perspective, individual learning is interpreted as a form of reinforcement learning.
8. EGT - Learning
• Social Learning
• Social learning occurs when the successes and failures of other players influence an agent's choice probabilities.
• Social Learning ⇒ Social Imitation.
9. EGT - Nash Equilibrium
• Thankfully, we still have the Nash Equilibrium.
10. EGT - Nash Equilibrium
• We consider a game played by a single population, where agents play equivalent roles.
• Let there be N players, each of whom takes a pure strategy from the set $S = \{1, \ldots, n\}$.
• We call population state x an element of the simplex $X = \{x \in \mathbb{R}^n_+ : \sum_{j \in S} x_j = 1\}$, with $x_j$ the fraction of agents playing strategy j.
• A population game is identified by a continuous vector-valued function that maps population states to payoffs, i.e. $F : X \to \mathbb{R}^n$.
• The payoff of strategy i when the population state is x is the scalar $F_i(x)$.
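To make these objects concrete, here is a minimal Python sketch (not part of the original slides) of a two-strategy population game: a state x on the simplex and a payoff function F mapping states to payoff vectors. The congestion-flavored payoff $1 - x_j$ is an illustrative assumption.

```python
import numpy as np

# A population state x lives on the simplex: x >= 0, sum(x) == 1.
# A population game is just a map F : X -> R^n.

def F(x: np.ndarray) -> np.ndarray:
    """Illustrative payoffs: a strategy earns less the more it is used."""
    return 1.0 - x

x = np.array([0.7, 0.3])   # 70% of the population plays strategy 1
print(F(x))                # -> [0.3 0.7]: the less-used strategy pays more
```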
11. EGT - Nash Equilibrium
Population state $x^*$ is a Nash Equilibrium of F when no agent can benefit by unilaterally switching from strategy i to strategy j. Specifically, $x^*$ is a NE if and only if:
$$F_i(x^*) \ge F_j(x^*) \quad \forall j \in S, \;\forall i \in S \text{ s.t. } x^*_i > 0.$$
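A numerical check of this condition — a sketch using the toy payoff function from the previous example; the tolerance `eps` is my own choice:

```python
import numpy as np

def is_nash(x: np.ndarray, F, eps: float = 1e-9) -> bool:
    """True iff every strategy in use earns (within eps) the best payoff."""
    pi = F(x)
    best = pi.max()
    return all(p >= best - eps for p, xi in zip(pi, x) if xi > 0)

F = lambda x: 1.0 - x                    # the toy payoff from before
print(is_nash(np.array([0.5, 0.5]), F))  # True: both strategies earn 0.5
print(is_nash(np.array([0.7, 0.3]), F))  # False: strategy 1 earns only 0.3
```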
12. Revision Protocol
• In the population games introduced so far, agents are matched randomly and play their strategies, producing their respective payoffs.
• However, population games can also cover cases where all the players take part in the game at once.
• The foundations of population learning dynamics are built upon a notion called the revision protocol:
• how individuals choose to update their strategy.
Definition
A revision protocol is a map $\rho : \mathbb{R}^n \times X \to \mathbb{R}^{n \times n}_+$ that takes as input a payoff vector $\pi$ and a population state x and returns a non-negative matrix as output.
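In code, a revision protocol is just a function from $(\pi, x)$ to a non-negative matrix of switch rates; it induces the mean dynamic $\dot{x}_i = \sum_j x_j \rho_{ji} - x_i \sum_j \rho_{ij}$ (inflow into strategy i minus outflow from it), cf. [7]. A sketch with names of my own choosing:

```python
import numpy as np

def mean_dynamic(x: np.ndarray, F, rho) -> np.ndarray:
    """x_i' = sum_j x_j rho_ji(pi, x) - x_i sum_j rho_ij(pi, x)."""
    pi = F(x)
    r = rho(pi, x)                  # n x n non-negative switch-rate matrix
    inflow = x @ r                  # inflow_i  = sum_j x_j * r[j, i]
    outflow = x * r.sum(axis=1)     # outflow_i = x_i * sum_j r[i, j]
    return inflow - outflow
```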
13. Revision Protocol - Social Learning
• Commonly, social imitation is modeled through the revision protocol called the proportional imitation protocol, $\rho_{ij}(\pi, x) = x_j [\pi_j - \pi_i]_+$ [7].
• Simply: imitate "the first man you meet on the street" with a probability proportional to your score difference.
• This protocol generates the well-studied replicator dynamic
$$\dot{x}_i = x_i \hat{F}_i(x), \qquad (1)$$
with $\hat{F}_i(x) = F_i(x) - \bar{F}(x)$ and $\bar{F}(x) = \sum_{i \in S} x_i F_i(x)$ [7].
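A quick numerical confirmation that proportional imitation generates the replicator dynamic (1) — a sketch reusing the toy payoff function; the chosen state is arbitrary:

```python
import numpy as np

def rho_pi(pi: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Proportional imitation: rho_ij = x_j * [pi_j - pi_i]_+."""
    return x[None, :] * np.maximum(pi[None, :] - pi[:, None], 0.0)

F = lambda x: 1.0 - x
x = np.array([0.2, 0.3, 0.5])
pi = F(x)
r = rho_pi(pi, x)
xdot_mean = x @ r - x * r.sum(axis=1)     # mean dynamic of the protocol
xdot_repl = x * (pi - x @ pi)             # replicator: x_i (F_i(x) - Fbar(x))
print(np.allclose(xdot_mean, xdot_repl))  # -> True
```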
14. Revision Protocol - Replicator Equation
• The replicator equation is a famous deterministic dynamic in EGT.
• The percentage growth rate $\dot{x}_i / x_i$ of each strategy in use is equal to its excess payoff.
• The attracting rest points of the replicator equation are Nash Equilibria (NE) of the game [9].
• Boundedly rational agents discover NE through learning.
15. Revision Protocol - Individual Learning
• Suppose we have n populations playing an n-player game.
• At each round, a random agent from each population is selected and they play the game.
• Applying the pairwise proportional imitation revision protocol $\rho_{hk}(\pi, x^i) = x^i_k [\pi_k - \pi_h]_+$ for each population $i \in N$ with population state $x^i$,
• we get the multi-population replicator dynamics [7]:
$$\dot{x}^i_h = x^i_h \hat{F}_h(x^i), \qquad \forall h \in S, \;\forall i \in N. \qquad (2)$$
16. Revision Protocol - Individual Learning
• Each population represents a player.
• Each agent within a population represents a voice or opinion inside that player.
• Through the imitation process, the voices achieving higher scores get reinforced ⇒ Reinforcement Learning.
17. Minority Game
• The Minority Game is a famous congestion game.
• Congestion games model situations where the payoff of each player depends on his choice of resources and on the number of players choosing the same resource [5].
• Route choice in a road network
• Selfish packet routing on the Internet
• Congestion games are potential games:
• there exists a single scalar-valued function that characterizes the game [6, p. 53].
18. Minority Game
• Consider an odd population of N agents competing in a repeated one-shot game ($N = 2k + 1$, $k \ge 1$) where communication is not allowed.
• At each time step (round) t of the game, every agent has to choose between two possible actions, $\alpha_i(t) \in \{-1, 1\}$.
• The minority choice wins the round, and all the winning agents are rewarded one point; the others get nothing.
• By construction, the MG is a negative-sum game, as the winners are always fewer than the losers.
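One round of the game in code — a minimal sketch; N and the seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 101                                # any odd N = 2k + 1
actions = rng.choice([-1, 1], size=N)  # each agent picks an action
total = actions.sum()                  # never zero, since N is odd
payoffs = (actions != np.sign(total)).astype(int)  # minority side scores 1
print(f"attendance {total:+d}, winners {payoffs.sum()} of {N}")
```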
20. Minority Game - Stage Game
Definition
Define the Minority stage game as the one-shot strategic game $\Gamma = \langle N, \Delta, \pi_i \rangle$ consisting of:
• a set of players $N = \{1, \ldots, 2k + 1\}$, $k \in \mathbb{N}$, indexed by i,
• a finite set of strategies $A_i = \{-1, 1\}$, where $\alpha_i$ denotes the strategy of player i,
• the set of mixed strategies of player i, denoted by $\Delta(A_i)$, and
• a payoff function $\pi_i : A_i \times A_{-i} \to \{0, 1\}$, where $\alpha_{-i} = (\alpha_j)_{j \ne i}$. More formally,
$$\pi_i = \begin{cases} 1 & \text{if } -\alpha_i \sum_{j=1}^{N} \alpha_j \ge 0 \\ 0 & \text{otherwise.} \end{cases} \qquad (3)$$
21. Minority Game - Stage Game
• Some details...
• We say agent i has mixed strategy $\alpha_i$:
• we think of a probability $p \in [0, 1]$ that agent i plays action 1,
• and probability $(1 - p)$ that he plays action $-1$.
• Pure strategy: $p \in \{0, 1\}$.
• Mixed strategy profile $\alpha$:
• all agents' mixed strategies in a vector ⇒ population state.
• The agents playing a strictly mixed strategy are called mixers.
22. Minority Game - Stage Game
• The "inversion" symmetry $\pi_i(-\alpha_i, -\alpha_{-i}) = \pi_i(\alpha_i, \alpha_{-i})$ ⇒ the two actions are a priori equivalent [2].
We have three general types of Nash equilibria, namely:
• Pure strategy Nash equilibria, i.e. when all players play a pure strategy.
• Symmetric mixed strategy Nash equilibria, that is, all agents choose the same mixed strategy.
• Asymmetric mixed strategy Nash equilibria, specifically the NE where some players choose a pure strategy and the rest a mixed strategy.
23. Minority Game - Stage Game
Proposition
The number of pure strategy Nash equilibria in the stage game of the original MG is $2\binom{N}{(N-1)/2}$.
Proof.
The binomial coefficient counts the ways $(N-1)/2$ players can be chosen out of the set of N players to form the minority side playing $-1$; the factor 2 accounts for the symmetric equilibria in which the chosen players form the minority side playing 1. ♦
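A brute-force check of the proposition for small N — my own verification code, using payoff rule (3) directly:

```python
from itertools import product
from math import comb

def is_pure_nash(profile) -> bool:
    """No player gains by unilaterally flipping its action."""
    s = sum(profile)
    for a in profile:
        wins_now = -a * s >= 0           # payoff rule (3)
        wins_dev = a * (s - 2 * a) >= 0  # deviator plays -a; new sum is s - 2a
        if wins_dev and not wins_now:
            return False
    return True

for N in (3, 5, 7):
    count = sum(is_pure_nash(p) for p in product([-1, 1], repeat=N))
    print(N, count, 2 * comb(N, (N - 1) // 2))  # the two counts agree
```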
24. Minority Game - Stage Game
Lemma
Let $\alpha \in \times_{i \in N} \Delta(A_i)$ be a Nash equilibrium with a non-empty set of mixers. Then all mixers use the same mixed strategy [10, 1].
Asymmetric mixed strategy Nash equilibria:
• with one mixer,
• with more than one mixer.
Unique symmetric mixed NE:
• all players choose each of the two actions with probability $p = 1/2$.
25. Minority Game - Congestion Game
The two choices available to the agents can represent two distinct resources in a congestion game.
A congestion model $(N, M, (A_i)_{i \in N}, (c_j)_{j \in M})$ is described as follows:
• N is the set of players.
• $M = \{1, \ldots, m\}$ is the set of resources.
• $A_i$ is the set of strategies of player i, where each $a_i \in A_i$ is a non-empty set of resources.
• For $j \in M$, $c_j \in \mathbb{R}^n$ denotes the vector of benefits, where $c_{jk}$ is the cost (or payoff) to each user of resource j if there are exactly k players using that resource.
• A population state is a vector $a = \alpha = (\alpha_1, \ldots, \alpha_{2k+1})$, a point in the polyhedron $\Delta(A)$ of mixed strategy profiles.
26. Minority Game - Congestion Game
• The overall cost function for player i [8, p. 174]:
$$C_i = \sum_{j \in a_i} c_j(\sigma_j(a)) = -\pi_i(a). \qquad (4)$$
Definition
A function $P : A \to \mathbb{R}$ is a potential of the game G if $\forall a \in A$, $\forall a_i, b_i \in A_i$:
$u_i(b_i, a_{-i}) - u_i(a_i, a_{-i}) = P(b_i, a_{-i}) - P(a_i, a_{-i})$ [3].
• Skipping a lot of nice math...
Proposition
An exact potential of the MG is the sum of the payoffs of all $N = 2k + 1$ players:
$$P(a) = \sum_{i=1}^{N} u_i(a) \qquad \forall a \in A. \qquad (5)$$
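An exhaustive check that (5) is an exact potential for the three-player MG — a verification sketch of my own, not from the slides:

```python
from itertools import product

def payoff(i: int, a: tuple) -> int:
    """Payoff rule (3): win iff your action opposes the total attendance."""
    return 1 if -a[i] * sum(a) >= 0 else 0

def P(a: tuple) -> int:
    return sum(payoff(i, a) for i in range(len(a)))  # potential (5)

N = 3
checks = []
for a in product([-1, 1], repeat=N):
    for i in range(N):
        b = a[:i] + (-a[i],) + a[i + 1:]   # unilateral deviation of player i
        checks.append(payoff(i, b) - payoff(i, a) == P(b) - P(a))
print(all(checks))  # -> True: deviation gains match potential changes exactly
```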
27. Minority Game - Individual Learning
• Let $N = \{1, \ldots, 2k + 1\}$ be a set of populations, with each population representing a player i in the MG.
• A population state is a vector $a = \alpha = (\alpha_1, \ldots, \alpha_{2k+1})$, a point in the polyhedron $\Delta(A)$ of mixed strategy profiles.
• Each component $\alpha_i$ is a point in the simplex $\Delta(A_i)$, denoting the proportions of agents programmed to play each pure strategy $a_i \in A_i$.
• Agents, one from each population, are continuously drawn uniformly at random from these populations to play the MG.
• Time t is continuous.
$$\forall i \in N, \;\forall a_i \in A_i: \quad \dot{\alpha}_i(a_i) = \alpha_i(a_i)\big(u_i(a_i, \alpha_{-i}) - u_i(\alpha_i, \alpha_{-i})\big). \qquad (6)$$
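To see the dynamics at work, here is a rough Euler integration of (6) for the three-player MG (step size, horizon, and seed are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# p[i] = probability that player i plays action +1 (the x, y, z of slide 33).
p = rng.random(3)
dt = 0.01
for _ in range(100_000):
    x, y, z = p
    # payoff of playing +1 minus payoff of playing -1, for each player
    gain = np.array([(1 - y) * (1 - z) - y * z,
                     (1 - x) * (1 - z) - x * z,
                     (1 - x) * (1 - y) - x * y])
    p += dt * p * (1 - p) * gain   # two-action replicator, one per player
print(np.round(p, 3))  # typically two players pure and opposed, one mixer (T1)
```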
28. Minority Game - Individual Learning
• A population state $\alpha \in \times_{i \in N} \Delta(A_i)$ is a stationary state of the replicator dynamics (6) when $\dot{\alpha}_i(a_i) = 0$ $\forall i \in N$, $\forall a_i \in A_i$.
• The set T of stationary states can be partitioned into three subsets [1]:
T1: the connected set of Nash equilibria with at most one mixer,
T2: Nash equilibria with more than one mixer, and
T3: non-equilibrium profiles of the type $(l, r, \lambda)$: l players play pure strategy $-1$ and r players play pure strategy 1, with $l + r \le 2k + 1$; if $l + r < 2k + 1$, the remaining players mix with a common probability $\lambda \in (0, 1)$.
30. Minority Game - Individual Learning
Concretely,
• A population state $\alpha \in \times_{i \in N} \Delta(A_i)$ is Lyapunov stable if every neighborhood B of $\alpha$ contains a neighborhood $B_0$ of $\alpha$ such that $\psi(t, \alpha_0) \in B$ for every $\alpha_0 \in B_0 \cap \times_{i \in N} \Delta(A_i)$ and $t \ge 0$.
• A stationary state is asymptotically stable if it is Lyapunov stable and, in addition, there exists a neighborhood $B^*$ with $\lim_{t \to \infty} \psi(t, \alpha_0) = \alpha$ $\forall \alpha_0 \in B^* \cap \times_{i \in N} \Delta(A_i)$.
31. Minority Game - Individual Learning
Using the potential function (5) in the replicator equations (6):
$$\forall i \in N, \;\forall a_i \in A_i: \quad \dot{\alpha}_i(a_i) = \alpha_i(a_i)\big(U(a_i, \alpha_{-i}) - U(\alpha_i, \alpha_{-i})\big). \qquad (7)$$
Proposition
The potential function U of the minority game is a Lyapunov function for the replicator dynamic: for each trajectory $(\alpha(t))_{t \in [0, \infty)}$, we have $\frac{dU}{dt} \ge 0$. Equality holds at the stationary states [1].
Proposition
The collection of Nash equilibria with at most one mixer in T1 is asymptotically stable under the replicator dynamics. Moreover, stationary states in T2 and T3 are not Lyapunov stable [1].
32. Minority Game - Individual Learning
• The Nash equilibria with one mixer are global maxima of the potential function U.
• The potential function is the sum of the utilities of the players ⇒ maximization of utility.
• The symmetric NE of the MG is not Lyapunov stable.
33. Minority Game - Individual Learning
Three players in the MG:
We define $x, y, z \in [0, 1]$ as the probabilities for players 1, 2, 3 respectively to play strategy $a_i = 1$.

A3 = −1:
A1\A2    −1     1
 −1     000   010
  1     100   001

A3 = 1:
A1\A2    −1     1
 −1     001   100
  1     010   000

Table: Payoff matrix of the three-player MG. A1, A2 and A3 denote agents 1, 2, 3 respectively, with actions {−1, 1}. The utility matrix is split into two submatrices using agent A3's action as a divider. The payoffs are listed in player order, e.g. payoff 010 means payoff 0 for A1, 1 for A2 and 0 for A3.
34. Minority Game - Individual Learning
$$U(x, y, z) = u_1 + u_2 + u_3 = x + y + z - xy - yz - zx \qquad (8)$$
$$\max_X U(x, y, z) = 1 \qquad (9)$$
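For the record, the algebra behind (8) and (9) — my own derivation, consistent with the payoff table on slide 33:

```latex
\begin{align*}
u_1 &= x(1-y)(1-z) + (1-x)yz \quad\text{(player 1 alone on its side), and cyclically for } u_2, u_3,\\
U &= u_1 + u_2 + u_3 = x + y + z - xy - yz - zx,\\
\max_{[0,1]^3} U &= 1, \quad\text{attained e.g. at } (x, y, z) = (1, 0, \lambda),\ \lambda \in [0, 1].
\end{align*}
```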
36. Minority Game - Individual Learning
• Individual learning is a "utilitarian" solution:
• the sum of the utilities is a Lyapunov function and is maximized under the replicator dynamics.
• The result can be generalized to congestion games.
• What about social learning?
37. Minority Game - Social Learning
• No mathematical model was attempted.
• Agent-based simulations were performed instead.
• NetLogo - Java [11], [4].
38. Minority Game - Social Learning
We define
• Attendance:
$$Att(t) = \sum_{i=1}^{N} \alpha_i(t) \qquad (10)$$
• $|Att(t)| = 1$ ⇒ Nash equilibrium!
39. Minority Game - Social Learning
Figure: The algorithm
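Since the algorithm figure is not reproduced here, the sketch below is a hypothetical reconstruction from the parameters on the next slide and the social-imitation idea: agents play mixed strategies, and every few rounds a handful of randomly chosen agents imitate a better-scoring agent they meet.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical reconstruction (the original algorithm figure is not shown):
# every REVIEW_ROUNDS rounds, IMITATORS random agents copy the strategy of
# a randomly met agent whose score is higher.
N, IMITATORS, REVIEW_ROUNDS, T = 99, 3, 3, 20_000
p = rng.random(N)                        # each agent's prob. of playing +1
score = np.zeros(N)

for t in range(1, T + 1):
    actions = np.where(rng.random(N) < p, 1, -1)
    att = actions.sum()
    score += actions != np.sign(att)     # minority side scores a point
    if t % REVIEW_ROUNDS == 0:
        for i in rng.choice(N, size=IMITATORS, replace=False):
            j = rng.integers(N)
            if score[j] > score[i]:
                p[i] = p[j]              # social imitation
print(f"final attendance: {att:+d}")     # slide 41: not a Nash equilibrium
```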
40. Minority Game - Social Learning

Parameter        Value
Agents           99
Imitators        3
Review Rounds    3

Table: Parameters for the experiment using a single population of players.
41. Minority Game - Social Learning
• Not a Nash Equilibrium!
42. Minority Game - Social Learning
Figure: Histogram of strategies at the stationary state, N = 99.
43. Minority Game - Social Learning
• Strategies always come in pairs.
47. Conclusion
• Yes! Different learning algorithms do lead to different outcomes.
• Individual learning in the MG is robust and efficient; it cares about maximization of utility.
• Individual learning can lead to great differences in agent performance.
• Social learning is a more "egalitarian" solution: agents are equal in terms of score.
• Social learning does not converge to a Nash equilibrium.
• Optimization ⇒ Individual Learning.
• Social Analysis ⇒ Social Learning.
49. References
[1] W. Kets and M. Voorneveld. Congestion, Equilibrium and Learning: the Minority Game. Discussion paper, Tilburg University, 2007.
[2] M. Marsili, D. Challet, and R. Zecchina. Exact solution of a modified El Farol's bar problem: Efficiency and the role of market impact. Physica A: Statistical Mechanics and its Applications, 280:522–553, June 2000.
[3] D. Monderer and L. S. Shapley. Potential games. Games and Economic Behavior, 14(1):124–143, 1996.
[4] S. F. Railsback, S. L. Lytinen, and S. K. Jackson. Agent-based simulation platforms: Review and development recommendations. Simulation, 82(9):609–623, 2006.
[5] R. W. Rosenthal. A class of games possessing pure-strategy Nash equilibria. International Journal of Game Theory, 2(1):65–67, 1973.
[6] W. H. Sandholm. Population Games and Evolutionary Dynamics. Economic Learning and Social Evolution. MIT Press, 2010.
[7] W. H. Sandholm. Stochastic evolutionary game dynamics: Foundations, deterministic approximation, and equilibrium selection. American Mathematical Society, 201(1):1–1, 2010.
[8] Y. Shoham and K. Leyton-Brown. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, 2009.
[9] J. W. Weibull. Evolutionary Game Theory. MIT Press, 1997.
[10] D. Whitehead. The El Farol Bar Problem Revisited: Reinforcement Learning in a Potential Game. ESE Discussion Papers 186, Edinburgh School of Economics, University of Edinburgh, September 2008.
[11] U. Wilensky. NetLogo. http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, 1999.