Artificial Intelligence and Optimization with Parallelism
1. HABILITATION
Artificial Intelligence with Parallelism
Acknowledgments:
All the TAO team. People in Liège, Taiwan, LRI, Artelys, Mash, Iomca, ...
Thanks a lot to the committee.
Thanks + good recovery to Jonathan Shapiro.
Thanks to Grid5000.
Olivier Teytaud olivier.teytaud@inria.fr
2. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization
Noisy cases
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
3. AI = using computers where they
are weak / weaker than humans.
(thanks Michèle S.)
Difficult optimization (complex structure,
noisy objective functions)
Games (difficult ones)
Key difference with many operational research works:
AI = choosing a model as close as possible to reality, and solving it (very) approximately
OR = choosing the best model that you can solve almost exactly
7. Many works are about numbers.
Providing standard deviations, rates, etc.
Other goal (more ambitious ?):
switching from something which does not work
to something which works.
E.g. vision; a computer can distinguish:
9. And it's a disaster for categorizing
- children,
- women,
- pandas,
- babies,
- men,
- bears,
- trucks,
- cars.
11. And it's a disaster for categorizing children,
women, pandas, babies, men, bears, trucks, cars.
A 3-year-old child can do it.
12. ==> AI= focus on things which do not
work and (hopefully) make them work.
13. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization
Noisy cases
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
14. Evolutionary optimization is a part of A.I.
Often considered as bad, because many
EO tools are not that hard,
mathematically speaking.
I've met people using
- randomized mutations
- cross-overs
but who did not call this evolutionary or
genetic, because it would look bad.
15. Gives a lot of freedom:
- choose your operators (depending on the problem)
- choose your population size λ (depending on your computer/grid)
- choose μ (carefully), e.g. μ = min(dimension, λ/4)
==> Can work on strange domains
19. Voronoi representation:
- a family of points
- their labels
==> cross-over makes sense
==> you can optimize a shape
21. Voronoi representation:
- a family of points
- their labels
==> cross-over makes sense: a great substitute for averaging (“on the benefit of sex”)
==> you can optimize a shape
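As a sketch of how such a cross-over could look (the 2-D representation, the vertical-cut rule, and all names below are illustrative assumptions, not the exact operator from the talk):

```python
import random

def nearest_label(x, sites):
    """Label of the Voronoi cell containing point x."""
    (px, py), label = min(sites, key=lambda s: (s[0][0] - x[0]) ** 2 + (s[0][1] - x[1]) ** 2)
    return label

def crossover(parent_a, parent_b, rng=random):
    """Spatial cross-over: cells left of a random vertical cut come from A, the rest from B."""
    cut = rng.uniform(0.0, 1.0)
    child = [(p, l) for (p, l) in parent_a if p[0] < cut]
    child += [(p, l) for (p, l) in parent_b if p[0] >= cut]
    return child if child else list(parent_a)   # guard against an empty child

# a "shape" = labeled Voronoi sites on the unit square (label 1 = inside the shape)
a = [((0.2, 0.5), 1), ((0.8, 0.5), 0)]
b = [((0.5, 0.2), 0), ((0.5, 0.8), 1)]
child = crossover(a, b, random.Random(0))
```

The point of the spatial cut is that each child cell comes from a coherent region of one parent, which is why the operator behaves like a meaningful recombination of shapes rather than a blind mix.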
23. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization
Noisy cases
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
25. Parallelism.
Thank you G5K
Multi-core machines
Clusters
Grids
Sometimes parallelization completely changes
the picture.
Sometimes not.
We want to know when.
26. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization
Noisy cases                      <== Robustness, slow rates.
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
31. Derivative-free optimization of f
Why derivative-free optimization ?
Ok, it's slower.
But sometimes you have no derivative.
It's simpler (by far) ==> fewer bugs.
It's more robust (to noise, to strange functions...).
35. Optimization algorithms, from most to least structure:
==> Newton optimization
==> Quasi-Newton (BFGS)
==> Gradient descent
==> Derivative-free optimization (no gradients needed)
==> Comparison-based optimization (coming soon), just needing comparisons,
including evolutionary algorithms.
39. Comparison-based algorithms are robust
Consider
f: X --> R
We look for x* such that
for all x, f(x*) ≤ f(x)
==> what if we see g o f (g increasing) ?
==> x* is the same, but xn might change
parallel evolution 39
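The invariance can be checked directly: a comparison-based step returns the same points for f and g∘f whenever g is increasing. A minimal illustration (the generic truncation-selection helper `select_best` is an assumption of the sketch):

```python
import math, random

def select_best(points, fitness, mu):
    # a comparison-based selection step: keep the mu best points
    return sorted(points, key=fitness)[:mu]

rng = random.Random(0)
points = [rng.uniform(-2.0, 2.0) for _ in range(10)]
f = lambda x: x * x
g_of_f = lambda x: math.exp(f(x)) - 0.5   # g increasing ==> same comparisons
assert select_best(points, f, 3) == select_best(points, g_of_f, 3)
```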
40. Robustness of comparison-based algorithms: formal statement
For a comparison-based algorithm, the behavior (hence the convergence rate) does not depend on g;
and a comparison-based algorithm is optimal for the worst case over compositions g o f, g increasing.
41. Complexity bounds (N = dimension)
Runtime = nb of fitness evaluations for reaching precision ε
with probability at least ½, for all f.
Exp( - convergence ratio ) = convergence rate
Convergence ratio ~ 1 / computational cost
==> more convenient than the convergence rate for speed-ups.
42. Complexity bounds: basic technique
We want to know how many iterations we need for reaching precision ε
in an evolutionary algorithm.
Key observation: (most) evolutionary algorithms are comparison-based.
Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm.
First idea: how many different branches do we have in a run ?
We select μ points among λ.
Therefore, at most K = λ! / ( μ! (λ-μ)! ) different branches per iteration.
Second idea: how many different answers should we be able to give ?
Use packing numbers: at least N(ε) different possible answers (ε-balls needed to cover the domain).
49. Conclusion: the number n of iterations should verify
K^n ≥ N(ε)
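Plugging in numbers makes the bound concrete: with K = C(λ,μ) branches per iteration and N(ε) ≈ (1/ε)^N candidate answers in dimension N, K^n ≥ N(ε) gives n ≥ N·log(1/ε)/log K. A quick check (the packing-number estimate for the unit cube is an assumption of the sketch):

```python
import math

def min_iterations(dim, eps, lam, mu):
    # K^n >= N(eps), with K = C(lam, mu) branches per iteration
    # and N(eps) ~ (1/eps)^dim points at pairwise distance eps (unit cube)
    K = math.comb(lam, mu)
    return math.ceil(dim * math.log(1.0 / eps) / math.log(K))

# dimension 10, precision 1e-3, selecting mu=5 among lam=10
n = min_iterations(dim=10, eps=1e-3, lam=10, mu=5)   # n == 13
```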
50. Complexity bounds on the convergence ratio
(Fournier, T., 2009; using VC-dimension.)
FR: full ranking (selected points are ranked)
SB: selection-based (selected points are not ranked)
This is why I love cross-over.
Quadratic functions easier than sphere functions ?
But not for translation-invariant quadratic functions...
Covers existing results. Compliant with discrete domains.
55. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization:
1) Mathematical proof that all comparison-based algorithms can be parallelized (log speed-up)
2) Practical hint: simple tricks for some well-known algorithms
Noisy cases
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
59. Speculative parallelization with branching factor 3
Parallel version for D=2:
population = union of all the populations of the possible branches over 2 iterations.
62. Define σ*, the step-size progress in one iteration.
Necessary condition for a log(λ) speed-up:
- E log( σ* ) ~ log(λ)
But for many algorithms,
- E log( σ* ) = O(1)
==> asymptotically constant speed-up.
63. These algos do not reach the log(λ) speed-up:
(1+1)-ES with 1/5th rule
Standard CSA
Standard EMNA
Standard SA
(Teytaud, T., PPSN 2010)
64. Example 1: Estimation of Multivariate Normal Algorithm
While ( I have time )
{
Generate λ points (x1,...,xλ) distributed as N(x,σ)
Evaluate the fitness at x1,...,xλ
x = mean of the μ best points
σ = standard deviation of the μ best points
σ /= log( λ / 7 )^(1/d)
}
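A runnable sketch of this loop on the sphere function: the λ/7 correction is the one from the slide, while the population sizing (μ = λ/4), the pooled variance estimate, and the test function are assumptions of the sketch.

```python
import math, random

def emna(dim=3, lam=40, iters=60, seed=0):
    """EMNA with the log(lambda) step-size correction, on the sphere function."""
    rng = random.Random(seed)
    mu = max(1, lam // 4)                      # number of selected points (assumption)
    x, sigma = [1.0] * dim, 1.0                # initial mean and step-size
    f = lambda p: sum(c * c for c in p)        # sphere fitness
    for _ in range(iters):
        pts = [[rng.gauss(xi, sigma) for xi in x] for _ in range(lam)]
        best = sorted(pts, key=f)[:mu]
        x = [sum(p[i] for p in best) / mu for i in range(dim)]
        # empirical std of the selected points, pooled over coordinates
        var = sum((p[i] - x[i]) ** 2 for p in best for i in range(dim)) / (mu * dim)
        sigma = math.sqrt(var)
        sigma /= math.log(lam / 7.0) ** (1.0 / dim)   # the log(lambda) correction
    return f(x)

final = emna()
```

Without the last line, σ shrinks too slowly when λ grows, which is exactly why the standard EMNA misses the log(λ) speed-up.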
65. Ex 2: log(λ) correction for mutative self-adaptation
μ = min( λ/4, d )
While ( I have time )
{
Generate λ step-sizes (σ1,...,σλ) as σi = σ × exp(-k.N(0,1))
Generate λ points (x1,...,xλ) with xi distributed as N(x,σi)
Select the μ best points
Update x (= mean), update σ (= logarithmic mean of the selected σi)
}
66. Log(λ) corrections (SA, dim 3)
● In the discrete case (XPs): automatic parallelization surprisingly efficient.
● Simple trick in the continuous case:
- E log( σ* ) should be linear in log(λ)
(this provides corrections which work for SA and CSA)
68. SUMMARY of the EA part up to now:
- evolutionary algorithms are robust (with
a precise statement of this robustness)
- evolutionary algorithms are somewhat
slow (precisely quantified...)
- evolutionary algorithms are parallel (at least
“until” the dimension, for the convergence rate)
69. Now, noisy optimization.
70. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization
Noisy cases
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
71. Many works focus on fitness functions with “small” noise:
f(x) = ||x||² × (1 + Gaussian)
This is because the more realistic case
f(x) = ||x||² + Gaussian (variance > 0 at the optimum)
is too hard for publishing nice curves.
==> see however Arnold & Beyer 2006.
==> a tool: races (Heidrich-Meisner et al., ICML 2009)
- reevaluating until statistically significant differences
- ... but we must (sometimes) limit the number of
reevaluations
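A race between two noisy candidates can be sketched with a Hoeffding-style stopping rule; the bound, the cap on reevaluations, and the noise model below are assumptions of the sketch, not the exact procedure from the cited paper.

```python
import math, random

def race(noisy_f, a, b, delta=0.05, max_evals=10_000):
    """Reevaluate a and b until a Hoeffding-style test separates them (smaller mean wins)."""
    sum_a = sum_b = 0.0
    for n in range(1, max_evals + 1):
        sum_a += noisy_f(a)
        sum_b += noisy_f(b)
        radius = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
        if abs(sum_a / n - sum_b / n) > 2.0 * radius:
            return a if sum_a < sum_b else b
    return None   # statistically indistinguishable within the budget

rng = random.Random(1)
noisy = lambda x: x * x + rng.gauss(0.0, 0.1)   # variance > 0 at the optimum
winner = race(noisy, 0.2, 1.0)                  # winner == 0.2
```

The `max_evals` cap is the point made on the slide: near the optimum the means get arbitrarily close, so without a cap the race may reevaluate forever.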
73. Another difficult case: Bernoulli functions.
fitness(x) = B( f(x) )
f(0) not necessarily = 0.
I like this case (with p=2).
74. Approach: an EDA + races, based on MaxUncertainty (Coulom).
We prove good results here.
79. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization
Noisy cases
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
80. The game of Go is a part of AI.
Computers are ridiculous in front of children.
Easy situation.
Termed “semeai”.
Requires a little bit
of abstraction.
81. The game of Go is a part of AI.
Computers are ridiculous in front of children.
800 cores, 4.7 GHz, top-level program: plays a stupid move.
82. The game of Go is a part of AI.
Computers are ridiculous in front of children.
8 years old;
little training;
finds the good move
83. Introduction
What is AI ?
Why evolutionary optimization is a part of AI
Why parallelism ?
Evolutionary computation
Comparison-based optimization
Parallelization
Noisy cases
Sequential decision making
Fundamental facts
Monte-Carlo Tree Search
Conclusion
84. Monte-Carlo Tree Search
1. Games (a bit of formalism)
2. Decidability / complexity
Games with simultaneous actions 84 Paris 1st of February
85. A game is a directed graph
parallel evolution 85
86. A game is a directed graph with actions
1
2
3
parallel evolution 86
87. A game is a directed graph with actions and players
1 White
Black
2
3
White 12
43
White Black
Black
Black
Black
parallel evolution 87
88. A game is a directed graph with actions
and players and observations
Bob
Bear Bee
Bee 1 White
Black
2
3
White 12
43
White Black
Black
Black
Black
parallel evolution 88
89. A game is a directed graph with actions
and players and observations and rewards
Bob
Bear Bee
Bee 1 White
Black
2
+1
3
0
White 12
Rewards
43
White Black on leafs
Black
only!
Black
Black
parallel evolution 89
90. A game is a directed graph +actions
+players +observations +rewards +loops
Bob
Bear Bee
Bee 1 White
Black
2
+1
3
0
White 12
43
White Black
Black
Black
Black
parallel evolution 90
91. Monte-Carlo Tree Search
1. Games (a bit of formalism)
2. Decidability / complexity
92. Complexity (2 players, no random)
                                Unbounded horizon | Exponential horizon | Polynomial horizon
Full observability              EXP               | EXP                 | PSPACE
No obs (X=100%)                 undecidable       | EXPSPACE            | NEXP   (Hasslum et al., 2000)
Partially observable (X=100%)   undecidable       | 2EXP                | EXPSPACE   (Rintanen, 97)
Simult. actions                 EXPSPACE ?        | <= EXP              | <= EXP
93. Complexity question ? (UD)
Instance = a position.
Question = Is there a strategy
which wins whatever the decisions
of the opponent ?
= the natural question if full observability.
Answering this question then allows perfect play.
94. Hummm ?
Do you know a PO game in which you can
ensure a win with probability 1 ?
95. Complexity question for matrix games ?
1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1
Good for the column-player !
==> but no sure win.
==> the “UD” question is not relevant here!
96. Complexity question for phantom-games ? (Joint work with F. Teytaud.)
This is phantom-go.
Good for black: wins with proba 1-1/(8!).
Here, there's no move which ensures a win.
But some moves are much better than others!
99. Madani et al.:
1 player + random = undecidable.
We extend to two players with no random.
Problem: rewrite the random nodes, thanks to an additional player.
102. A random node to be rewritten
Rewritten as follows:
Player 1 chooses a in [[0,N-1]]
Player 2 chooses b in [[0,N-1]]
c=(a+b) modulo N
Go to tc
Each player can force the game to be equivalent to
the initial one (by playing uniformly)
==> the proba of winning for player 1 (in case of perfect play)
is the same as for the initial game
==> undecidability!
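The key fact behind the rewriting is that if either player draws his number uniformly, c = (a+b) mod N is uniform whatever the opponent does, so each player can force the rewritten node to behave like the original random node. A quick check (N and the opponent's choices are arbitrary):

```python
from collections import Counter

N = 5
# whatever fixed b the opponent chooses, a uniform a makes c = (a+b) mod N uniform
for b in range(N):
    counts = Counter((a + b) % N for a in range(N))
    assert all(counts[c] == 1 for c in range(N))
```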
103. Important remark
Existence of a strategy for winning with
proba > 0.5
==> also undecidable for the
restriction to games in which the proba
is >0.6 or <0.4
==> not just a subtle
precision trouble.
116. ... or exploration ?
SCORE = 0/2 + k.sqrt( log(10)/2 )
(UCB: empirical mean + k·sqrt( log(total nb of sims) / nb of sims of this move ))
Binary win/loss games: no explo!
(Berthier, D., T., 2010)
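The score above is the standard UCB formula; a minimal version (k is the exploration constant):

```python
import math

def ucb_score(wins, sims, total_sims, k):
    # empirical mean plus an exploration bonus that shrinks with the visit count
    return wins / sims + k * math.sqrt(math.log(total_sims) / sims)

s = ucb_score(wins=0, sims=2, total_sims=10, k=1.0)   # the slide's 0/2 + k.sqrt(log(10)/2)
```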
117. Games vs pros in the game of Go:
First win in 9x9
First win over 5 games in 9x9 blind Go
First win with H2.5 (handicap) in 13x13 Go
First win with H6 in 19x19 Go
First win with H7 in 19x19 Go vs a top pro
118. ... or exploration ?
SCORE =
0/2
+ k.sqrt( log(10)/2 )
Simultaneous actions:
replace it with
EXP3 / INF
119. MCTS for simultaneous actions
Player 1 plays, then Player 2 plays, then both players play,
then Player 1 plays, then Player 2 plays, ...
120. MCTS for simultaneous actions
Player 1 plays ==> maxUCB node
Player 2 plays ==> minUCB node
Both players play ==> EXP3 node
Player 1 plays ==> maxUCB node
Player 2 plays ==> minUCB node
...
121. MCTS for hidden information
Player 1:
  observation set 1 ==> EXP3 node
  observation set 2 ==> EXP3 node
  observation set 3 ==> EXP3 node
Player 2:
  observation set 1 ==> EXP3 node
  observation set 2 ==> EXP3 node
  observation set 3 ==> EXP3 node
122. (Thanks Martin.)
(Incrementally + application to phantom-tic-tac-toe: see D. Auger 2010.)
123. EXP3 in one slide
Grigoriadis et al.; Auer et al.; Audibert & Bubeck, COLT 2009.
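EXP3 does fit in a few lines. A standard version following Auer et al.'s formulation with mixing parameter γ (the γ value, the class shape, and the toy 3-armed bandit below are assumptions):

```python
import math, random

class Exp3:
    """EXP3 with mixing parameter gamma (rewards must lie in [0, 1])."""
    def __init__(self, n_arms, gamma=0.1, rng=random):
        self.w, self.gamma, self.rng = [1.0] * n_arms, gamma, rng

    def probs(self):
        total, n = sum(self.w), len(self.w)
        return [(1 - self.gamma) * wi / total + self.gamma / n for wi in self.w]

    def draw(self):
        return self.rng.choices(range(len(self.w)), weights=self.probs())[0]

    def update(self, arm, reward):
        estimate = reward / self.probs()[arm]   # importance-weighted reward
        self.w[arm] *= math.exp(self.gamma * estimate / len(self.w))

rng = random.Random(0)
bandit = Exp3(n_arms=3, rng=rng)
for _ in range(2000):
    arm = bandit.draw()
    reward = 1.0 if rng.random() < (0.9 if arm == 2 else 0.2) else 0.0
    bandit.update(arm, reward)
best = max(range(3), key=lambda i: bandit.probs()[i])   # best == 2
```

The forced γ/n exploration is what makes the importance-weighted estimates bounded, and hence the regret guarantee holds even against an adversary.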
124. Monte-Carlo Tree Search
Application to Urban Rivals (simultaneous actions)
125. Let's have fun with Urban Rivals (4 cards)
Each player has
- four cards (each one can be used once)
- 12 pilz (each one can be used once)
- 12 life points
Each card has:
- one attack level
- one damage
- special effects (forget that...)
Four turns:
P1 attacks P2, P2 attacks P1,
P1 attacks P2, P2 attacks P1.
126. Let's have fun with Urban Rivals
First, the attacker plays:
- chooses a card
- chooses ( PRIVATELY ) a number of pilz
Attack level = attack(card) x (1 + nb of pilz)
Then, the defender plays:
- chooses a card
- chooses a number of pilz
Defense level = attack(card) x (1 + nb of pilz)
Result:
If attack > defense
  Defender loses Power(attacker's card) life points
Else
  Attacker loses Power(defender's card) life points
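The resolution rule above is easy to make concrete (the card attributes and values are made up for the illustration):

```python
def resolve_round(att_card, att_pilz, def_card, def_pilz):
    """Return (life lost by attacker, life lost by defender) for one attack."""
    attack = att_card["attack"] * (1 + att_pilz)
    defense = def_card["attack"] * (1 + def_pilz)
    if attack > defense:
        return 0, att_card["power"]     # defender loses Power(attacker's card)
    return def_card["power"], 0         # attacker loses Power(defender's card)

# hypothetical cards, for illustration only
a = {"attack": 6, "power": 4}
d = {"attack": 5, "power": 3}
assert resolve_round(a, 3, d, 4) == (3, 0)   # 6*4 = 24 <= 5*5 = 25: attacker loses 3
assert resolve_round(a, 4, d, 4) == (0, 4)   # 6*5 = 30 >  25: defender loses 4
```

Since the attacker's pilz count is private, the defender faces a simultaneous-move decision: this is exactly why the EXP3-style nodes of the previous slides apply here.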
127. Let's have fun with Urban Rivals
==> The MCTS-based AI is now at the best human level.
Experimental (only) remarks on EXP3:
- discarding strategies with a small number of sims = better approx of the Nash
- also an improvement by taking into account the other bandit
- virtual simulations (inspired by Kummer)
128. When is MCTS relevant ?
Robust in front of:
High dimension;
Non-convexity of Bellman values;
Complex models;
Delayed reward;
Simultaneous actions, partial information.
More difficult for:
High values of H (the horizon);
Model-free settings;
Highly unobservable cases (Monte-Carlo, but not Monte-Carlo Tree
Search, see Cazenave et al.);
Lack of a reasonable baseline for the MC.
129. (References: T., Dagstuhl 2010; D. Auger, EvoStar 2011;
unpublished results on undecidability; some endgames.)
130. Conclusion
Evo. opt.: robustness, tight bounds, simple
algorithmic modifications for better speed-ups (SA, 1/5th rule, (CSA)).
MCTS: just great (but requires a model); UCB
not necessary; extension to hidden info (rmk:
undecidability); PO endgames; but no abstraction power.
Noisy optimization: consider high noise. Use
QR and learning (in all EAs in fact).
Not mentioned here: multimodal, multiobjective, GP, bandits.
131. Future ?
- Solving semeais ? Would involve great AI progress I think...
- Noisy optimization; there are still things to be done.
==> Promoting high noise fitness functions even if it is less
publication-efficient.
- ``Inheritance'' of belief state in partially observable games.
Big progress to be done. Crucial for applications.
- Sparse bandits / mixed stochastic/adversarial cases.
Thanks for your attention.
Thanks to all collaborators for all I've learnt with them.
133. MCTS with hidden information: incremental version
While (there is time for thinking)
{
    s = initial state
    os(1) = ()    os(2) = ()
    while (s not terminal)
    {
        p = player(s)
        b = Exp3Bandit(os(p))
        d = b.makeDecision()
        (s,o) = transition(s,d)
        os(p) = os(p) + (o)    // append the new observation to p's history
    }
    send the reward to all bandits in the simulation
}
140. Possibly refine the family of bandits.
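The loop above can be run on a toy game. Below, matching pennies (one step, simultaneous moves), so each player keeps a single EXP3 bandit; the game, the EXP3 variant, and all parameters are assumptions of the sketch.

```python
import math, random

class Exp3:
    """A standard EXP3 bandit (rewards in [0, 1]); one per (player, observation set)."""
    def __init__(self, n_arms, gamma=0.05, rng=random):
        self.w, self.gamma, self.rng = [1.0] * n_arms, gamma, rng
    def probs(self):
        total, n = sum(self.w), len(self.w)
        return [(1 - self.gamma) * wi / total + self.gamma / n for wi in self.w]
    def draw(self):
        return self.rng.choices(range(len(self.w)), weights=self.probs())[0]
    def update(self, arm, reward):
        self.w[arm] *= math.exp(self.gamma * (reward / self.probs()[arm]) / len(self.w))

rng = random.Random(0)
bandits = {1: Exp3(2, rng=rng), 2: Exp3(2, rng=rng)}   # one observation set per player
avg, T = [0.0, 0.0], 5000
for _ in range(T):                        # "while there is time for thinking"
    d1, d2 = bandits[1].draw(), bandits[2].draw()
    r1 = 1.0 if d1 == d2 else 0.0         # player 1 wins iff the pennies match
    bandits[1].update(d1, r1)
    bandits[2].update(d2, 1.0 - r1)       # zero-sum: send the reward to all bandits
    p = bandits[1].probs()
    avg = [avg[0] + p[0] / T, avg[1] + p[1] / T]
# the average strategy drifts toward the uniform Nash equilibrium (1/2, 1/2)
```

The instantaneous strategies cycle, but the time-averaged play of two no-regret learners in a zero-sum game approaches the Nash equilibrium, which is exactly why EXP3 nodes are the right tool for the simultaneous-action and hidden-information cases above.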