HABILITATION




             Artificial intelligence
                   with Parallelism
    Acknowledgments:
    All the TAO team. People in Liège, Taiwan, LRI, Artelys, Mash, Iomca, ...
    Thanks a lot to the committee.
    Thanks + good recovery to Jonathan Shapiro.
    Thanks to Grid5000.


Olivier Teytaud     olivier.teytaud@inria.fr
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization
   Noisy cases


Sequential decision making
   Fundamental facts
   Monte-Carlo Tree Search


Conclusion
AI = using computers where they
    are weak / weaker than humans.
                                             (thanks Michèle S.)

Difficult optimization (complex structure,
                     noisy objective functions)
Games (difficult ones)

Key difference with many operational research works:
AI = choosing a model as close as possible to reality and
         solving it (very) approximately
OR = choosing the best model that you can solve almost exactly
Many works are about numbers.
Providing standard deviations, rates, etc.


Other goal (more ambitious ?):
  switching from something which does not work
   to something which works.


E.g. vision; a computer can distinguish:
But it can't distinguish so easily:
And it's a disaster for categorizing children,
women, pandas, babies, men, bears, trucks, cars.




                                                 3 years old;
                                                 she can do it.
==> AI = focus on things which do not
    work and (hopefully) make them work.
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization
   Noisy cases


Sequential decision making
   Fundamental facts
   Monte-Carlo Tree Search


Conclusion
Evolutionary optimization is a part of A.I.

Often considered as bad, because many
 EO tools are not that hard,
 mathematically speaking.

I've met people using
 - randomized mutations
 - cross-overs
but who did not call this evolutionary or
  genetic, because the label is frowned upon.
Gives a lot of freedom:
 - choose your operators (depending on the problem)
 - choose your population size λ (depending on your
                       computer/grid)

 - choose μ (carefully), e.g. μ = min(dimension, λ/4)




==> Can work on strange domains
Voronoi representation of a shape:        (thanks Marc S.)
     - a family of points
     - their labels
==> cross-over makes sense
==> you can optimize a shape



                             Great substitute for
                                 averaging.
                                “on the benefit of sex”
Cantilever optimization:




                      Hamda et al, 2000
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization
   Noisy cases


Sequential decision making
   Fundamental facts
   Monte-Carlo Tree Search


Conclusion
Parallelism.
                             Thank you G5K

Multi-core machines
Clusters
Grids

Sometimes parallelization completely changes
the picture.
Sometimes not.
We want to know when.
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization
   Noisy cases           Robustness, slow rates.


Sequential decision making
   Fundamental facts
   Monte-Carlo Tree Search


Conclusion
Derivative-free optimization of f




                               No gradient !
                     Only depends on the x's and f(x)'s

      Why derivative-free optimization ?
       Ok, it's slower
       But sometimes you have no derivative
       It's simpler (by far) ==> fewer bugs
       It's more robust (to noise, to strange functions...)

          Optimization algorithms
       ==> Newton
       ==> Quasi-Newton (BFGS)
       ==> Gradient descent
       ==> ...

        Derivative-free optimization
           (don't need gradients)

              Comparison-based optimization
                      (coming soon),
                  just needing comparisons,
               including evolutionary algorithms
Comparison-based optimization


                                          yi = f(xi)




        Opt is comparison-based if its output depends
        on the yi's only through their comparisons
        (the signs of the differences)
                                 parallel evolution   36
Population-based comparison-based algorithms




  X(1)=( x(1,1),x(1,2),...,x(1,λ) ) = Opt()
  X(2)=( x(2,1),x(2,2),...,x(2,λ) ) = Opt(X(1),
                                      signs of diff)
           …            …             ...
  X(n)=( x(n,1),x(n,2),...,x(n,λ) ) = Opt(X(n-1),
                                      signs of diff)
P-based c-based algorithms w/ internal state




( X(1)=( x(1,1),x(1,2),...,x(1,λ) ),I(1) ) = Opt()
( X(2)=( x(2,1),x(2,2),...,x(2,λ) ),I(2) ) = Opt(X(1),I(1),
                                     signs of diff)
          …          …               ...
( X(n)=( x(n,1),x(n,2),...,x(n,λ) ),I(n) ) = Opt(X(n-1),I(n-1),
                                     signs of diff)
Comparison-based algorithms are robust



 Consider
               f: X --> R
 We look for x* such that
                   ∀x, f(x*) ≤ f(x)
 ==> what if we see g o f (g increasing) ?
 ==> x* is the same, but xn might change




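This robustness can be checked directly on a toy example. Below is a minimal sketch of a comparison-based search (a hypothetical (1+1)-style rule, not one of the algorithms analyzed later): since f enters only through a comparison, running it on f and on g o f with the same random seed yields identical iterates. The ×4 transformation is chosen so that even floating-point comparisons match exactly; any increasing g works mathematically.

```python
import random

def comparison_based_search(f, x0, sigma=1.0, steps=200, seed=0):
    """Hypothetical (1+1)-style search: f is used only through
    the comparison f(y) < f(x)."""
    rng = random.Random(seed)
    x = x0
    trace = [x]
    for _ in range(steps):
        y = x + sigma * rng.gauss(0.0, 1.0)
        if f(y) < f(x):      # the only access to f: a comparison
            x = y
            sigma *= 1.5     # success ==> enlarge the step
        else:
            sigma *= 0.9     # failure ==> shrink the step
        trace.append(x)
    return trace

f = lambda x: (x - 3.0) ** 2
g_of_f = lambda x: 4.0 * f(x)   # g(t) = 4t is increasing

# Same seed, same comparisons ==> exactly the same iterates.
assert comparison_based_search(f, 10.0) == comparison_based_search(g_of_f, 10.0)
```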
Robustness of comparison-based algorithms: formal statement


   [formula: distribution of the iterates on g o f]


    this does not depend on g for a
         comparison-based algorithm
    a comparison-based algorithm is optimal
     for the worst case over the increasing
     transformations g of the fitness function
Complexity bounds (N = dimension)




         n(ε) = nb of fitness evaluations for precision ε
             with probability at least ½ for all f


   Exp ( - Convergence ratio ) = Convergence rate


   Convergence ratio ~ 1 / computational cost
      ==> more convenient than conv. rate for speed-ups
Complexity bounds: basic technique
 We want to know how many iterations we need for reaching precision ε
     in an evolutionary algorithm.



 Key observation: (most) evolutionary algorithms are comparison-based



  Let's consider (for simplicity) a deterministic selection-based non-elitist
    algorithm



  First idea: how many different branches do we have in a run ?
        We select μ points among λ
        Therefore, at most K = λ! / ( μ! ( λ - μ )!) different branches per iteration



  Second idea: how many different answers should we be able to give ?
        Use packing numbers: at least N(ε) different possible answers




 Conclusion: the number n of iterations should verify
                                       K^n ≥ N( ε )
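The conclusion K^n ≥ N(ε) can be made concrete. A small sketch, assuming (for illustration only) a domain like the unit cube with packing number of order (1/ε)^dim; the function name and parameter values are mine, not from the talk:

```python
import math

def iteration_lower_bound(dim, eps, lam, mu):
    """K^n >= N(eps)  =>  n >= log N(eps) / log K, with
    K = C(lam, mu) branches per iteration and N(eps) of order
    (1/eps)**dim (illustrative packing-number assumption)."""
    K = math.comb(lam, mu)
    log_N_eps = dim * math.log(1.0 / eps)   # log of (1/eps)**dim
    return log_N_eps / math.log(K)

# Dimension 10, precision 1e-3, lambda = 20, mu = 5:
n = iteration_lower_bound(10, 1e-3, 20, 5)
```

Since K = C(λ,μ) ≤ 2^λ, log K grows at most linearly in λ, which is one way to see why the achievable speed-up as a function of λ is limited.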
Complexity bounds on the convergence ratio

          Quadratic functions easier
           than sphere functions ?
       But not for translation invariant
             quadratic functions...

                       This is why I love
                          cross-over.

   Fournier, T., 2009;
     using VC-dim.

    FR: full ranking (selected points are ranked)
    SB: selection-based (selected points are not ranked)

    Covers existing results.
    Compliant with discrete domains.
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization               1) Mathematical proof that all
   Noisy cases                      comparison-based algorithms
                                    can be parallelized
                                      (log speed-up)
Sequential decision making
   Fundamental facts              2) Practical hint: simple tricks
   Monte-Carlo Tree Search           for some well-known algorithms

Conclusion
Speculative parallelization with branching factor 3




           Consider the sequential algorithm.
           (iterations 1, 2, 3)




  Parallel version for D=2.
  Population = union of all pops for 2 iterations.
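A back-of-the-envelope sketch of the cost: with branching factor K per comparison (K = 3 on these slides) and depth D, the speculative batch evaluates the union of the populations along every branch of depth < D, i.e. 1 + K + ... + K^(D-1) populations. The function and the uniform population size lam are illustrative assumptions:

```python
def speculative_cost(K, D, lam):
    """Evaluations needed to run D sequential iterations of a
    comparison-based algorithm as one parallel batch: one
    population of size lam per branch of depth < D."""
    n_pops = sum(K ** d for d in range(D))   # 1 + K + ... + K**(D-1)
    return n_pops * lam

# Branching factor 3, two iterations ahead, population size 10:
cost = speculative_cost(3, 2, 10)   # 4 populations, 40 evaluations
```

Inverting the count, P processors buy about log_K(P) iterations per batch: the logarithmic speed-up of the automatic parallelization.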
Automatic parallelization




                            Teytaud, T, PPSN 2010
Define σ*:   [formula on slide]

Necessary condition for log(λ) speed-up:
 - E log( σ* ) ~ log(λ)

But for many algorithms,
 - E log( σ* ) = O(1)
       ==> asymptotically constant speed-up
These algos do not reach the log(λ) speed-up:


         (1+1)-ES with 1/5th rule
              Standard CSA
             Standard EMNA
               Standard SA.



                       Teytaud, T, PPSN 2010
Example 1: Estimation of Multivariate Normal Algorithm


    While ( I have time )
     {
         Generate λ points (x1,...,xλ) distributed as N(x,σ)
         Evaluate the fitness at x1,...,xλ
         x = mean of the μ best points
         σ = standard deviation of the μ best points
         σ /= log( λ / 7 )^(1 / d)
     }
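The loop above, written out as a sketch. The isotropic σ, the sphere test function and the parameter values are illustrative choices (λ > 7 is required by the correction term), not the exact variant benchmarked in the talk:

```python
import math, random

def emna(f, x0, dim, lam=100, mu=25, sigma=1.0, iters=50, seed=0):
    """EMNA sketch with the log(lambda) step-size correction
    from the slide: sigma /= log(lam/7)**(1/dim)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        pop = [[xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
               for _ in range(lam)]
        pop.sort(key=f)                      # comparison-based selection
        best = pop[:mu]
        x = [sum(p[i] for p in best) / mu for i in range(dim)]
        # standard deviation of the mu best points (pooled coordinates)
        var = sum((p[i] - x[i]) ** 2 for p in best for i in range(dim))
        sigma = math.sqrt(var / (mu * dim))
        sigma /= math.log(lam / 7) ** (1 / dim)   # the correction
    return x

sphere = lambda p: sum(t * t for t in p)
opt = emna(sphere, [1.0, 1.0, 1.0], dim=3)
```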
Ex 2: Log(lambda) correction for mutative self-adaptation



       μ = min( λ/4, d )
      While ( I have time )
      {
           Generate λ step-sizes (σ1,...,σλ) as σ x exp(- k.N)
           Generate λ points (x1,...,xλ), with xi distributed as N(x,σi)
           Select the μ best points
           Update x (=mean), update σ (=log. mean)
      }
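A matching sketch for the mutative self-adaptation loop; N stands for a standard Gaussian, and the parameter values and test function are again illustrative assumptions:

```python
import math, random

def mutative_sa(f, x0, dim, lam=40, k=0.2, sigma=1.0, iters=60, seed=1):
    """Mutative self-adaptation sketch: mu = min(lam/4, dim),
    per-offspring step-sizes sigma_i = sigma * exp(-k * N(0,1)),
    sigma updated as the log-mean of the selected sigma_i."""
    rng = random.Random(seed)
    mu = max(1, min(lam // 4, dim))
    x = list(x0)
    for _ in range(iters):
        offspring = []
        for _ in range(lam):
            s = sigma * math.exp(-k * rng.gauss(0.0, 1.0))
            y = [xi + s * rng.gauss(0.0, 1.0) for xi in x]
            offspring.append((f(y), s, y))
        offspring.sort(key=lambda t: t[0])   # select the mu best
        best = offspring[:mu]
        x = [sum(y[i] for _, _, y in best) / mu for i in range(dim)]
        sigma = math.exp(sum(math.log(s) for _, s, _ in best) / mu)
    return x

sphere = lambda p: sum(t * t for t in p)
```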
Log() corrections (SA, dim 3)



  ●   In the discrete case (XPs): automatic
            parallelization surprisingly efficient.

  ●   Simple trick in the continuous case
         - E log( *) should be linear in log()

      (this provides corrections which
         work for SA and CSA)

                                         parallel evolution   66
Log() corrections



  ●   In the discrete case (XPs): automatic
            parallelization surprisingly efficient.

  ●   Simple trick in the continuous case
         - E log( *) should be linear in log()

      (this provides corrections which
         work for SA and CSA)

                                         parallel evolution   67
SUMMARY of the EA part up to now:
 - evolutionary algorithms are robust (with
     a precise statement of this robustness)
 - evolutionary algorithms are somehow
         slow (precisely quantified...)
 - evolutionary algorithms are parallel (at least
     up to λ ~ dimension, for the conv. rate)


                              Now, noisy optimization
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization
   Noisy cases


Sequential decision making
   Fundamental facts
   Monte-Carlo Tree Search


Conclusion
Many works focus on fitness functions with “small” noise:
         f(x) = ||x||² x (1 + Gaussian)


This is because the more realistic case
         f(x) = ||x||² + Gaussian (variance > 0 at the optimum)
is too hard for publishing nice curves.


==> see however Arnold & Beyer 2006.
==> a tool: races     (Heidrich-Meisner et al, ICML 2009)
  - reevaluating until statistically significant differences
  - … but we must (sometimes) limit the number of
      reevaluations
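The race idea can be sketched with a Hoeffding-style stopping rule. The sample_a/sample_b interface, the confidence schedule and the assumption that fitness values lie in [0,1] are mine, for illustration; this is not the exact procedure of Heidrich-Meisner et al.:

```python
import math, random

def race(sample_a, sample_b, delta=0.05, max_evals=10000):
    """Reevaluate two noisy candidates until a Hoeffding-style
    confidence radius separates their empirical means, or a
    reevaluation budget is exhausted (the limit mentioned above)."""
    sa = sb = 0.0
    n = 0
    while n < max_evals:
        sa += sample_a()
        sb += sample_b()
        n += 1
        # anytime-valid radius for values in [0, 1]
        radius = math.sqrt(math.log(2 * n * (n + 1) / delta) / (2 * n))
        if abs(sa / n - sb / n) > 2 * radius:
            return ('a' if sa < sb else 'b'), n   # lower mean wins
    return None, n   # no statistically significant difference

rng = random.Random(0)
winner, n = race(lambda: rng.random() * 0.8,         # mean 0.4
                 lambda: 0.2 + rng.random() * 0.8)   # mean 0.6
```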
Another difficult case: Bernoulli functions.

        fitness(x) = B( f(x) )
         f(0) not necessarily = 0.

      EDA + races                      Based on
                                     MaxUncertainty
                                       (Coulom)

                      We prove good
                           results here.




        I like this case
          with p=2
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization
   Noisy cases


Sequential decision making
   Fundamental facts
   Monte-Carlo Tree Search


Conclusion
The game of Go is a part of AI.
Computers are ridiculous in front of children.


                                              Easy situation.
                                            Termed “semeai”.
                                            Requires a little bit
                                              of abstraction.
The game of Go is a part of AI.
Computers are ridiculous in front of children.
                                             800 cores, 4.7
                                                 GHz,
                                           top level program.


                                             Plays a stupid
                                                 move.
The game of Go is a part of AI.
Computers are ridiculous in front of children.


                                             8 years old;
                                             little training;
                                         finds the good move
Introduction
   What is AI ?
   Why evolutionary optimization is a part of AI
   Why parallelism ?


Evolutionary computation
   Comparison-based optimization
   Parallelization
   Noisy cases


Sequential decision making
   Fundamental facts
   Monte-Carlo Tree Search


Conclusion
Monte-Carlo Tree Search




1. Games (a bit of formalism)

2. Decidability / complexity




 Games with simultaneous actions   84   Paris 1st of February
A game is a directed graph

              [diagram]

A game is a directed graph with actions

              [diagram: edges labeled 1, 2, 3]

A game is a directed graph with actions and players

              [diagram: Black and White decision nodes]

A game is a directed graph with actions
and players and observations

              [diagram: Bob's observations “Bear” / “Bee”]

A game is a directed graph with actions
and players and observations and rewards

              [diagram]                     Rewards (0 / +1)
                                            on leafs only!

A game is a directed graph +actions
+players +observations +rewards +loops

              [diagram]
Monte-Carlo Tree Search




1. Games (a bit of formalism)

2. Decidability / complexity




Complexity (2P, no random)

                       Unbounded              Exponential    Polynomial
                         horizon                horizon        horizon

Full Observability     EXP                    EXP            PSPACE

No obs (X=100%)        EXPSPACE               NEXP
                       (Hasslum et al, 2000)

Partially Observable   2EXP                   EXPSPACE
(X=100%)               (Rintanen, 97)

Simult. Actions        ? EXPSPACE ?           ≤ EXP          ≤ EXP
No obs / PO            undecidable
Complexity question ?                            (UD)

              Instance = position.
        Question = Is there a strategy
                        which wins whatever
                         the decisions
                          of the opponent ?
 = natural question if full observability.
 Answering this question then allows perfect play.
Hummm ?




 Do you know a PO game in which you can
 ensure a win with probability 1 ?
Complexity question for matrix
 game ?

 1 0 0 0 0 0
 0 1 0 0 0 0
 0 0 1 0 0 0
 0 0 0 1 0 0
 0 0 0 0 1 0
 0 0 0 0 0 1

                 Good for column-player !
                 ==> but no sure win.
                 ==> the “UD” question is not
                     relevant here!
Complexity question for        Joint work with
 phantom-games ?                 F. Teytaud


                     This is phantom-go.
                     Good for black: wins
                     with proba 1-1/(8!)
                     Here,
                     there's no move
                     which ensures a win.
                     But some moves are
                     much better than
                     others!
Another formalization


              [figure]


  ==> much more satisfactory
Madani et al.


              [figure]


  1 player + random = undecidable.
Madani et al.


1 player + random = undecidable.


We extend to two players with no random.
Problem: rewrite random nodes, thanks to
additional player.
A random node to be rewritten

Rewritten as follows:
Player 1 chooses a in [[0,N-1]]
Player 2 chooses b in [[0,N-1]]
c = (a+b) modulo N
Go to t_c
Each player can force the game to be equivalent to
the initial one (by playing uniformly)
==> the proba of winning for player 1 (in case of perfect play)
   is the same as for the initial game
==> undecidability!
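The argument can be sanity-checked numerically. A sketch (the function and strategy interface are hypothetical): if one player plays uniformly, c = (a+b) mod N is uniformly distributed whatever the other player does, so the rewritten node has the same law as the original random node:

```python
from collections import Counter
import random

def rewritten_chance_node(N, a_strategy, trials=20000, seed=0):
    """The rewriting above: player 1 picks a, player 2 picks b,
    the successor is c = (a+b) mod N. Player 2 plays uniformly,
    so c is uniform regardless of a_strategy."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        a = a_strategy(rng)
        b = rng.randrange(N)      # player 2 plays uniformly
        counts[(a + b) % N] += 1
    return counts

# Even a constant (worst-case) player 1 cannot bias c:
counts = rewritten_chance_node(5, lambda rng: 3)
```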
Important remark



Existence of a strategy for winning with
proba > 0.5
==> also undecidable for the
  restriction to games in which the proba
  is > 0.6 or < 0.4
==> not just a subtle
  precision issue.
Monte-Carlo Tree Search




 MCTS principle


 But with
  EXP3 in nodes for
  hidden information.
UCT (Upper Confidence Trees)




Coulom (06)
Chaslot, Saito & Bouzy (06)
Kocsis Szepesvari (06)
UCT
      Kocsis & Szepesvari (06)
Exploitation ...
Exploitation ...
            SCORE =
                5/7
             + k.sqrt( log(10)/7 )
Exploitation ...
            SCORE =
                5/7
             + k.sqrt( log(10)/7 )
Exploitation ...
            SCORE =
                5/7
             + k.sqrt( log(10)/7 )
... or exploration ?
              SCORE =
                  0/2
               + k.sqrt( log(10)/2 )




            Binary win/loss
           games: no exploration term needed!
         (Berthier, D., T., 2010)
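The score above can be sketched as follows (k is the exploration constant; per the result just cited, k = 0 suffices in binary win/loss games):

```python
import math

def ucb_score(wins, visits, parent_visits, k):
    """Score of a child in UCT: mean reward plus exploration bonus."""
    return wins / visits + k * math.sqrt(math.log(parent_visits) / visits)

# The slides' example, with 10 simulations at the parent:
good    = ucb_score(5, 7, 10, k=1.0)   # 5/7 + k.sqrt( log(10)/7 )
untried = ucb_score(0, 2, 10, k=1.0)   # 0/2 + k.sqrt( log(10)/2 )

# A small k favours exploitation, a large k forces exploration:
assert ucb_score(5, 7, 10, k=0.4) > ucb_score(0, 2, 10, k=0.4)
assert ucb_score(5, 7, 10, k=3.0) < ucb_score(0, 2, 10, k=3.0)
```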
Games vs pros
        in the game of Go
First win in 9x9

First win over 5 games in 9x9 blind Go

First win with handicap 2.5 (H2.5) in 13x13 Go

First win with handicap 6 (H6) in 19x19 Go

First win with handicap 7 (H7) in 19x19 Go vs a top pro
... or exploration ?
                    SCORE =
                        0/2
                     + k.sqrt( log(10)/2 )




        Simultaneous actions:
            replace UCB with
             EXP3 / INF
MCTS for simultaneous actions


                 Player 1 plays




      Player 2 plays              Both players
                                        play




...                                            Player 1 plays
                       Player 2 plays
MCTS for simultaneous actions

                  Player 1 plays
                  = maxUCB node

      Player 2 plays
      = minUCB node

                               Both players play
                               = EXP3 node

...                    Player 2 plays            Player 1 plays
                       = minUCB node             = maxUCB node
MCTS for hidden information
Player 1

              Observation set 1: EXP3 node
              Observation set 2: EXP3 node
              Observation set 3: EXP3 node

   Player 2

              Observation set 1: EXP3 node
              Observation set 2: EXP3 node
              Observation set 3: EXP3 node
MCTS for hidden information
(same tree: one EXP3 node per observation set, for each player)
                                                             Thanks Martin

(incrementally + application to phantom-tic-tac-toe: see D. Auger 2010)
EXP3 in one slide




Grigoriadis et al, Auer et al, Audibert & Bubeck Colt 2009
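A minimal sketch of the EXP3 strategy referenced above (the gamma mixing constant and the per-step renormalization are illustrative choices; the cited papers give the exact parameterization):

```python
import math
import random

def exp3(n_arms, reward, n_steps, gamma=0.1):
    """EXP3 (exponential weights for exploration and exploitation).

    `reward(arm)` must return a value in [0, 1]; `gamma` mixes in
    uniform exploration. Returns the final mixed strategy.
    """
    weights = [1.0] * n_arms
    for _ in range(n_steps):
        total = sum(weights)
        probs = [(1 - gamma) * w / total + gamma / n_arms for w in weights]
        arm = random.choices(range(n_arms), weights=probs)[0]
        r = reward(arm)
        # Importance-weighted update: only the pulled arm's weight changes.
        weights[arm] *= math.exp(gamma * r / (probs[arm] * n_arms))
        m = max(weights)
        weights = [w / m for w in weights]   # renormalize to avoid overflow
    total = sum(weights)
    return [w / total for w in weights]

# Toy check: arm 1 pays off far more often than arm 0.
random.seed(0)
strategy = exp3(2, lambda a: float(random.random() < (0.9 if a == 1 else 0.1)),
                n_steps=2000)
```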
Monte-Carlo Tree Search




Application to Urban Rivals ==>


(simultaneous actions)




 Games with simultaneous actions   124   Paris 1st of February
Let's have fun with Urban Rivals (4 cards)
 Each player has
   - four cards (each one can be used once)
   - 12 pilz (each one can be used once)
   - 12 life points
 Each card has:
   - one attack level
   - one damage
   - special effects (forget that...)
 Four turns:
 P1 attacks P2, P2 attacks P1,
 P1 attacks P2, P2 attacks P1.


                                              parallel evolution   125
Let's have fun with Urban Rivals
First, attacker plays:
- chooses a card
- chooses ( PRIVATELY ) a number of pilz
 Attack level = attack(card) x (1+nb of pilz)

Then, defender plays:
 - chooses a card
 - chooses a number of pilz
 Defense level = attack(card) x (1+nb of pilz)


Result:
 If attack > defense
    Defender loses Power(attacker's card)
 Else
    Attacker loses Power(defender's card)
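The resolution rule above can be sketched as follows (the card values are made up for the example, and the slides leave the tie case attack == defense unspecified, so here it arbitrarily goes to the defender):

```python
def resolve_attack(attacker_card, attacker_pilz, defender_card, defender_pilz):
    """One attack turn. Cards are (attack, power) pairs; pilz multiply
    the level: level = attack(card) x (1 + nb of pilz).

    Returns (loser, damage). Tie handling is an assumption: the slides
    only state the strict attack > defense case.
    """
    att_attack, att_power = attacker_card
    def_attack, def_power = defender_card
    attack = att_attack * (1 + attacker_pilz)
    defense = def_attack * (1 + defender_pilz)
    if attack > defense:
        return 'defender', att_power   # defender loses Power(attacker's card)
    return 'attacker', def_power       # attacker loses Power(defender's card)

# Card (attack=8, power=5) with 3 pilz vs card (attack=8, power=4) with 2 pilz:
loser, damage = resolve_attack((8, 5), 3, (8, 4), 2)   # levels: 32 vs 24
```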
Let's have fun with Urban Rivals

 ==> The MCTS-based AI is now at the best human level.


 Experimental (only) remarks on EXP3:
 - discarding strategies with a small number of simulations
   gives a better approximation of the Nash equilibrium
 - also an improvement by taking
  into account the other bandit
 - virtual simulations (inspired
   by Kummer)
When is MCTS relevant ?

 Robust in front of:
High dimension;
Non-convexity of Bellman values;
Complex models;
Delayed reward;
Simultaneous actions, partial information.
 More difficult for:
High values of the horizon H;
Model-free settings;
Highly unobservable cases (Monte-Carlo, but not Monte-Carlo Tree
Search, see Cazenave et al.);
Lack of a reasonable baseline for the MC simulations.
When is MCTS relevant ?
(T., Dagstuhl 2010, EvoStar 2011; D. Auger, EvoStar 2011;
unpublished results on endgames; some undecidability results)

 Robust in front of:
High dimension;
Non-convexity of Bellman values;
Complex models;
Delayed reward;
Simultaneous actions.
 More difficult for:
High values of H;
Model-free;
Highly unobservable cases (Monte-Carlo, but not Monte-Carlo Tree
Search, see Cazenave et al.);
Lack of reasonable baseline for the MC.
Conclusion
Evo. Opt.: robustness, tight bounds, simple
 algorithmic modifications for better speed-up (SA, 1/5th rule,
 (CSA))

MCTS just great (but requires a model); UCB
 not necessary; extension to hidden information (remark:
 undecidability); partially observable endgames; but no abstraction
 power.

Noisy optimization: consider high noise. Use
 QR and learning (in all EAs in fact).
Not mentioned here: multimodal, multiobjective, GP, bandits.
Future ?
 - Solving semeais ? Would involve great AI progress, I think...
 - Noisy optimization; there are still things to be done.
     ==> Promoting high-noise fitness functions even if it is less
           publication-efficient.
 - “Inheritance” of the belief state in partially observable games.
   Big progress to be done. Crucial for applications.
 - Sparse bandits / mixed stochastic/adversarial cases.


Thanks for your attention.
           Thanks to all collaborators for all I've learnt with them.
Appendix 1:
MCTS with hidden
   information
MCTS with hidden information:
incremental version
While (there is time for thinking)
{
    s=initial state
    os(1)=()     os(2)=()
    while (s not terminal)
    {
        p=player(s)
        b=Exp3Bandit(os(p))
        d=b.makeDecision
        (s,o)=transition(s,d)
        os(p)=os(p)+(o)    // append the new observation (left implicit in the slide)
    }
    send reward to all bandits in the simulation
}
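A runnable toy version of this loop, under simplifying assumptions: a one-move matching-pennies game (player 1 wins iff the choices match), so each player has a single observation sequence; the minimal Exp3Bandit class and the gamma value are illustrative, not the thesis' implementation:

```python
import math
import random
from collections import defaultdict

class Exp3Bandit:
    """Minimal EXP3 bandit: one instance per (player, observation sequence)."""
    def __init__(self, n, gamma=0.2):
        self.n, self.gamma = n, gamma
        self.weights = [1.0] * n

    def make_decision(self):
        total = sum(self.weights)
        self.probs = [(1 - self.gamma) * w / total + self.gamma / self.n
                      for w in self.weights]
        self.arm = random.choices(range(self.n), weights=self.probs)[0]
        return self.arm

    def give_reward(self, r):   # r in [0, 1], importance-weighted update
        self.weights[self.arm] *= math.exp(
            self.gamma * r / (self.probs[self.arm] * self.n))
        m = max(self.weights)
        self.weights = [w / m for w in self.weights]   # avoid overflow

# Toy hidden-information game: both players move once without seeing the
# other's move; player 1 is rewarded iff the two choices match.
random.seed(1)
bandits = defaultdict(lambda: Exp3Bandit(2))
for _ in range(3000):                  # "while (there is time for thinking)"
    used = {1: bandits[(1, ())], 2: bandits[(2, ())]}   # os(p) = () here
    d1 = used[1].make_decision()
    d2 = used[2].make_decision()
    r1 = 1.0 if d1 == d2 else 0.0
    used[1].give_reward(r1)            # send reward to all bandits
    used[2].give_reward(1.0 - r1)      # zero-sum: player 2 gets 1 - r1
```

In a real game the key for `bandits` grows as observations are appended, which is exactly what indexing by os(p) achieves.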
MCTS with hidden information:
incremental version
(Possibly refine the family of bandits.)
While (there is time for thinking)
{
    s=initial state
    os(1)=()     os(2)=()
    while (s not terminal)
    {
        p=player(s)
        b=Exp3Bandit(os(p))
        d=b.makeDecision
        (s,o)=transition(s,d)
        os(p)=os(p)+(o)    // append the new observation (left implicit in the slide)
    }
    send reward to all bandits in the simulation
}

Artificial Intelligence and Optimization with Parallelism

  • 1. HABILITATION Artificial intelligence with Parallelism Acknowledgments: All the TAO team. People in Liège, Taiwan, Lri,Artelys, Mash, Iomca, .., Thanks a lot to the committee. Thanks + good recovery to Jonathan Shapiro. Thanks to Grid5000. Olivier Teytaud olivier.teytaud@inria.fr
  • 2. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion
  • 3. AI = using computers where they are weak / weaker than humans. (thanks Michèle S.) Difficult optimization (complex structure, noisy objective functions) Games (difficult ones) Key difference with many operational research works: AI = choosing a model as close as possible to reality and (very) approximately solve it OR = choosing the best model that you can solve almost exactly
  • 7. Many works are about numbers. Providing standard deviations, rates, etc. Other goal (more ambitious ?): switching from something which does not work to something which works. E.g. vision; a computer can distinguish:
  • 8. But it can't distinguish so easily:
  • 11. And it's a disaster for categorizing children, women, panda, babies, children, men, bears, trucks, cars. 3 years old; she can do it.
  • 12. ==> AI= focus on things which do not work and (hopefully) make them work.
  • 13. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion
  • 14. Evolutionary optimization is a part of A.I. Often considered as bad, because many EO tools are not that hard, mathematically speaking. I've met people using - randomized mutations - cross-overs but who did not call this evolutionary or genetic, because it would be bad.
  • 15. Gives a lot of freedom: - choose your operators (depending on the problem) - choose your population size λ (depending on your computer/grid) - choose μ (carefully), e.g. μ = min(dimension, λ/4) ==> Can work on strange domains
  • 16. Voronoi representation of a shape: - a family of points (thanks Marc S.)
  • 18. Voronoi representation: - a family of points - their labels
  • 19. Voronoi representation: - a family of points - their labels ==> cross-over makes sense ==> you can optimize a shape
  • 21. Voronoi representation: - a family of points - their labels ==> cross-over makes sense ==> you can optimize a shape Great substitute for averaging. “on the benefit of sex”
  • 22. Cantilever optimization: Hamda et al, 2000
  • 23. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion
  • 25. Parallelism. Thank you G5K Multi-core machines Clusters Grids Sometimes parallelization completely changes the picture. Sometimes not. We want to know when.
  • 26. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization (robustness, slow rates) Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion
  • 27. Derivative-free optimization of f No gradient ! Only depends on the x's and f(x)'s
  • 32. Derivative-free optimization of f Why derivative free optimization ? Ok, it's slower But sometimes you have no derivative It's simpler (by far) It's more robust (to noise, to strange functions...)
  • 33. Derivative-free optimization of f. Among optimization algorithms: Newton; Quasi-Newton (BFGS); gradient descent; ...
  • 34. ... and derivative-free optimization (no gradient needed).
  • 35. In particular, comparison-based optimization (coming soon), just needing comparisons, including evolutionary algorithms.
  • 36. Comparison-based optimization: with yi = f(xi), an algorithm is comparison-based if its decisions depend on the yi only through their pairwise comparisons (the signs of the differences). parallel evolution 36
  • 37. Population-based comparison-based algorithms: X(1) = ( x(1,1), x(1,2), ..., x(1,λ) ) = Opt() X(2) = ( x(2,1), x(2,2), ..., x(2,λ) ) = Opt(X(1), signs of diff) ... X(n) = ( x(n,1), x(n,2), ..., x(n,λ) ) = Opt(X(n-1), signs of diff)
  • 38. Population-based comparison-based algorithms with internal state: ( X(1) = ( x(1,1), ..., x(1,λ) ), I(1) ) = Opt() ( X(2) = ( x(2,1), ..., x(2,λ) ), I(2) ) = Opt(X(1), I(1), signs of diff) ... ( X(n) = ( x(n,1), ..., x(n,λ) ), I(n) ) = Opt(X(n-1), I(n-1), signs of diff)
  • 39. Comparison-based algorithms are robust. Consider f: X --> R. We look for x* such that for all x, f(x*) ≤ f(x). ==> what if we see g o f (g increasing) ? ==> x* is the same, but the iterates xn might change.
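As a quick sanity check, here is a minimal Python sketch of the invariance claim (the quadratic f, the increasing g, and the random population are illustrative choices, not from the slides): a comparison-based selection step returns exactly the same selected individuals under f and under g o f.

```python
import math
import random

def select_best(points, fitness, mu):
    """One comparison-based step: indices of the mu best points."""
    return sorted(range(len(points)), key=lambda i: fitness(points[i]))[:mu]

f = lambda x: (x - 3.0) ** 2              # illustrative objective
g_of_f = lambda x: math.exp(f(x)) + 7.0   # g increasing => same comparisons

random.seed(0)
pop = [random.uniform(-10.0, 10.0) for _ in range(20)]
same = select_best(pop, f, 5) == select_best(pop, g_of_f, 5)
assert same  # selections are identical under any increasing g
```

Any algorithm built only from such selection steps therefore follows the same trajectory on f and on g o f.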
  • 40. Robustness of comparison-based algorithms: formal statement. The behaviour does not depend on g for a comparison-based algorithm; a comparison-based algorithm is optimal for this worst case over increasing transformations g.
  • 41. Complexity bounds (N = dimension): n = number of fitness evaluations for precision ε with probability at least ½ for all f. Exp( − convergence ratio ) = convergence rate. Convergence ratio ~ 1 / computational cost ==> more convenient than the convergence rate for speed-ups.
  • 42. Complexity bounds: basic technique. We want to know how many iterations we need for reaching precision ε in an evolutionary algorithm. Key observation: (most) evolutionary algorithms are comparison-based. Let's consider (for simplicity) a deterministic selection-based non-elitist algorithm. First idea: how many different branches do we have in a run ? We select μ points among λ; therefore, at most K = λ! / ( μ! (λ − μ)! ) different branches. Second idea: how many different answers should we be able to give ? Use packing numbers: at least N(ε) different possible answers.
  • 49. Complexity bounds: conclusion. With at most K = λ! / ( μ! (λ − μ)! ) branches per iteration and at least N(ε) different possible answers, the number n of iterations should verify K^n ≥ N(ε).
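In code, the bound reads n ≥ log N(ε) / log K. A small sketch (using N(ε) = (1/ε)^dim as an assumed, illustrative packing number for the unit cube; the function name is ours):

```python
import math

def min_iterations(lam, mu, dim, eps):
    """Lower bound on iterations n from K**n >= N(eps), with
    K = C(lam, mu) comparison outcomes per iteration and
    N(eps) ~ (1/eps)**dim as an (assumed) packing number."""
    K = math.comb(lam, mu)
    n_eps = (1.0 / eps) ** dim
    return math.ceil(math.log(n_eps) / math.log(K))

# The bound grows only linearly in dim and in log(1/eps):
n = min_iterations(lam=10, mu=5, dim=3, eps=1e-3)
```

With λ=10, μ=5, K=252, so reaching ε=1e-3 in dimension 3 needs at least 4 iterations under this bound.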
  • 50. Complexity bounds on the convergence ratio. FR: full ranking (selected points are ranked); SB: selection-based (selected points are not ranked).
  • 51. This is why I love cross-over.
  • 52. Fournier, T., 2009; using VC-dimension.
  • 53. Quadratic functions easier than sphere functions ? But not for translation-invariant quadratic functions...
  • 54. Covers existing results. Compliant with discrete domains.
  • 55. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion. This part: 1) Mathematical proof that all comparison-based algorithms can be parallelized (log speed-up); 2) Practical hint: simple tricks for some well-known algorithms.
  • 56. Speculative parallelization with branching factor 3. Consider the sequential algorithm (iteration 1).
  • 57. Consider the sequential algorithm (iteration 2).
  • 58. Consider the sequential algorithm (iteration 3).
  • 59. Parallel version for D=2. Population = union of all populations for 2 iterations.
  • 60. Automatic parallelization. Teytaud, T., PPSN 2010.
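The cost of the speculative scheme is easy to count: to run several sequential iterations in one parallel batch, we evaluate one λ-sized population per node of the comparison-outcome tree. A sketch (function name and arguments are illustrative):

```python
def speculative_batch(pop_size, branching, depth):
    """Fitness evaluations needed to speculate `depth` iterations at once:
    one population per node of the outcome tree, i.e.
    1 + D + ... + D**(depth - 1) populations for branching factor D."""
    nodes = sum(branching ** k for k in range(depth))
    return pop_size * nodes

# Branching factor 3 (as in the slides), 2 iterations at once, lambda = 10:
batch = speculative_batch(pop_size=10, branching=3, depth=2)  # 10 * (1 + 3)
```

So a depth-2 speculation with branching 3 costs 4 populations evaluated simultaneously, in exchange for a 2x iteration speed-up.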
  • 62. Define σ* = the step-size selected at one iteration. Necessary condition for a log(λ) speed-up: − E log(σ*) ~ log(λ). But for many algorithms, − E log(σ*) = O(1) ==> asymptotically constant speed-up.
  • 63. These algorithms do not reach the log(λ) speed-up: (1+1)-ES with 1/5th rule; standard CSA; standard EMNA; standard SA. Teytaud, T., PPSN 2010.
  • 64. Example 1: Estimation of Multivariate Normal Algorithm. While ( I have time ) { Generate λ points (x1,...,xλ) distributed as N(x,σ); Evaluate the fitness at x1,...,xλ; x = mean of the μ best points; σ = standard deviation of the μ best points; σ /= log(λ/7)^(1/d) }
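A minimal Python sketch of this loop, with the log(λ/7)^(1/d) step-size correction (the isotropic Gaussian, μ = λ//4 and the sphere test function are assumptions for illustration, not prescribed by the slide):

```python
import math
import random

def emna(f, x0, sigma, lam, iters, seed=0):
    """Sketch of EMNA with the log(lam/7)**(1/d) step-size correction;
    assumes lam large enough that log(lam/7) > 1."""
    rng = random.Random(seed)
    dim, x, mu = len(x0), list(x0), max(2, lam // 4)
    for _ in range(iters):
        pts = sorted(([xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
                      for _ in range(lam)), key=f)[:mu]
        x = [sum(p[i] for p in pts) / mu for i in range(dim)]        # mean
        var = sum((p[i] - x[i]) ** 2
                  for p in pts for i in range(dim)) / (mu * dim)
        sigma = math.sqrt(var) / math.log(lam / 7.0) ** (1.0 / dim)  # correction
    return x, sigma

sphere = lambda p: sum(v * v for v in p)
x, s = emna(sphere, [2.0, 2.0, 2.0], sigma=1.0, lam=40, iters=60)
```

After a few dozen iterations both the fitness and the step-size have shrunk substantially; the division by log(λ/7)^(1/d) is what restores the log(λ) speed-up for large populations.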
  • 65. Ex 2: log(λ) correction for mutative self-adaptation. μ = min(λ/4, d). While ( I have time ) { Generate step-sizes (σ1,...,σλ) as σ × exp(k·N(0,1)); Generate points (x1,...,xλ), xi distributed as N(x,σi); Select the μ best points; Update x (= mean), update σ (= logarithmic mean) }
  • 66. Log(λ) corrections (SA, dim 3). In the discrete case (XPs): automatic parallelization surprisingly efficient. Simple trick in the continuous case: − E log(σ*) should be linear in log(λ) (this provides corrections which work for SA and CSA).
  • 69. SUMMARY of the EA part up to now: - evolutionary algorithms are robust (with a precise statement of this robustness) - evolutionary algorithms are somehow slow (precisely quantified...) - evolutionary algorithms are parallel (at least “until” the dimension for the conv. rate) Now, noisy optimization
  • 70. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion
  • 71. Many works focus on fitness functions with “small” noise: f(x) = ||x||² × (1 + Gaussian). This is because the more realistic case f(x) = ||x||² + Gaussian (variance > 0 at the optimum) is too hard for publishing nice curves.
  • 72. ==> see however Arnold & Beyer 2006. ==> a tool: races (Heidrich-Meisner et al., ICML 2009): reevaluating until statistically significant differences... but we must (sometimes) limit the number of reevaluations.
  • 73. Another difficult case: Bernoulli functions. fitness(x) = B( f(x) ) f(0) not necessarily = 0.
  • 78. Another difficult case: Bernoulli functions, fitness(x) = B( f(x) ), f(0) not necessarily = 0. Two approaches: an EDA based on races, and MaxUncertainty (Coulom). We prove good results for both, with p=2 (a case I like).
  • 79. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion
  • 80. The game of Go is a part of AI. Computers are ridiculous in front of children. Easy situation. Termed “semeai”. Requires a little bit of abstraction.
  • 81. The game of Go is a part of AI. Computers are ridiculous in front of children. 800 cores, 4.7 GHz, top level program. Plays a stupid move.
  • 82. The game of Go is a part of AI. Computers are ridiculous in front of children. 8 years old; little training; finds the good move
  • 83. Introduction What is AI ? Why evolutionary optimization is a part of AI Why parallelism ? Evolutionary computation Comparison-based optimization Parallelization Noisy cases Sequential decision making Fundamental facts Monte-Carlo Tree Search Conclusion
  • 84. Monte-Carlo Tree Search 1. Games (a bit of formalism) 2. Decidability / complexity Games with simultaneous actions 84 Paris 1st of February
  • 85. A game is a directed graph.
  • 86. A game is a directed graph with actions.
  • 87. A game is a directed graph with actions and players.
  • 88. A game is a directed graph with actions and players and observations.
  • 89. A game is a directed graph with actions and players and observations and rewards. Rewards on leafs only!
  • 90. A game is a directed graph + actions + players + observations + rewards + loops.
  • 91. Monte-Carlo Tree Search 1. Games (a bit of formalism) 2. Decidability / complexity
  • 92. Complexity (2P, no random), by horizon:
                            Unbounded             Exponential    Polynomial
  Full observability:       EXP                   EXP            PSPACE
  No obs (X=100%):          EXPSPACE              NEXP (Hasslum et al, 2000)
  Partially obs. (X=100%):  2EXP (Rintanen, 97)   EXPSPACE
  Simult. actions:          ?                     EXPSPACE ?     <<<= EXP, <<<= EXP
  No obs / PO: undecidable
  • 93. Complexity question ? (UD) Instance = position. Question = Is there a strategy which wins whatever the opponent's decisions ? = natural question if full observability. Answering this question then allows perfect play.
  • 94. Hummm ? Do you know a PO game in which you can ensure a win with probability 1 ?
  • 95. Complexity question for matrix game ? 100000 010000 Good for column-player ! 001000 ==> but no sure win. 000100 ==> the “UD” question is not 000010 relevant here! 000001
  • 96. Complexity question for phantom-games ? Joint work with F. Teytaud. This is phantom-go. Good for black: wins with proba 1 − 1/(8!). Here, there's no move which ensures a win. But some moves are much better than others!
  • 97. Another formalization: is there a strategy winning with probability > c ? ==> much more satisfactory
  • 98. Madani et al.: 1 player + random = undecidable.
  • 99. Madani et al. 1 player + random = undecidable. We extend to two players with no random. Problem: rewrite random nodes, thanks to additional player.
  • 100. A random node to be rewritten
  • 102. A random node to be rewritten Rewritten as follows: Player 1 chooses a in [[0,N-1]] Player 2 chooses b in [[0,N-1]] c=(a+b) modulo N Go to tc Each player can force the game to be equivalent to the initial one (by playing uniformly) ==> the proba of winning for player 1 (in case of perfect play) is the same as for the initial game ==> undecidability!
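A quick exhaustive check of this construction (N = 6 as an example): if player 1 draws a uniformly, then for every fixed choice b of player 2, c = (a + b) mod N is uniform over [[0, N-1]], so either player alone can force the rewritten node to behave like the original random node.

```python
from collections import Counter

N = 6  # illustrative branching of the random node
# For each fixed opponent choice b, uniform a makes c = (a+b) mod N uniform:
all_uniform = all(
    Counter((a + b) % N for a in range(N)) == Counter(range(N))
    for b in range(N)
)
assert all_uniform
```

By symmetry the same holds with the players' roles swapped, which is exactly why the value of the rewritten game equals the value of the original one.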
  • 103. Important remark Existence of a strategy for winning with proba > 0.5 ==> also undecidable for the restriction to games in which the proba is >0.6 or <0.4 ==> not just a subtle precision trouble.
  • 104. Monte-Carlo Tree Search MCTS principle But with EXP3 in nodes for hidden information.
  • 105. UCT (Upper Confidence Trees) Coulom (06) Chaslot, Saito & Bouzy (06) Kocsis Szepesvari (06)
  • 106. UCT
  • 110. UCT Kocsis & Szepesvari (06)
  • 112. Exploitation ... SCORE = 5/7 + k.sqrt( log(10)/7 )
  • 116. ... or exploration ? SCORE = 0/2 + k.sqrt( log(10)/2 ) Binary win/loss games: no explo! (Berthier, D., T., 2010)
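The score above is the standard UCB rule used in UCT: empirical mean plus k·sqrt(log(parent visits)/visits). A small sketch with the slides' numbers (5/7 for the exploited move, 0/2 for the rarely tried one, 10 simulations at the parent):

```python
import math

def ucb(wins, visits, parent_visits, k):
    """UCB score: empirical mean + k * sqrt(log(parent_visits) / visits)."""
    return wins / visits + k * math.sqrt(math.log(parent_visits) / visits)

# k = 0: pure exploitation, consistent with "binary win/loss games: no explo!"
greedy = ucb(5, 7, 10, k=0.0) > ucb(0, 2, 10, k=0.0)
# Large k: the rarely tried move gets the higher score (exploration wins).
curious = ucb(0, 2, 10, k=2.0) > ucb(5, 7, 10, k=2.0)
assert greedy and curious
```

The constant k thus tunes the exploitation/exploration trade-off at each node.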
  • 117. Games vs pros in the game of Go: first win in 9x9; first win over 5 games in 9x9 blind Go; first win with handicap 2.5 in 13x13 Go; first win with handicap 6 in 19x19 Go; first win with handicap 7 in 19x19 Go vs top pro.
  • 118. ... or exploration ? SCORE = 0/2 + k.sqrt( log(10)/2 ) Simultaneous actions: replace it with EXP3 / INF
  • 120. MCTS for simultaneous actions: Player 1 plays = maxUCB node; Player 2 plays = minUCB node; both players play = EXP3 node; Player 1 plays = maxUCB node; Player 2 plays = minUCB node; ...
  • 122. MCTS for hidden information. Player 1: one EXP3 node per observation set (observation sets 1, 2, 3). Player 2: likewise, one EXP3 node per observation set. Thanks Martin (incrementally + application to phantom-tic-tac-toe: see D. Auger 2010).
  • 123. EXP3 in one slide. Grigoriadis et al.; Auer et al.; Audibert & Bubeck, COLT 2009.
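The slide's formulas were an image; as a placeholder, here is a standard EXP3 sketch (exponential weights on importance-weighted reward estimates, mixed with γ-uniform exploration, rewards assumed in [0,1]). The γ value and the toy bandit below are illustrative, not from the talk, and this is not necessarily the exact variant used there.

```python
import math
import random

def exp3(reward, n_arms, horizon, gamma, seed=0):
    """Standard EXP3 sketch: each round, sample an arm from the mixed
    distribution, then boost its weight by exp(gamma * (r / p) / n_arms),
    where r / p is the importance-weighted reward estimate."""
    rng = random.Random(seed)
    w = [1.0] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        total = sum(w)
        p = [(1 - gamma) * wi / total + gamma / n_arms for wi in w]
        arm = rng.choices(range(n_arms), weights=p)[0]
        pulls[arm] += 1
        w[arm] *= math.exp(gamma * (reward(arm) / p[arm]) / n_arms)
    return pulls

# A bandit where arm 2 pays 0.9 and the others 0.1: EXP3 concentrates on it.
pulls = exp3(lambda a: 0.9 if a == 2 else 0.1,
             n_arms=4, horizon=2000, gamma=0.1)
```

Unlike UCB, EXP3's play converges to a mixed (Nash-like) strategy, which is why it is the right bandit for simultaneous-action and hidden-information nodes.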
  • 124. Monte-Carlo Tree Search: application to Urban Rivals (simultaneous actions).
  • 125. Let's have fun with Urban Rivals (4 cards). Each player has - four cards (each one can be used once) - 12 pilz (each one can be used once) - 12 life points. Each card has: - one attack level - one damage - special effects (forget that...) Four turns: P1 attacks P2, P2 attacks P1, P1 attacks P2, P2 attacks P1.
  • 126. Let's have fun with Urban Rivals. First, attacker plays: - chooses a card - chooses ( PRIVATELY ) a number of pilz. Attack level = attack(card) x (1 + nb of pilz). Then, defender plays: - chooses a card - chooses a number of pilz. Defense level = attack(card) x (1 + nb of pilz). Result: if attack > defense, defender loses Power(attacker's card); else attacker loses Power(defender's card).
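A one-exchange sketch of these rules (a card is modelled as a pair (attack, power); the slide does not say who wins ties, so ties going to the defender is an assumption made explicit here):

```python
def resolve(att_card, att_pilz, def_card, def_pilz):
    """One Urban Rivals exchange as on the slide: pilz multiply the card's
    attack level; the losing side takes the other card's power as damage.
    Cards are (attack, power); ties assumed to favour the defender."""
    attack = att_card[0] * (1 + att_pilz)
    defense = def_card[0] * (1 + def_pilz)
    if attack > defense:
        return ("defender", att_card[1])  # defender loses attacker's power
    return ("attacker", def_card[1])      # attacker loses defender's power

# Attack 6 * (1+2) = 18 beats defense 8 * (1+1) = 16: defender takes 4 damage.
outcome = resolve((6, 4), 2, (8, 3), 1)
```

Because the attacker's pilz count is private, each turn is effectively a matrix game, which is what makes EXP3 nodes appropriate here.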
  • 127. Let's have fun with Urban Rivals ==> The MCTS-based AI is now at the best human level. Experimental (only) remarks on EXP3: - discard strategies with small number of sims = better approx of the Nash - also an improvement by taking into account the other bandit - virtual simulations (inspired by Kummer)
  • 128. When is MCTS relevant ? Robust in front of: high dimension; non-convexity of Bellman values; complex models; delayed reward; simultaneous actions, partial information. More difficult for: high values of H; model-free settings; highly unobservable cases (Monte-Carlo, but not Monte-Carlo Tree Search, see Cazenave et al.); lack of reasonable baseline for the MC.
  • 129. When is MCTS relevant ? References for the above: T., Dagstuhl 2010; D. Auger, EvoStar 2011; unpublished results on some endgames; undecidability results.
  • 130. Conclusion. Evo. opt.: robustness, tight bounds, simple algorithmic modifications for better speed-up (SA, 1/5th, (CSA)). MCTS just great (but requires a model); UCB not necessary; extension to hidden info (rmk: undecidability); PO endgames; but no abstraction power. Noisy optimization: consider high noise; use QR (quasi-random) and learning (in all EAs in fact). Not mentioned here: multimodal, multiobjective, GP, bandits.
  • 131. Future ? - Solving semeais ? Would involve great AI progress I think... - Noisy optimization; there are still things to be done. ==> Promoting high noise fitness functions even if it is less publication-efficient. - ``Inheritance'' of belief state in partially observable games. Big progress to be done. Crucial for applications. - Sparse bandits / mixed stochastic/adversarial cases. Thanks for your attention. Thanks to all collaborators for all I've learnt with them.
  • 132. Appendix 1: MCTS with hidden information
  • 133. MCTS with hidden information: incremental version.
  While (there is time for thinking) {
    s = initial state
    os(1) = ()    os(2) = ()
    while (s not terminal) {
      p = player(s)
      b = Exp3Bandit(os(p))
      d = b.makeDecision
      (s, o) = transition(s, d)
    }
    send reward to all bandits in the simulation
  }
  • 140. MCTS with hidden information: incremental version (same loop as above). Possibly refine the family of bandits.
