1. Stochastic Local Search in Continuous Domain
Petr Pošík
posik@labe.felk.cvut.cz
Czech Technical University in Prague
Faculty of Electrical Engineering
Department of Cybernetics
Intelligent Data Analysis Group
P. Pošík, GECCO 2010, OBUPM Workshop, Portland, July 7-11, 2010
2. Motivation
3. Why local search?
There's something about population:
- it is a data set forming a basis for offspring creation,
- it allows for searching the space in several places at once (this role can be replaced by restarted local search with an adaptive neighborhood).

Hypothesis:
The data set (population) is very useful when creating a (sometimes implicit) global model of the fitness landscape or a local model of the neighborhood.
It is often better to have a superb adaptive local search procedure and restart it than to deal with a complex global search algorithm (a minimal restart wrapper is sketched below).
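To make the restart idea concrete, here is a minimal Python sketch of a restart wrapper; the local_search argument, the uniform re-initialization, and the budget accounting are illustrative assumptions, not parts of the talk:

import numpy as np

def restarted_local_search(local_search, f, lower, upper, budget, rng=None):
    """Run an adaptive local search repeatedly from random starting points
    and keep the best solution found. 'local_search' is assumed to return
    a tuple (x_best, f_best, evaluations_used)."""
    rng = np.random.default_rng() if rng is None else rng
    best_x, best_f, used = None, float("inf"), 0
    while used < budget:
        x0 = rng.uniform(lower, upper)            # uniform restart point (assumption)
        x, fx, evals = local_search(f, x0, budget - used)
        used += evals
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f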
8. Agenda
1. Adaptation in stochastic local search:
   - roles of population and model,
   - notable examples of local search based on EA ideas,
   - personal history in the field of real-valued EDAs.
2. Features of stochastic local search in continuous domain:
   - survey of relevant works in the article in the proceedings.
9. Introduction
10. Relation of local search and EAs (EDAs)
Classification of optimization techniques [Neu04]:
- incomplete: no safeguards against getting stuck in a local optimum.
- asymptotically complete: reaches the global optimum with certainty (or with probability one) if allowed to run indefinitely long, but has no means to know when a global optimum has been found.
- complete: reaches the global optimum with certainty if allowed to run indefinitely long, and knows after finite time if an approximate optimum has been found (within specified tolerances).

Practical point of view: judge an algorithm based on its behaviour, not on its functional parts.
- EAs: population as the data source for offspring creation; selection; crossover and mutation (with an implicit model).
- EDAs: population as the data source for offspring creation; selection; model building and model sampling (with an explicit model).

When can an EA with one of these procedures be described as local search?
When the distribution of offspring produced by the respective data source is single-peaked (unimodal).

[Neu04] Arnold Neumaier. Complete search in continuous global optimization and constraint satisfaction. Acta Numerica, 13:271–369, May 2004.
12. Stochastic Local Search
Term coined by Holger Hoos and Thomas Stützle [HS04]:
- originally used in combinatorial optimization settings,
- the term nicely describes EDAs with single-peak probability distributions.

[HS04] Holger H. Hoos and Thomas Stützle. Stochastic Local Search: Foundations & Applications. The Morgan Kaufmann Series in Artificial Intelligence. Morgan Kaufmann, 2004.
13. Roles of population and model
Observation:

Algorithm 1: Evolutionary scheme in discrete domains
begin
  X(0) ← InitializePopulation()
  f(0) ← Evaluate(X(0))
  g ← 1
  while not TerminationCondition() do
    S ← Select(X(g−1), f(g−1))
    M ← Build(S)
    X_Offs ← Sample(M)
    f_Offs ← Evaluate(X_Offs)
    {X(g), f(g)} ← Replace(X(g−1), X_Offs, f(g−1), f_Offs)
    g ← g + 1

The population is evolved (adapted); the model is used as a single-use processing unit.

What happens if we use generational replacement and update the model instead of building it from scratch?

Algorithm 2: Evolutionary scheme in continuous domains
begin
  M(1) ← InitializeModel()
  g ← 1
  while not TerminationCondition() do
    X ← Sample(M(g))
    f ← Evaluate(X)
    S ← Select(X, f)
    M(g+1) ← Update(g, M(g), X, f, S)
    g ← g + 1

The model is evolved (adapted); the population is used only as a data set allowing us to gather some information about the fitness landscape. (A concrete instance of Algorithm 2 is sketched below.)
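As a concrete instance of Algorithm 2, a minimal Gaussian EDA sketch in Python: the model M = (µ, σ) is adapted from generation to generation, while each population is sampled, used once for selection and model update, and discarded. The Gaussian model, truncation selection, and all parameter values are illustrative assumptions:

import numpy as np

def gaussian_eda(f, dim, pop=50, tau=0.3, gens=100, rng=None):
    """Minimal continuous EDA following Algorithm 2: the model (mu, sigma)
    is evolved; each population is a single-use data set."""
    rng = np.random.default_rng() if rng is None else rng
    mu, sigma = np.zeros(dim), np.ones(dim)            # M(1) <- InitializeModel()
    n_sel = max(2, int(tau * pop))
    for g in range(gens):                              # while not TerminationCondition()
        X = mu + sigma * rng.standard_normal((pop, dim))   # X <- Sample(M(g))
        fX = np.array([f(x) for x in X])               # f <- Evaluate(X)
        S = X[np.argsort(fX)[:n_sel]]                  # S <- Select(X, f)
        mu, sigma = S.mean(axis=0), S.std(axis=0)      # M(g+1) <- Update(...)
    return mu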
18. Unifying view
Algorithm 3: General evolutionary scheme
begin
  M(0) ← InitializeModel()
  X(0) ← Sample(M(0))
  f(0) ← Evaluate(X(0))
  g ← 1
  while not TerminationCondition() do
    {S, D} ← Select(X(g−1), f(g−1))
    M(g) ← Update(g, M(g−1), X(g−1), f(g−1), S, D)
    X_Offs ← Sample(M(g))
    f_Offs ← Evaluate(X_Offs)
    {X(g), f(g)} ← Replace(X(g−1), X_Offs, f(g−1), f_Offs)
    g ← g + 1

Both the population and the model are evolved (adapted).
DANGER: using "the same information" over and over to adapt the model (part of the population may stay the same over several generations).
20. Notable examples of local search based on EA ideas
21. Building-block-wise mutation algorithm
Sastry and Goldberg [SG07]:
- compared BBMA with a selecto-recombinative GA on a class of nonuniformly scaled ADFs,
- assumed that the BB information is known,
- showed that
  - in noiseless conditions BBMA is faster, while
  - in noisy conditions the selecto-recombinative GA is faster.

[SG07] Kumara Sastry and David E. Goldberg. Let's get ready to rumble redux: crossover versus mutation head to head on exponentially scaled problems. In GECCO '07: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, pages 1380–1387, New York, NY, USA, 2007. ACM.
22. Binary local search with linkage identification
Vaníček [Van10]:
- binary local search (actually BBMA) completed with LIMD,
- linkage identification by non-monotonicity check [MG99] (a simplified sketch of the idea follows below),
- works well on ADFs, fails on hierarchical functions.

[Figure: reliability graphs (number of evaluations vs. dimension) on the k×5-bit trap and k×8-bit trap functions, comparing the LIMD-based local search with random linkage, BOA, and ECGA.]

[MG99] Masaharu Munetomo and David E. Goldberg. Linkage identification by non-monotonicity detection for overlapping functions. Evolutionary Computation, 7(4):377–398, 1999.
[Van10] Stanislav Vaníček. Binary local optimizer with linkage learning. Technical report, Czech Technical University in Prague, Prague, Czech Republic, 2010.
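A rough Python sketch of perturbation-based pairwise linkage detection in the spirit of [MG99]. For simplicity it tests additivity of fitness changes (a non-linearity check); the actual LIMD conditions test monotonicity and differ in detail, and the tolerance and the 0/1 integer encoding of x are assumptions:

import numpy as np

def detect_linkage(f, x, tol=1e-12):
    """Mark bit pairs (i, j) whose joint effect on the fitness is
    inconsistent with the sum of their individual effects around x
    (x: NumPy array of 0/1 integers)."""
    n, fx = len(x), f(x)
    df = np.empty(n)
    for i in range(n):                      # effect of flipping each single bit
        y = x.copy(); y[i] ^= 1
        df[i] = f(y) - fx
    linked = set()
    for i in range(n):
        for j in range(i + 1, n):
            y = x.copy(); y[i] ^= 1; y[j] ^= 1
            dfij = f(y) - fx                # effect of flipping both bits
            if abs(dfij - (df[i] + df[j])) > tol:
                linked.add((i, j))          # interaction detected
    return linked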
23. Building-block hill-climber
Iclanzan and Dumitrescu [ID07]:
- similar to BBMA,
- uses compact genetic codes,
- beats hBOA on hierarchical functions (hIFF, hXOR, hTrap).

[ID07] David Iclanzan and Dan Dumitrescu. Overcoming hierarchical difficulty by hill-climbing the building block structure. In GECCO '07: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, pages 1256–1263, New York, NY, USA, 2007. ACM.
24. CMA-ES
Hansen and Ostermeier [HO01]; based on evolution strategies:
- (1+1)-ES (mutative, parent-centric): searches the neighborhood of 1 point,
- (1 +, λ)-ES (mutative, parent-centric): searches the neighborhood of 1 point,
- (µ +, λ)-ES (mutative, parent-centric): searches the neighborhood of several points,
- (µ/ρ +, λ)-ES (recombinative, between parent-centric and mean-centric): searches the neighborhood of several points,
- CMA-ES is actually a (µ/µ, λ)-ES (recombinative, mean-centric): it searches the neighborhood of 1 point.

[HO01] Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195, 2001.
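A minimal mean-centric (µ/µ, λ)-ES sketch in Python, illustrating why CMA-ES-style algorithms search the neighborhood of a single point (the mean). It keeps only an isotropic Gaussian with a crude success-based step-size rule; the real CMA-ES [HO01] adapts a full covariance matrix and step size in a far more principled way:

import numpy as np

def mean_centric_es(f, x0, sigma0=1.0, lam=20, mu=5, gens=200, rng=None):
    """Mean-centric (mu/mu, lambda)-ES: sample around one mean, recombine
    the mu best offspring into the next mean."""
    rng = np.random.default_rng() if rng is None else rng
    m, sigma = np.asarray(x0, dtype=float), sigma0
    f_m = f(m)
    for _ in range(gens):
        X = m + sigma * rng.standard_normal((lam, m.size))  # sample around the single mean
        fX = np.array([f(x) for x in X])
        S = X[np.argsort(fX)[:mu]]              # the mu best offspring
        m_new = S.mean(axis=0)                  # mean-centric recombination
        f_new = f(m_new)
        sigma *= 1.2 if f_new < f_m else 0.8    # crude step-size rule (assumption)
        m, f_m = m_new, f_new                   # comma strategy: always move the mean
    return m, f_m

For example, mean_centric_es(lambda x: float((x**2).sum()), np.ones(10)) minimizes the 10-D sphere function.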
25. G3PCX
Generalized generation gap (G3) by Deb [Deb05], using parent-centric crossover (PCX) [DAJ02].

Algorithm 4: Generalized Generation Gap
Input: number of parents µ, number of offspring λ, number of replacement candidates r
begin
  B ← initialize population of size N
  while not TerminationCondition() do
    P ← select µ parents from B: the best population member and µ − 1 other members chosen uniformly
    C ← generate λ offspring from the selected parents P using any chosen recombination scheme
    R ← choose r members of B uniformly as candidates for replacement
    B ← replace R in B by the best r members of R ∪ C

[Figure: offspring distribution of PCX with µ = 3 parents and large λ.]

Claimed to be more efficient than CMA-ES on three 20-D functions.

Local-search-intensive variant used here:
- the best population member is always selected as a parent, and
- the best population member is always used as the distribution center.
(A sketch of the G3 loop follows below.)

[DAJ02] Kalyanmoy Deb, Ashish Anand, and Dhiraj Joshi. A computationally efficient evolutionary algorithm for real-parameter optimization. Technical report, Indian Institute of Technology, April 2002.
[Deb05] K. Deb. A population-based algorithm-generator for real-parameter optimization. Soft Computing, 9(4):236–253, April 2005.
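A Python sketch of the G3 loop of Algorithm 4. The recombination step below is a simplified parent-centric Gaussian perturbation around the best parent (matching the local-search-intensive variant); it is not the full PCX operator of [DAJ02], and all parameter values are illustrative:

import numpy as np

def g3_sketch(f, dim=20, n=50, mu=3, lam=2, r=2, max_evals=50000, rng=None):
    """Generalized generation gap loop with a simplified parent-centric
    recombination (NOT the full PCX of [DAJ02])."""
    rng = np.random.default_rng() if rng is None else rng
    B = rng.uniform(-5.0, 5.0, (n, dim))            # population B (bounds assumed)
    fB = np.array([f(x) for x in B]); evals = n
    while evals + lam <= max_evals:
        best = int(np.argmin(fB))                   # best member is always a parent
        others = rng.choice([i for i in range(n) if i != best], mu - 1, replace=False)
        P = B[np.r_[best, others]]
        spread = P.std(axis=0) + 1e-12
        C = P[0] + spread * rng.standard_normal((lam, dim))  # centered on the best parent
        fC = np.array([f(x) for x in C]); evals += lam
        R = rng.choice(n, r, replace=False)         # replacement candidates
        pool, fpool = np.vstack([B[R], C]), np.r_[fB[R], fC]
        keep = np.argsort(fpool)[:r]                # best r of R ∪ C survive
        B[R], fB[R] = pool[keep], fpool[keep]
    return B[int(np.argmin(fB))], float(fB.min())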
28. Summary
"By borrowing ideas from EAs and building local search techniques based on them, we can arrive at pretty efficient algorithms, which usually have fewer parameters to tune."
29. Personal history in the field of real-valued EDAs
30. Distribution Tree
Distribution tree-building real-valued EA [Poš04]:

[Figure: distribution-tree partitions learned on the Griewangk and Rosenbrock functions.]

- Identifies hyper-rectangular areas of the search space with significantly different densities.
- Does not work well if the promising areas are not aligned with the coordinate axes.
- Need some coordinate transformations?

[Poš04] Petr Pošík. Distribution tree-building real-valued evolutionary algorithm. In Parallel Problem Solving From Nature – PPSN VIII, pages 372–381, Berlin, 2004. Springer. ISBN 3-540-23092-0.
31. Linear coordinate transformations
No transformation vs. PCA vs. ICA [Poš05]:

[Figure: two example populations with their first two principal components (PC 1, PC 2) and independent components (IC 1, IC 2). In the first case the results differ, but the difference does not matter; in the second case the results differ and the difference matters.]

- The global information extracted by linear transformation procedures often was not useful.
- Need for non-linear transformations or local transformations?
(A sketch of PCA-based sampling follows below.)

[Poš05] Petr Pošík. On the utility of linear transformations for population-based optimization algorithms. In Preprints of the 16th World Congress of the International Federation of Automatic Control, Prague, 2005. IFAC. CD-ROM.
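A small Python sketch of the PCA idea in this context: estimate the principal axes from the selected individuals and sample offspring in the rotated space, which is equivalent to sampling from a full-covariance Gaussian fitted to the data. This is an illustrative reconstruction, not the exact algorithm of [Poš05]:

import numpy as np

def sample_pca_gaussian(selected, n_offspring, rng=None):
    """Estimate principal axes from the selected individuals and sample
    offspring in the rotated (PC) space."""
    rng = np.random.default_rng() if rng is None else rng
    mu = selected.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(selected, rowvar=False))  # principal axes
    Z = rng.standard_normal((n_offspring, mu.size))                # sample in PC space
    return mu + (Z * np.sqrt(np.maximum(evals, 0.0))) @ evecs.T    # rotate back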
32. Non-linear global transformation
Kernel PCA as a transformation technique in EDA [Poš04]:

[Figure: training data points and data points sampled from the KPCA model.]

Works too well:
- it reproduces the pattern with high fidelity,
- if the population is not centered around the optimum, the EA will miss it.
Need for an efficient population shift?
Is the MLE principle suitable for model building in EAs?

[Poš04] Petr Pošík. Using kernel principal components analysis in evolutionary algorithms as an efficient multi-parent crossover operator. In IEEE 4th International Conference on Intelligent Systems Design and Applications, pages 25–30, Piscataway, 2004. IEEE. ISBN 963-7154-29-9.
33. Estimation of contour lines of the fitness function
Build a quadratic classifier separating the selected and the discarded individuals [PF07]:

[Figure: evolution of the estimated elliptic decision boundaries over several generations; average best-so-far fitness vs. number of evaluations on the ellipsoid function for CMA-ES and the perceptron and SDP variants.]

- The classifier is built by a modified perceptron algorithm or by semidefinite programming (a perceptron-style sketch follows below).
- Works well for pure quadratic functions.
- If the selected and discarded individuals are not separable by an ellipsoid, the training procedure fails to create a good model.
- Not solved yet.

[PF07] Petr Pošík and Vojtěch Franc. Estimation of fitness landscape contours in EAs. In GECCO '07: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, pages 562–569, New York, NY, USA, 2007. ACM Press.
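A perceptron-style Python sketch of the idea: lift points to quadratic features so that a linear separator in feature space is a quadratic surface (e.g. an ellipsoid) in the input space, then run a plain perceptron on the selected (+1) vs. discarded (−1) individuals. The modified perceptron and SDP formulations of [PF07] differ in detail:

import numpy as np

def quad_features(X):
    """Map points to quadratic monomials [x, x_i*x_j, 1]; a linear separator
    here is a quadratic surface in the input space."""
    n, d = X.shape
    cross = np.stack([X[:, i] * X[:, j] for i in range(d) for j in range(i, d)], axis=1)
    return np.hstack([X, cross, np.ones((n, 1))])

def fit_quadratic_separator(selected, discarded, epochs=100):
    """Plain perceptron on quadratic features separating selected (+1)
    from discarded (-1) individuals."""
    Phi = np.vstack([quad_features(selected), quad_features(discarded)])
    y = np.r_[np.ones(len(selected)), -np.ones(len(discarded))]
    w = np.zeros(Phi.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for phi, t in zip(Phi, y):
            if t * (w @ phi) <= 0:       # misclassified -> perceptron update
                w += t * phi
                mistakes += 1
        if mistakes == 0:                # data separated by a quadratic surface
            break
    return w                             # contour estimate: {x : w . quad_features(x) = 0}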
34. Variance enlargement in simple EDA
Variance adaptation is often used. Is a constant variance multiplier a viable alternative? [Poš08]

Minimal requirements for a successful real-valued EDA:
- the model must converge if centered around the optimum,
- the model must not converge if set on the slope.

Is there a single value k of the multiplier for the MLE variance estimate that would ensure the reasonable behaviour just mentioned? Does it depend on the single-peak distribution being used? (A sketch of such an EDA follows below.)

[Figure: admissible bounds k_min and k_max as functions of the dimension, for selection proportions τ = 0.1 to 0.9, for the Gaussian, "isotropic Gaussian", and isotropic Cauchy distributions.]

- For the Gaussian and the "isotropic Gaussian", an allowable k is hard or impossible to find.
- For the isotropic Cauchy, an allowable k seems to always exist.

[Poš08] Petr Pošík. Preventing premature convergence in a simple EDA via global step size setting. In Günther Rudolph, editor, Parallel Problem Solving from Nature – PPSN X, volume 5199 of Lecture Notes in Computer Science, pages 549–558. Springer, 2008.
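A Python sketch of the simple EDA under study: truncation selection, maximum-likelihood estimates of mean and standard deviation, and a constant multiplier k applied to the standard deviation in each generation. The Gaussian model and the value of k are illustrative; the point of [Poš08] is precisely whether an allowable k exists:

import numpy as np

def simple_eda_const_k(f, dim, k=1.5, pop=100, tau=0.3, gens=200, rng=None):
    """Simple Gaussian EDA with truncation selection, ML estimates of the
    mean and std, and a constant variance multiplier k."""
    rng = np.random.default_rng() if rng is None else rng
    mu, sigma = np.zeros(dim), np.ones(dim)
    n_sel = max(2, int(tau * pop))
    for _ in range(gens):
        X = mu + sigma * rng.standard_normal((pop, dim))  # sample the model
        fX = np.array([f(x) for x in X])
        S = X[np.argsort(fX)[:n_sel]]          # truncation selection (proportion tau)
        mu = S.mean(axis=0)                    # MLE of the mean
        sigma = k * S.std(axis=0)              # MLE of the std, enlarged by constant k
    return mu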
35. Features of simple EDAs
Consider a simple EDA using the following sampling mechanism (a sketch follows below):

z_i ∼ P,
x_i = µ + R × diag(σ) × (c · z_i).

1. What kind of base distribution P is used for sampling?
2. Is the type of distribution fixed during the whole evolution?
3. Is the model re-estimated from scratch each generation, or is it updated incrementally?
4. Does the model-building phase use the selected and/or the discarded individuals?
5. Where do you place the sampling distribution in the next generation?
6. When and how much (if at all) should the distribution be enlarged?
7. What should the reference point be? What should the orientation of the distribution be?

See the survey of SLS algorithms and their features in the article in the proceedings:
http://portal.acm.org/citation.cfm?id=1830761.1830830
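A direct Python transcription of the sampling mechanism above; the base sampler is supplied as a function, and the standard-normal example is just one possible choice of P:

import numpy as np

def sample_offspring(n, mu, R, sigma, c, base_sampler, rng=None):
    """Sample n offspring via x_i = mu + R diag(sigma) (c * z_i), z_i ~ P."""
    rng = np.random.default_rng() if rng is None else rng
    Z = base_sampler(n, mu.size, rng)               # rows of Z are the z_i ~ P
    return mu + ((c * Z) * sigma) @ R.T             # scale by c, stretch, rotate, shift

# example base distribution P: standard normal
def standard_normal_P(n, d, rng):
    return rng.standard_normal((n, d))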
36. Final summary
It seems that by borrowing ideas from the EC community and incorporating them back into local search methods, we can get very efficient algorithms. This seems to be the case especially for continuous domains.
At the same time, it is important to study where the limits of such methods are.

Comparison with state-of-the-art techniques: the Black-Box Optimization Benchmarking (BBOB) workshop
http://coco.gforge.inria.fr/doku.php?id=bbob-2010
- a set of benchmark functions (noiseless and noisy, unimodal and multimodal, well-conditioned and ill-conditioned, structured and unstructured),
- the expected running time of the algorithm is used as the main measure of performance,
- a set of postprocessing scripts which produce many nice and information-dense figures and tables,
- a set of LaTeX article templates,
- many algorithms to compare with have already been benchmarked, and their data are freely available!
37. Thanks for your attention
Any questions?