The document proposes a new method to efficiently identify improving moves within a ball of radius r for k-bounded pseudo-Boolean optimization problems. The key ideas are to (1) decompose the scores of potential moves into scores of individual subfunctions, and (2) update only a constant number of subfunction scores, each in constant time, as the solution moves within the ball, rather than recomputing all scores from scratch. This avoids the typical computational cost of O(n^r) and allows identifying improving moves in constant time O(1), independent of the problem size n.
Efficient Identification of Improving Moves in a Ball for Pseudo-Boolean Problems
Slide 1 / 28 — GECCO 2014, Vancouver, Canada, July 14
Introduction | Background | Contribution | Experiments | Conclusions & Future Work
Francisco Chicano, Darrell Whitley, Andrew M. Sutton
Ball of radius r | Improving moves | Previous work
Solutions in a ball of radius r
• Considering binary strings of length n and Hamming distance…
• The number of solutions at Hamming distance at most r from x (excluding x itself) is Σ_{i=1}^{r} C(n, i):
  r = 1: n solutions; r = 2: about n²; r = 3: about n³
• If r << n: Θ(n^r)
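The ball-size count above can be checked directly. A minimal Python sketch (the function name `ball_size` is ours, not from the paper):

```python
from math import comb

def ball_size(n: int, r: int) -> int:
    """Number of solutions at Hamming distance 1..r from a binary string of length n."""
    return sum(comb(n, i) for i in range(1, r + 1))

print(ball_size(1000, 1))  # 1000
print(ball_size(1000, 2))  # 500500, i.e. Theta(n^2) already for r = 2
```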
Improving moves in a ball of radius r
• We want to find improving moves in a ball of radius r around a solution x
• What is the computational cost of this exploration?
• By complete enumeration: O(n^r), if each fitness evaluation takes O(1)
• Our contribution in this work: we propose a way to find improving moves in a ball of radius r in O(1) time (constant, independent of n)
Previous work
• Whitley and Chen proposed an O(1) approximate steepest descent for MAX-kSAT and NK-landscapes based on the Walsh decomposition
• For k-bounded pseudo-Boolean functions its complexity is O(k² 2^k)
• Chen, Whitley, Hains and Howe reduced the time required to identify improving moves to O(k³) using partial derivatives
• Szeider proved that the exploration of a ball of radius r in MAX-kSAT and kSAT can be done in O(n) time if each variable appears in a bounded number of clauses
• Our result could be obtained by Walsh analysis or by partial derivatives, but neither is used here

D. Whitley and W. Chen. Constant time steepest descent local search with lookahead for NK-landscapes and MAX-kSAT. GECCO 2012: 1357–1364.
W. Chen, D. Whitley, D. Hains, and A. Howe. Second order partial derivatives for NK-landscapes. GECCO 2013: 503–510.
S. Szeider. The parameterized complexity of k-flip local search for SAT and MAX SAT. Discrete Optimization, 8(1):139–145, 2011.
k-bounded pseudo-Boolean functions
Pseudo-Boolean functions | Scores
• Definition: f(x) = Σ_{i=1}^{m} f^(i)(x), where each f^(i) depends on at most k variables (k-bounded epistasis)
• We will also assume that each variable is an argument of at most c subfunctions
• Example (m = 4, n = 4, k = 2): see the decomposition below
• Is this set of functions too small? Is it interesting?
  • MAX-kSAT is a k-bounded pseudo-Boolean optimization problem
  • NK-landscapes are (K+1)-bounded pseudo-Boolean optimization problems
  • Any compressible pseudo-Boolean function can be reduced to a quadratic pseudo-Boolean function (e.g., Rosenberg, 1975)

The family of k-bounded pseudo-Boolean optimization problems has also been described as embedded landscapes. An embedded landscape [3] with bounded epistasis k is defined as a function f(x) that can be written as the sum of m subfunctions, each one depending on at most k input variables. That is:

  f(x) = Σ_{i=1}^{m} f^(i)(x),   (1)

where the subfunctions f^(i) depend only on k components of x. Embedded landscapes generalize NK-landscapes and the MAX-kSAT problem. We will consider in this paper that the number of subfunctions is linear in n, that is, m ∈ O(n). For NK-landscapes m = n, and it is a common assumption in MAX-kSAT that m ∈ O(n).
3. SCORES IN THE HAMMING BALL
For v, x ∈ B^n and a pseudo-Boolean function f : B^n → R, we denote the Score of x with respect to move v as S_v(x), defined as follows:¹

  S_v(x) = f(x ⊕ v) − f(x),   (2)

¹ We omit the function f in S_v(x) to simplify the notation.
  f(x) = f^(1)(x) + f^(2)(x) + f^(3)(x) + f^(4)(x),  over variables x1, x2, x3, x4
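The decomposition above can be made concrete. A minimal evaluation sketch, assuming a hypothetical representation in which each subfunction is a pair (variable indices, lookup table); the tables below are arbitrary illustrative values, not from the paper:

```python
# A k-bounded pseudo-Boolean function as a sum of subfunctions.
# Each subfunction is (mask, table): mask lists the variable indices it
# reads, table maps their joint assignment (encoded as an integer) to a value.

def evaluate(subfunctions, x):
    """f(x) = sum_i f^(i)(x); x is a list of 0/1 values."""
    total = 0.0
    for mask, table in subfunctions:
        key = 0
        for j, var in enumerate(mask):
            key |= x[var] << j          # pack the read bits into a table index
        total += table[key]
    return total

# Example with m = 4, n = 4, k = 2 as on the slide (0-indexed variables):
subs = [
    ((0, 1), [0.0, 1.0, 2.0, 3.0]),  # f^(1)(x1, x2)
    ((1, 2), [1.0, 0.0, 0.0, 1.0]),  # f^(2)(x2, x3)
    ((2, 3), [0.5, 0.5, 0.0, 1.0]),  # f^(3)(x3, x4)
    ((3,),   [0.0, 2.0]),            # f^(4)(x4)
]
print(evaluate(subs, [1, 0, 1, 1]))  # 1.0 + 0.0 + 1.0 + 2.0 = 4.0
```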
Scores: definition
• Let us represent a potential move from the current solution as a binary vector v with 1s in the positions that should be flipped
• The score of move v for solution x is the difference between the fitness of the neighboring solution and that of the current one
• Scores are useful to identify improving moves: if S_v(x) > 0, then v is an improving move
• We keep all the scores in a score vector

Current solution, x  | Neighboring solution, y | Move, v
01110101010101001    | 01111011010101001       | 00001110000000000
01110101010101001    | 00110101110101111       | 01000000100000110
01110101010101001    | 01000101010101001       | 00110000000000000
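The score definition can be sketched directly from Eq. (2); `score` and the toy fitness below are our illustrative names, not the paper's code:

```python
def score(f, x, v):
    """S_v(x) = f(x XOR v) - f(x): fitness change from flipping the 1-bits of v."""
    y = [xi ^ vi for xi, vi in zip(x, v)]   # neighboring solution y = x XOR v
    return f(y) - f(x)

# Toy fitness (hypothetical stand-in): count of ones.
f = lambda x: sum(x)
x = [0, 1, 1, 1, 0]
v = [1, 0, 0, 0, 1]        # flip positions 0 and 4
print(score(f, x, v))      # 2: both flips turn a 0 into a 1, an improving move
```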
Scores update
Main idea | Decomposition of scores | Constant time update
• The key idea of our proposal is to compute the scores from scratch once at the beginning and then update their values as the solution moves, which is less expensive
[Figure: within the radius-r ball, a selected improving move triggers an update of the score vector]
Key facts for efficient score updates
• The key idea of our proposal is to compute the scores from scratch once at the beginning and then update their values as the solution moves, which is less expensive
• How can we make the updates less expensive? We still have O(n^r) scores to update!
• …thanks to two key facts:
  • We don't need all the O(n^r) scores to know whether there is an improving move
  • Of the ones we do need, only a constant number must be updated after a move, and each update takes constant time
Decomposition rule for scores
• When can we decompose a score as the sum of lower-order scores?
• …when the variables in the move can be partitioned into subsets of variables that DON'T interact
• Let us define the Variable Interaction Graph (VIG): there is an edge between two variables if there exists a subfunction that depends on both of them (they "interact")
[Figure: subfunctions f^(1), f^(2), f^(3), f^(4) over x1…x4, and the resulting VIG on {x1, x2, x3, x4}]
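The VIG construction can be sketched as follows; the helper `build_vig` and the 0-indexed example structure (f^(1) on x1,x2; f^(2) on x2,x3; f^(3) on x3,x4; f^(4) on x4) are our assumptions, chosen to be consistent with the slide's example:

```python
from itertools import combinations

def build_vig(subfunction_vars, n):
    """Variable Interaction Graph: edge (i, j) iff some subfunction reads both."""
    adj = {i: set() for i in range(n)}
    for variables in subfunction_vars:
        for i, j in combinations(variables, 2):
            adj[i].add(j)
            adj[j].add(i)
    return adj

# Hypothetical slide example (0-indexed): f^(1)(x1,x2), f^(2)(x2,x3),
# f^(3)(x3,x4), f^(4)(x4) yields the path x1 - x2 - x3 - x4.
vig = build_vig([(0, 1), (1, 2), (2, 3), (3,)], 4)
print(vig)  # x1 and x4 end up non-adjacent: they do not interact
```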
Scores to store
• In terms of the VIG, a score can be decomposed if the subgraph induced by the variables in the move is NOT connected
• The number of scores to store (up to radius r) is O((3kc)^r n); details of the proof are in the paper
• With a linear amount of information we can explore a ball of radius r containing O(n^r) solutions

For the running example, writing e_{1,2} for the move flipping x1 and x2:

  S_2(x) = f^(1)(x ⊕ e_2) − f^(1)(x) + f^(2)(x ⊕ e_2) − f^(2)(x) + …
  S_{1,2}(x) = f^(1)(x ⊕ e_{1,2}) − f^(1)(x) + f^(2)(x ⊕ e_{1,2}) − f^(2)(x) + …
  S_{1,2}(x) ≠ S_1(x) + S_2(x)

  S_4(x) = f(x ⊕ e_4) − f(x)
  S_{1,4}(x) = f(x ⊕ e_{1,4}) − f(x)
  S_{1,4}(x) = S_1(x) + S_4(x)

We need to store the scores of moves whose variables form a connected subgraph of the VIG.
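The storage criterion — keep a move's score iff its variables induce a connected subgraph of the VIG — can be sketched with a DFS restricted to the move's variables (`must_store` is our hypothetical name):

```python
def must_store(move_vars, vig):
    """Store the score of a move iff its variables induce a connected subgraph of the VIG."""
    move_vars = set(move_vars)
    start = next(iter(move_vars))
    seen, stack = {start}, [start]
    while stack:                           # DFS, never leaving the move's variables
        u = stack.pop()
        for w in vig[u] & move_vars:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == move_vars               # connected iff DFS reached every variable

# VIG of the slide example (hypothetical, 0-indexed): path x1 - x2 - x3 - x4
vig = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(must_store({0, 1}, vig))  # True:  S_{1,2} cannot be decomposed
print(must_store({0, 3}, vig))  # False: S_{1,4} = S_1 + S_4, no need to store it
```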
Scores to update
• Let us assume that x4 is flipped
• Which scores do we need to update?
• Those that need to evaluate f^(3) and f^(4)
[Figure: subfunctions f^(1), f^(2), f^(3), f^(4) over x1…x4, with f^(3) and f^(4) highlighted]
• That is, the scores of moves containing variables adjacent or equal to x4 in the VIG
Scores to update and time required
• The number of neighbors of a variable in the VIG is bounded by ck
• The number of stored scores in which a variable appears is the number of spanning trees of size at most r with the variable at the root, and this number is constant (independent of n)
• Updating each score implies evaluating a constant number of subfunctions, each depending on at most k variables, so it requires constant time
• Total cost of updating the scores after move v: O(b(k) (3kc)^r |v|), where b(k) is a bound on the time to evaluate any subfunction
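Identifying which stored scores must be touched after a flip can be sketched as follows. Since a variable has at most ck VIG neighbours and appears in a constant number of stored moves, the returned set has constant size; the names below are ours, not the paper's:

```python
def affected_moves(flipped, stored_moves, vig):
    """Moves whose score must be updated after flipping `flipped`: those
    containing a variable adjacent (or equal) to it in the VIG."""
    region = {flipped} | vig[flipped]            # the flipped variable and its neighbours
    return [m for m in stored_moves if set(m) & region]

# Hypothetical slide example (0-indexed): path VIG x1 - x2 - x3 - x4,
# stored moves = all connected subgraphs of size <= 2.
vig = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
stored = [(0,), (1,), (2,), (3,), (0, 1), (1, 2), (2, 3)]
print(affected_moves(3, stored, vig))  # only the moves touching the flipped
                                       # variable or its single VIG neighbour
```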
Definition
NKq-landscapes | Sanity check | Random model | Next improvement
• An NK-landscape is a pseudo-Boolean optimization problem with objective function
  f(x) = Σ_{l=1}^{N} f^(l)(x),
  where each subfunction f^(l) depends on variable x_l and K other variables
• In the random model these K other variables are chosen at random; in the adjacent model they are consecutive
• There is a polynomial-time algorithm to solve the adjacent model (Wright et al., 2000)
• The subfunctions are randomly generated, with values taken in the range [0, 1]
• In NKq-landscapes the subfunctions take integer values in the range [0, q−1]
• Why NKq and not NK? Floating-point precision
• We use NKq-landscapes in the experiments
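A generator for NKq-landscape instances might look like the following sketch (our illustrative code, reusing the lookup-table representation assumed earlier; `random_nkq` is not the paper's generator):

```python
import random

def random_nkq(n, k, q, adjacent=True, seed=0):
    """NKq-landscape sketch: n subfunctions, each reading x_l and K = k other
    variables, with integer values drawn uniformly from {0, ..., q-1}."""
    rng = random.Random(seed)
    subs = []
    for l in range(n):
        if adjacent:
            mask = tuple((l + d) % n for d in range(k + 1))      # consecutive variables
        else:
            others = rng.sample([i for i in range(n) if i != l], k)
            mask = (l, *sorted(others))                           # random variables
        table = [rng.randrange(q) for _ in range(2 ** (k + 1))]   # one value per assignment
        subs.append((mask, table))
    return subs

subs = random_nkq(n=8, k=2, q=5)
print(subs[0][0])  # (0, 1, 2): the adjacent model reads consecutive variables
```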
Results: checking the constant time
• Sanity check: flip every variable the same number of times (120,000) and measure the time and memory required by the score updates
• Setup: NKq-landscapes, adjacent model; N = 1,000 to 12,000; K = 1 to 4; q = 2K+1; r = 1 to 4; 30 instances per configuration
[Figure: time (s) of the score updates vs. N, for K = 3 and r = 1 to 4]
[Figure: number of scores stored in memory vs. N, same setting (adjacent model, K = 3, r = 1 to 4)]
Results: checking the time in the random model
• Random model: the number of subfunctions in which a variable appears, c, is not bounded by a constant
• Setup: NKq-landscapes, random model; N = 1,000 to 12,000; K = 1 to 4; q = 2K+1; r = 1 to 4; 30 instances per configuration
[Figure: time (s) of the score updates vs. n, for K = 3 and r = 1 to 3]
[Figure: number of scores stored in memory vs. N, same setting (random model, K = 3, r = 1 to 3)]
Next improvement algorithm
• Nearest moves are selected first (e.g., all r = 1 moves before r = 2 moves)

If an improving move exists (S_v > 0 for some v ∈ M_r), the algorithm selects one of the improving moves t (line 6), updates the Scores using Algorithm 1 (line 7) and replaces the current solution with the new one (line 8).

Algorithm 3: Hamming-ball next ascent.
 1: best ← ⊥
 2: while stop condition not met do
 3:   x ← randomSolution()
 4:   S ← computeScores(x)
 5:   while S_v > 0 for some v ∈ M_r do
 6:     t ← selectImprovingMove(S)
 7:     updateScores(S, x, t)
 8:     x ← x ⊕ t
 9:   end while
10:   if best = ⊥ or f(x) > f(best) then
11:     best ← x
12:   end if
13: end while

Regarding the selection of the improving move, our approach in the experiments was to always select the one with the lowest Hamming distance to the current solution…
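Algorithm 3 can be sketched in Python. For clarity, this version recomputes fitness naively instead of using the constant-time score updates of Algorithm 1, so it is far slower per step, but it follows the same control flow; all names are ours:

```python
from itertools import combinations
import random

def hamming_ball_next_ascent(f, n, r, restarts=5, seed=0):
    """Sketch of Algorithm 3: repeated next ascent over the radius-r Hamming ball.
    Moves are tried in order of increasing Hamming distance (nearest first)."""
    rng = random.Random(seed)
    moves = [c for i in range(1, r + 1) for c in combinations(range(n), i)]
    best = None
    for _ in range(restarts):                      # "stop condition": restart budget
        x = [rng.randrange(2) for _ in range(n)]   # randomSolution()
        improved = True
        while improved:
            improved = False
            for move in moves:
                y = x[:]
                for i in move:
                    y[i] ^= 1                      # x XOR t
                if f(y) > f(x):                    # S_v(x) > 0: improving move
                    x, improved = y, True
                    break
            # the inner loop exits when no move in the ball improves x
        if best is None or f(x) > f(best):
            best = x[:]
    return best

# Toy maximization: ONEMAX (hypothetical stand-in for an NKq-landscape)
best = hamming_ball_next_ascent(lambda x: sum(x), n=10, r=2)
print(sum(best))  # 10: next ascent always reaches the all-ones optimum here
```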
Results for next improvement
The normalized distance to the optimum, nd, is:

  nd(x) = (f* − f(x)) / f*,   (11)

where f* is the fitness value of the global optimum, computed using the algorithm by Wright et al. [10].
Figure 7: Normalized distance to the global optimum for the Hamming-ball next ascent.
• Setup: NKq-landscapes, adjacent model; N = 10,000; K = 1; q = 2K+1; r = 1 to 10; 30 instances
• From r = 6 to r = 10 the global optimum is always found
• With r = 10 the global optimum is always found in the first descent
• With r = 7 the global optimum is always found in less than 2.1 s
Conclusions and Future Work
Conclusions
• We can identify improving moves in a ball of radius r around a solution in constant time (independent of n)
• The space required to store the information (the scores) is linear in the problem size n
• This information can be used to design efficient search algorithms
Future Work
• Random restarts are costly; study the applicability of soft restarts
• Apply the method to other pseudo-Boolean problems like MAX-kSAT
• Include clever strategies to escape from local optima
Acknowledgements