P-Systems for approximating NP-Complete optimization problems
1. Francesco Corucci – Bioinformatics course (Prof. R. Barbuti)
Percorso di Eccellenza, Laurea Magistrale in Ingegneria Informatica – Università di Pisa
From a work of Taishin Yasunobu Nishida (nishida@pu-toyama.ac.jp)
Faculty of Engineering, Toyama Prefectural University
2. NP-Complete problems
Problems for which no polynomial-time algorithm is known
Many examples of such problems exist, very often related to
practical applications (logistics, computer science,
biology, etc.)
A common approach consists in addressing these
problems with sub-optimal approximation algorithms
that run in polynomial time
P-systems can be useful in this context
Bioinformatics course - Laurea Magistrale in Ingegneria Informatica, Percorso di Eccellenza
3. Outline of membrane algorithms
4. Components of membrane algorithms
A membrane algorithm for approximating
an optimization problem consists of:
1. a certain number of regions, outlined
by nested membranes (labeled Mi)
2. in every region, a subalgorithm (si)
and a few tentative solutions
3. a solution transporting mechanism
between adjacent regions
[Figure: nested membranes Mm-1, …, M1, M0; each region Mi holds a subalgorithm si and its tentative solutions]
5. Membrane algorithm
A step of the membrane algorithm acts as
follows:
in every region, simultaneously, tentative
solutions are updated by the subalgorithm
placed in the same region
solutions transport mechanism: in every
region, the best solution (with respect to the
optimization criterion) is sent to the
adjacent inner region, the worst is sent to
the adjacent outer one
[Figure: solution transport between regions Mm-1, …, M1, M0: the best solution (B) moves to the adjacent inner region, the worst (W) to the adjacent outer one; M1 also receives the best from M2]
6. Membrane algorithm
The membrane algorithm repeats updating and transporting
solutions until a termination condition is satisfied.
Possible termination conditions are:
A maximum number of iterations is reached
The best solution has not changed for a predetermined
number of steps
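The two termination conditions can be combined in a single driver loop. The following is a minimal sketch, not taken from the slides: `run_until_done`, `step`, `max_iterations` and `patience` are hypothetical names, and `step` is assumed to perform one update-and-transport step and return the value of the current best solution.

```python
def run_until_done(step, max_iterations, patience):
    """Call step() until one of the two termination conditions holds.

    Returns (best value seen, iterations performed, reason for stopping).
    """
    best = float("inf")
    stale = 0  # consecutive steps without improvement
    for iteration in range(1, max_iterations + 1):
        value = step()
        if value < best:
            best, stale = value, 0
        else:
            stale += 1
        if stale >= patience:
            return best, iteration, "no improvement"
    return best, max_iterations, "iteration limit"


# Toy usage: the "algorithm" just replays a fixed list of best values.
values = iter([5, 4, 4, 4, 4, 3])
result = run_until_done(lambda: next(values), max_iterations=100, patience=3)
```

Whichever condition fires first ends the run, so the iteration limit acts as a safety net when the best solution keeps improving slowly.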
7. Output definition
The innermost membrane (M0) is defined as the output
membrane of the algorithm
Its content at the end of the execution is the approximate
solution to the optimization problem
8. Subalgorithms
A membrane algorithm can use a number of different
subalgorithms
A subalgorithm can be any approximate algorithm for
optimization problems. Examples are:
Genetic algorithms
Tabu search
Simulated annealing
…
9. Escaping from local minima
The membrane algorithm should be able to escape from local
minima (we are searching for a global one!)
For this reason, subalgorithms placed in the outer regions should
enhance random search (e.g. with random mutations)
In the innermost membrane, a subalgorithm enhancing the local
search should be used instead (e.g. search for neighboring), in
order to refine the good solutions selected
Assigning appropriate subalgorithms for a given problem is critical in order to
obtain good performance from the membrane algorithm
10. Consideration about parallelism
At every step, the subalgorithm execution in each region is
independent from the others
Very simple communication occurs at the end of the step,
between adjacent regions
The membrane algorithm could be easily implemented in
parallel, distributed, or grid computing systems
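Because the regions do not interact during the update, the update phase can be dispatched to workers and joined before the transport. The sketch below only illustrates that structure with Python's standard `concurrent.futures`; `update_all` and the toy subalgorithm are hypothetical. Threads are used for brevity: in CPython they do not speed up CPU-bound work, so a real deployment would more likely use processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def update_all(regions, subalgorithms, max_workers=4):
    """Apply subalgorithms[i] to regions[i] for every i, concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(sub, sols)
                   for sub, sols in zip(subalgorithms, regions)]
        # Barrier: every region must finish before transport can start.
        return [future.result() for future in futures]


def toy_subalgorithm(solutions):
    """Placeholder update: 'improves' each numeric solution by 1."""
    return [s - 1 for s in solutions]


regions = [[3], [5, 1], [4, 2]]  # one solution in M0, two elsewhere
updated = update_all(regions, [toy_subalgorithm] * len(regions))
```

The only synchronization point is the end of the step, which matches the simple communication pattern described above.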
11. A practical example:
The Traveling Salesman Problem
12. Traveling Salesman Problem
The problem: given a list of cities and their pairwise
distances, the task is to find the shortest route that visits
each city exactly once and returns to the starting city
The TSP has several practical applications:
planning, logistics
microchip manufacturing (“cities” are soldering points, the path
is the electronic track)
It has been shown that the TSP is an NP-complete problem
13. Details of the algorithm
Let m be the number of membranes (M0 is the innermost)
An instance of the TSP with n nodes consists of n pairs of real
numbers $v_i = (x_i, y_i)$, $i = 0, 1, \dots, n-1$ (n points in a
two-dimensional space)
Distance (Euclidean): $d(v_i, v_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
A solution $v = (v_0, v_1, \dots, v_{n-1})$ (order of visit) has value
(cost) $W(v) = \sum_{i=0}^{n-2} d(v_i, v_{i+1}) + d(v_{n-1}, v_0)$
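As an illustration of the two definitions above, the distance and the tour value can be computed as follows. This is a minimal sketch: the function names and the four-point instance are made up for the example, and a solution is represented as a list of point indices.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def tour_cost(points, order):
    """W(v): length of the closed tour that visits points in this order."""
    n = len(order)
    # The modulo closes the tour: the last city connects back to the first.
    return sum(distance(points[order[i]], points[order[(i + 1) % n]])
               for i in range(n))


# Hypothetical instance: the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
perimeter = tour_cost(square, [0, 1, 2, 3])  # walks along the sides
crossed = tour_cost(square, [0, 2, 1, 3])    # uses both diagonals
```

Here the tour along the sides is better than the one using the diagonals, since its value is smaller.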
14. Details of the algorithm
Obviously, for two solutions u and
v, v is better than u if W(v) < W(u)
The solution which has the minimum
value among all possible solutions
is said to be the strict solution of a
TSP instance
One tentative solution in M0 and
two in all other regions
15. Innermost region subalgorithm
We set tabu search as subalgorithm in the innermost region (M0):
this algorithm searches a neighbor of the tentative solution by
exchanging two adjacent nodes in the solution (local search, for
refining)
Tabu search replaces the tentative solution if:
1. The value of the neighboring solution is less than that of the tentative
solution (the former becomes the new tentative solution)
2. The value of the best solution in region M1 is less than that of the
tentative solution (the former becomes the new tentative solution)
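The replacement rule for region M0 can be sketched as below. This is a simplified illustration rather than the original implementation: the slides do not describe the tabu list, so it is omitted here, and `innermost_step`, `cost` and the toy cost function are hypothetical.

```python
import random


def innermost_step(current, best_from_m1, cost, rng):
    """One update of the single tentative solution held in M0."""
    n = len(current)
    i = rng.randrange(n)
    j = (i + 1) % n  # adjacent position (the tour is cyclic)
    neighbor = list(current)
    neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
    candidates = [current, neighbor]
    if best_from_m1 is not None:
        candidates.append(best_from_m1)
    # min() keeps the first of equally good candidates, so the current
    # solution is replaced only by a strictly better one.
    return min(candidates, key=cost)


def toy_cost(tour):
    """Stand-in for W(v): cities lie on a line at their index."""
    return sum(abs(a - b) for a, b in zip(tour, tour[1:] + tour[:1]))


rng = random.Random(0)
refined = innermost_step([0, 2, 1, 3], [0, 1, 2, 3], toy_cost, rng)
```

Swapping only adjacent cities keeps each move small, which suits the refining role assigned to the innermost region.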
16. Outer regions subalgorithm
The chosen subalgorithm for the outer regions (inspired by genetic
algorithms) is described as follows:
1. If the two solutions have the same value, a part of one solution
(selected probabilistically) is reversed (→ avoids duplicates);
2. Recombine the two solutions, producing two new solutions
(crossover: several methods are possible, EXX was used);
3. Modify the two solutions by point mutations (in the ith region, a
mutation occurs with probability $i/m$)
→ enhances random search, as requested
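The three steps above can be sketched as follows. Several details are assumptions made for the example: order crossover (OX) is used in place of EXX, a point mutation is taken to be a swap of two cities, and the probability i/m is applied once per solution. All function names are hypothetical.

```python
import random


def order_crossover(p1, p2, rng):
    """OX: copy a slice of p1, fill the other positions in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(child[a:b + 1])
    fill = [city for city in p2 if city not in kept]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child


def outer_step(s1, s2, cost, i, m, rng):
    """Update the two tentative solutions of region i (1 <= i <= m-1)."""
    s1, s2 = list(s1), list(s2)
    # Step 1: same value, so reverse a random part of one solution.
    if cost(s1) == cost(s2):
        a, b = sorted(rng.sample(range(len(s1)), 2))
        s1[a:b + 1] = reversed(s1[a:b + 1])
    # Step 2: recombine into two new solutions (OX stands in for EXX).
    children = [order_crossover(s1, s2, rng), order_crossover(s2, s1, rng)]
    # Step 3: point mutation with probability i/m (assumed: a swap).
    for child in children:
        if rng.random() < i / m:
            x, y = rng.sample(range(len(child)), 2)
            child[x], child[y] = child[y], child[x]
    return children


def toy_cost(tour):
    """Stand-in for W(v): cities lie on a line at their index."""
    return sum(abs(a - b) for a, b in zip(tour, tour[1:] + tour[:1]))


rng = random.Random(0)
new1, new2 = outer_step(list(range(6)), list(range(6)), toy_cost, 3, 4, rng)
```

Since the mutation probability grows with i, the outermost regions are perturbed most often, in line with their exploratory role.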
17. Overall algorithm
1. Consider an instance of the TSP
2. Randomly construct one tentative solution for region 0 and two tentative
solutions for every region from 1 to m-1
3. For k = 0 … d (‘d’ is a parameter)
1. Update: simultaneously update tentative solutions in each region using the
associated subalgorithm
2. Transport: for each region ‘i’, send the best solution to region i-1 (inner),
and the worst to region i+1 (outer) (region 0 and m-1 can move only one
solution). Remove all solutions but the best two.
4. Output the tentative solution in region 0 as the output of the algorithm
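The loop above can be sketched end to end as follows. To stay short, the subalgorithms are simplified stand-ins (an adjacent swap in region 0, a segment reversal plus an optional swap elsewhere) rather than the tabu search and crossover used in the original work. It is also assumed that transport sends copies and that region 0 keeps only its single best solution; all names are hypothetical.

```python
import math
import random


def tour_cost(points, order):
    n = len(order)
    return sum(math.hypot(points[order[k]][0] - points[order[(k + 1) % n]][0],
                          points[order[k]][1] - points[order[(k + 1) % n]][1])
               for k in range(n))


def local_step(tour, cost, rng):
    """Region 0 stand-in: swap two adjacent cities, keep only if better."""
    k = rng.randrange(len(tour))
    j = (k + 1) % len(tour)
    cand = list(tour)
    cand[k], cand[j] = cand[j], cand[k]
    return cand if cost(cand) < cost(tour) else tour


def random_step(tour, prob, rng):
    """Outer stand-in: reverse a segment, then swap with probability prob."""
    a, b = sorted(rng.sample(range(len(tour)), 2))
    cand = tour[:a] + tour[a:b + 1][::-1] + tour[b + 1:]
    if rng.random() < prob:
        x, y = rng.sample(range(len(cand)), 2)
        cand[x], cand[y] = cand[y], cand[x]
    return cand


def membrane_algorithm(points, m, d, seed=0):
    rng = random.Random(seed)
    n = len(points)

    def cost(tour):
        return tour_cost(points, tour)

    def new_tour():
        return rng.sample(range(n), n)

    # Step 2: one random solution in region 0, two in every other region.
    regions = [[new_tour()]] + [[new_tour(), new_tour()] for _ in range(m - 1)]
    for _ in range(d):  # d update/transport steps
        # Update: each region uses only its own solutions.
        regions[0] = [local_step(regions[0][0], cost, rng)]
        for i in range(1, m):
            regions[i] = [random_step(t, i / m, rng) for t in regions[i]]
        # Transport: all moves are decided from the post-update state.
        incoming = [[] for _ in range(m)]
        for i, sols in enumerate(regions):
            ranked = sorted(sols, key=cost)
            if i > 0:
                incoming[i - 1].append(ranked[0])   # best goes inward
            if i < m - 1:
                incoming[i + 1].append(ranked[-1])  # worst goes outward
        for i in range(m):
            keep = 1 if i == 0 else 2
            regions[i] = sorted(regions[i] + incoming[i], key=cost)[:keep]
    return regions[0][0], cost(regions[0][0])


gen = random.Random(1)
points = [(gen.random(), gen.random()) for _ in range(12)]  # made-up instance
tour, value = membrane_algorithm(points, m=4, d=200)
```

In this sketch the solution in region 0 never gets worse from one step to the next, because it is replaced only by a strictly better neighbor or by a better solution arriving from region 1.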
19. Experimental results
The algorithm was implemented
in the Java programming language
and tested on a computer
The figure shows an example execution
20. Comparison with simulated annealing
Comparing the membrane algorithm with simulated annealing (SA), a probabilistic algorithm
often used for TSP solving:
Test 1 (eil51): 51 nodes, 40,000 iterations, 20 executions, variable number of
membranes (from 2 to 70). Strict value: 426
Test 2 (kroA100): 100 nodes, 100,000 iterations, 20 executions. Strict value: 21,282
21. Comparison with simulated annealing
The membrane algorithm worked slightly better
than SA in the first test and slightly worse in the
second one
Since the differences are very small, we may
conclude that the membrane algorithm performs about as well as
simulated annealing on these instances
22. Saturation effect
With more membranes we get a better
approximation…
…however, experimental results seem to point out
that in some cases the improvement achieved with
more membranes tends to saturate
Since the computation time is proportional to the
number of membranes, we need a trade-off
23. Fast convergence
The figure shows how the average value of the solutions for the
kroA100 problem, solved by a membrane algorithm with 50
membranes, changes as the number of iterations increases
The algorithm converges rather quickly to good solutions,
in about 2000-3000 iterations
25. Improved membrane algorithms
Is it possible to improve the performance of the
membrane algorithm by incorporating the concepts of a
tissue P-system (→ compound approach) or of P-systems
with dynamic membrane structure (→ shrink approach)?
26. Compound membrane algorithm
Tissue based, two phases:
1. First phase: a certain number
of membrane algorithms
produce good solutions from
randomly generated initial
tentative solutions
2. Second phase: the good
solutions produced by the
first phase are used as initial
ones for the second phase
27. Compound membrane algorithm
Set up:
100 membrane algorithms in the first phase
Every algorithm uses 50 membranes
Each algorithm in the first phase terminates if the best solution does not improve
for 500 iterations
The membrane algorithm in the second phase terminates if the best solution
does not improve in 5000 iterations
28. Compound membrane algorithm
The table shows the results of
experimental tests:
We can see that the compound
membrane algorithm has significantly
improved performances compared to
previous approaches (it always
outputs almost strict solutions)
The computation time of the compound membrane algorithm was, as expected, much longer than
that of the simple algorithm on a common computer
However, because the executions of the membrane algorithms in the first phase are
completely independent, they could be easily parallelized on a distributed architecture,
so that the computation time would be only about twice that of the simple algorithm
Strict values: 426 (eil51), 21,282 (kroA100)
29. Compound membrane algorithm
Possible explanations for the good performance of the
compound approach:
1. Large number of random initial solutions
2. The first phase selects «good seeds»
3. The second phase generates very good solutions by recombining the good seeds
obtained in the first phase
30. Shrink membrane algorithm
Based on dynamic membrane structure, it consists of three phases
1. Phase 1: a certain number of algorithms (with five membranes
and Genetic Algorithm based subalgorithms in all regions) are
executed. After termination condition…
2. Phase 2: «shrink» the systems to two membranes and refine with
tabu search in region 0 and GA type subalgorithms in region 1
3. Phase 3: pass the good seeds selected in the previous phases to a
second stage, like in the compound approach
31. Shrink membrane algorithm
We can see from the table
above that the shrink algorithm
outperforms the compound one, while also being
significantly faster
32. Conclusions
The experimental results support the effectiveness of the membrane
approach for approximating NP-complete problems
We saw how the performance can be improved by considering some
variants of P-systems (tissue based and with dynamic membranes)
There are many possibilities for further research:
Using different subalgorithms
Using different dynamic structures
Using different terminating conditions
Introducing further P-systems ingredients