The manuscript I wrote about my new crossover operator for Genetic Algorithms has been accepted as a full paper for the Genetic and Evolutionary Computation Conference 2017 (GECCO'17 - http://gecco-2017.sigevo.org/index.html/HomePage).
Here is the presentation I used for my GECCO'17 conference talk, held in Berlin on 19 July 2017.
Check out my paper in the conference proceedings on the ACM Digital Library: https://doi.org/10.1145/3071178.3071240
Check out an excerpt from my conference talk: https://www.youtube.com/watch?v=JOZuTsAHaAs
C. Contaldi, F. Vafaee, P. C. Nelson - GECCO'17 Conference Talk
1. The Role of Crossover Operator in Bayesian Network Structure Learning Performance: a Comprehensive Comparative Study and New Insights
Carlo Contaldi, University of Illinois at Chicago
Fatemeh Vafaee, University of New South Wales
Peter C. Nelson, University of Illinois at Chicago
2. Bayesian Network Structure Learning: one path of knowledge, many purposes
BNs represent causal relationships underlying phenomena
Unsupervised structure learning → knowledge discovery
• Medical diagnosis [1],
• Pathway modeling [2],
• Environmental modeling and management,
• Information retrieval, …
[1] C. E. Kahn Jr et al. 1997. Construction of a Bayesian network for mammographic diagnosis of breast cancer. Computers Biol. Med. 27, 1
[2] S. M. Hill et al. 2012. Bayesian inference of signaling network topology in a cancer cell line. Bioinformatics. 28, 21
[Figure: ASIA BN, with nodes L, S, T, E, D, B, X, A]
3. Bayesian Network Structure Learning using Genetic Algorithms
NP-hard: the number of structures grows super-exponentially with the number of nodes
Two heuristic approaches:
• Constraint-Based (CB) – apply Conditional Independence tests
• Search & Score (S&S) – search driven by a scoring function 𝑓(𝑥)
In Genetic Algorithms (GAs) a population of individuals evolves through generations
• Mutation
• Crossover
• Selection
A Hybrid method takes the best of both worlds
[Figure: ASIA SS, showing the ASIA nodes L, S, T, E, D, B, X, A next to a 10-position string 1 to 10]
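To give a feel for the super-exponential growth mentioned on this slide, the number of labeled directed acyclic graphs on n nodes can be computed with Robinson's recurrence. The sketch below is only an illustration and is not taken from the talk; the function name `count_dags` is made up for this example.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes, via Robinson's recurrence.

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Print the count for a few small networks, up to ASIA's 8 nodes.
    for n in (1, 2, 3, 4, 5, 8):
        print(n, count_dags(n))
```

Three nodes already admit 25 structures and four nodes 543, which is why exhaustive search is replaced by heuristics such as the CB and S&S approaches listed above.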
4. The Crossover Operator
Crossover block: the minimum element of information extracted from an individual and used to constitute the offspring
1. Extract crossover blocks from 2+ individuals
2. Use them to generate offspring
[Figure: Single-Point Crossover and Uniform Crossover, each shown on 10-position parents P and Q producing offspring O; ASIA nodes L, S, T, E, D, B, X, A]
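The two operators pictured on this slide can be sketched on plain lists, where each position is one crossover block. This is a minimal textbook-style illustration, not the code used in the paper; the function names and the fair-coin choice in the uniform variant are assumptions of this example.

```python
import random


def single_point_crossover(p, q, rng):
    """Offspring takes a prefix from p and the remaining suffix from q."""
    cut = rng.randrange(1, len(p))  # cut strictly inside the string
    return p[:cut] + q[cut:]


def uniform_crossover(p, q, rng):
    """Offspring takes each position independently from p or q (fair coin, assumed)."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p, q)]


if __name__ == "__main__":
    rng = random.Random(42)
    parent_p = ["P"] * 10  # label every gene with its origin to make inheritance visible
    parent_q = ["Q"] * 10
    print("single-point:", "".join(single_point_crossover(parent_p, parent_q, rng)))
    print("uniform:     ", "".join(uniform_crossover(parent_p, parent_q, rng)))
```

In both cases the crossover block is a single position of the string; the operators differ only in how blocks from P and Q are interleaved.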
5. The “Appropriate” Crossover Operator
“In general, […] no useful recipe for the choice of a crossover operator can be given a priori.” [3]
…still, you can satisfy the evaluator’s tastes.
[Figure: Shuffle Crossover (parents P and Q in shuffled order 3 5 1 4 9 2 10 6 8 7, offspring O in order 1 to 10) and Non-Geometric Crossover (offspring O showing positions 1, 3 to 8 and 10, with positions 2 and 9 drawn apart); ASIA nodes L, S, T, E, D, B, X, A]
[3] F. Vafaee. 2010. Controlling Genetic Operator Rates in Evolutionary Algorithms. Ph.D. Dissertation. University of Illinois at Chicago.
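As an illustration of the Shuffle Crossover pictured on this slide, the sketch below follows the usual textbook description: shuffle both parents with the same permutation, apply a single-point cut, then restore the original order. It is an assumption that the talk uses exactly this variant; the function name is made up for this example.

```python
import random


def shuffle_crossover(p, q, rng):
    """Single-point crossover applied to identically shuffled parents."""
    n = len(p)
    order = list(range(n))
    rng.shuffle(order)                      # one permutation shared by both parents
    shuffled_p = [p[i] for i in order]
    shuffled_q = [q[i] for i in order]
    cut = rng.randrange(1, n)               # cut strictly inside the string
    shuffled_child = shuffled_p[:cut] + shuffled_q[cut:]
    child = [None] * n
    for pos, original_index in enumerate(order):
        child[original_index] = shuffled_child[pos]  # undo the shuffle
    return child


if __name__ == "__main__":
    rng = random.Random(7)
    parent_p = [f"P{i}" for i in range(1, 11)]
    parent_q = [f"Q{i}" for i in range(1, 11)]
    print(shuffle_crossover(parent_p, parent_q, rng))
```

Every position of the offspring still comes from the same position in P or Q; shuffling only removes the positional bias of the single cut point.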
6. The Scoring Function: a knowledge-driver which S&S/Hybrid methods hinge on
Estimates the fitness of a possible solution structure to the data
Task complications: Multimodality, epistasis, little data available
Deceptive scores, not necessarily correlated with performance [4]
Exploiting scoring mechanisms holds back deceptiveness
Generally decomposable [5]
BDeu: sum of independent subscores, one per node
[4] F. Vafaee. 2014. Learning the Structure of Large-scale Bayesian Networks using Genetic Algorithm. In Proceedings of GECCO’14.
[5] A. M. Carvalho. 2009. Scoring functions for learning Bayesian networks. Inesc-id Tec. Rep.
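The decomposability of BDeu noted on this slide can be made concrete with a small sketch: the total score is the sum of one local subscore per node, each depending only on that node and its parent set. The code below uses the standard log-BDeu formula; the data layout (a list of dicts), the function names, and the default equivalent sample size of 1.0 are assumptions of this example, not details from the talk.

```python
from collections import Counter
from math import lgamma, prod


def bdeu_local(data, node, parents, arity, ess=1.0):
    """Log BDeu subscore of one node given its parent set.

    data:    list of dicts mapping variable name -> observed state
    parents: tuple of parent names
    arity:   dict mapping variable name -> number of states
    ess:     equivalent sample size (assumed default)
    """
    r = arity[node]
    q = prod(arity[p] for p in parents)      # number of parent configurations
    n_ij = Counter()                          # counts per parent configuration
    n_ijk = Counter()                         # counts per (configuration, state)
    for row in data:
        j = tuple(row[p] for p in parents)
        n_ij[j] += 1
        n_ijk[(j, row[node])] += 1
    a_j, a_jk = ess / q, ess / (q * r)
    # Unobserved configurations and states contribute exactly zero, so they are skipped.
    score = sum(lgamma(a_j) - lgamma(a_j + n) for n in n_ij.values())
    score += sum(lgamma(a_jk + n) - lgamma(a_jk) for n in n_ijk.values())
    return score


def bdeu(data, structure, arity, ess=1.0):
    """Total score: a sum of independent subscores, one per node."""
    return sum(bdeu_local(data, n, ps, arity, ess) for n, ps in structure.items())


if __name__ == "__main__":
    toy_data = [{"A": 0, "B": 0}, {"A": 1, "B": 1}, {"A": 1, "B": 1}, {"A": 0, "B": 0}]
    toy_arity = {"A": 2, "B": 2}
    print(bdeu(toy_data, {"A": (), "B": ("A",)}, toy_arity))
    print(bdeu(toy_data, {"A": (), "B": ()}, toy_arity))
```

Because each subscore depends on a single parent set, changing the parents of one node leaves every other term of the sum untouched.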
7. The Idea: let the Scoring Function shape the Crossover Blocks
Epistasis disrupting the parent set wastes the evolutionary efforts
Preserving scoring partition unleashes the achievement of successful patterns
[Figure: parents P and Q and offspring O, shown as 10-position strings and as structures over the ASIA nodes L, S, T, E, D, B, X, A]
8. Parent Set Crossover and its driving skills
Exploitation: explore the neighborhood of previously visited points
Walks across the space of parent sets
Crossover is useful when:
• the degree of interactivity is zero [6]
• parts of the evolving individual are quasi-independent [7]
[6] D. B. Fogel and J. W. Atmar. 1990. Comparing genetic operators with Gaussian mutations in simulated evolutionary processes using linear systems. Biol. Cybern. 63, 2
[7] W. Hordijk and B. Manderick. 1995. The usefulness of recombination. In European Conference on Artificial Life.
[Figure: 10-position parents P and Q and offspring O, with node labels A, E, T, X, D, L, B]
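The idea of using whole parent sets as crossover blocks can be sketched as follows. This is a hedged illustration only: the slides do not spell out how blocks are chosen between P and Q or how cycles in the offspring are handled, so the fair-coin choice, the dict-of-frozensets encoding, and the separate acyclicity check are all assumptions of this example rather than the operator from the paper.

```python
import random


def parent_set_crossover(p, q, rng):
    """Offspring inherits, for every node, the entire parent set from p or from q.

    p, q: dict mapping node -> frozenset of that node's parents.
    The per-node coin flip is an assumption of this sketch.
    """
    return {node: (p[node] if rng.random() < 0.5 else q[node]) for node in p}


def is_acyclic(structure):
    """Check that a node -> parent-set mapping describes a DAG."""
    remaining = {node: set(parents) for node, parents in structure.items()}
    while remaining:
        roots = [node for node, parents in remaining.items() if not parents]
        if not roots:
            return False                  # every remaining node waits on another: a cycle
        for node in roots:
            del remaining[node]
        for parents in remaining.values():
            parents.difference_update(roots)
    return True


if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical structures over four node labels, for illustration only.
    P = {"A": frozenset(), "T": frozenset({"A"}), "E": frozenset({"T"}), "X": frozenset({"E"})}
    Q = {"A": frozenset(), "T": frozenset(), "E": frozenset({"A", "T"}), "X": frozenset({"T"})}
    O = parent_set_crossover(P, Q, rng)
    print(O, is_acyclic(O))
```

Since each inherited block is exactly the unit that a decomposable score evaluates, every subscore of the offspring equals the corresponding subscore of one of its parents.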
9. Sixteen Crossover Operators competing in an extensive experimental framework
Embedded in GAs included in a Hybrid scheme
• CB + Standard GA [Vafaee. 2014]
• CB + DiG-SiRGA [Vafaee et al. 2014]
• Compared with non-evolutionary methods: Sparse Candidate (SC), Ordering-Based Search (OBS), Max-Min Hill-Climbing (MMHC)
Synthetic datasets of various sizes sampled from ASIA (8 nodes), INSURANCE (27), ALARM (37), HEPAR-II (70), WIN95PTS (76)
Performance metrics: F1 score, sensitivity, specificity, (Bayesian score)
Wilcoxon signed-rank test to validate results over 20 runs
Default set of parameters
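The classification metrics listed on this slide can be computed by comparing the learned edges with those of the reference network. The sketch below treats every ordered pair of distinct nodes as one classification decision; whether the paper scores directed edges or the undirected skeleton is not stated on the slides, so that choice, along with the function name, is an assumption of this example.

```python
from itertools import permutations


def edge_metrics(nodes, true_edges, learned_edges):
    """F1, sensitivity and specificity over directed edges (assumed unit of comparison)."""
    true_edges, learned_edges = set(true_edges), set(learned_edges)
    all_pairs = set(permutations(nodes, 2))          # every possible directed edge
    tp = len(true_edges & learned_edges)
    fp = len(learned_edges - true_edges)
    fn = len(true_edges - learned_edges)
    tn = len(all_pairs - true_edges - learned_edges)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"f1": f1, "sensitivity": sensitivity, "specificity": specificity}


if __name__ == "__main__":
    # Hypothetical three-node example: one edge recovered, one reversed.
    result = edge_metrics(
        nodes=["A", "B", "C"],
        true_edges=[("A", "B"), ("B", "C")],
        learned_edges=[("A", "B"), ("C", "B")],
    )
    print(result)  # f1 = 0.5, sensitivity = 0.5, specificity = 0.75
```

Under this convention a reversed edge counts as both a false positive and a false negative, which is one reason the choice of comparison unit matters when reading F1 values.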
12. End-of-Execution Results: GA
Method | INSURANCE 50 (F1, Bayes) | ALARM 70 (F1, Bayes) | HEPAR-II 100 (F1, Bayes) | WIN95PTS 100 (F1, Bayes)
Parent Set X | 0.35, -1037 | 0.61, -1021 | 0.18, -3557 | 0.41, -1233
Two-Point X | 0.26, -1070 | 0.49, -1029 | 0.14, -3552 | 0.36, -1230
Half-Uniform X | 0.32, -1037 | 0.56, -1023 | 0.15, -3553 | 0.37, -1226
FB Scanning X | 0.33, -1032 | 0.55, -1022 | 0.15, -3553 | 0.38, -1228
SC | 0.18, -1045 | 0.26, -1016 | 0.09, -3476 | 0.10, -1075
OBS | 0.18, -1101 | 0.23, -1078 | (the other two dataset cells are empty in the source)
MMHC | 0.40, -1003 | 0.51, -970 | 0.18, -3572 | 0.20, -1555
13. End-of-Execution Results: DiG-SiRGA
Method | INSURANCE 50 (F1, Bayes) | ALARM 70 (F1, Bayes) | HEPAR-II 100 (F1, Bayes) | WIN95PTS 100 (F1, Bayes)
Parent Set X | 0.43, -1012 | 0.65, -1012 | 0.19, -3553 | 0.37, -1246
Two-Point X | 0.36, -1012 | 0.55, -1015 | 0.14, -3552 | 0.38, -1214
Half-Uniform X | 0.35, -1016 | 0.59, -1012 | 0.15, -3552 | 0.36, -1240
FB Scanning X | 0.35, -1023 | 0.59, -1012 | 0.15, -3552 | 0.32, -1261
SC | 0.18, -1045 | 0.26, -1016 | 0.09, -3476 | 0.10, -1075
OBS | 0.18, -1101 | 0.23, -1078 | (the other two dataset cells are empty in the source)
MMHC | 0.40, -1003 | 0.51, -970 | 0.18, -3572 | 0.20, -1555
14. Final Remarks
Proposed Parent Set Crossover for BN Structure Learning using GA
• Incorporates structural properties of the problem
• Reduces disruptive action in favor of exploitation
Compared with state-of-the-art genetic and non-evolutionary methods
• In terms of various performance metrics
• Convergence behavior and end-of-execution results
• Statistically significant evaluation
Parent Set Crossover outperforms its competitors in the benchmark
• Classification and Bayesian scores are not correlated
• DiG-SiRGA performs better than GA
• Carlo Contaldi, University of Illinois at Chicago, contaldicarlo@gmail.com
• Fatemeh Vafaee, University of New South Wales, f.vafaee@unsw.edu.au
• Peter C. Nelson, University of Illinois at Chicago, nelson@uic.edu