Gptp 2014 way of the combinator
1. THE WAY OF THE COMBINATOR
Bill Worzel billwzel@gmail.com
Evolution Enterprises http://evolver.biz
GP Theory and Practice, 08 May 2014
Ann Arbor, MI
2. THE SKGP
• 15 years ago we developed what was then a novel approach
to GP using combinators
• Strongly typed, efficient, powerful, reusable code
• Speed-up can become super-linear in parallel application
because evaluated code is reused
3. COMBINATORS
• Applicative algebra, derived from the lambda calculus;
application associates to the left
• Sxyz -> xz(yz)
• Kxy -> x
• Ix -> x
• Bxyz -> x(yz)
• Cxyz -> xzy
• Yx -> x(Yx)
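As a minimal sketch (not from the talk), the rules above can be written as curried Python functions. The primitives `add` and `sub` are hypothetical helpers introduced only for illustration. Y is left out because Python evaluates arguments eagerly, so a direct transcription of Yx -> x(Yx) would recurse without end.

```python
# Curried Python versions of the reduction rules listed above.
S = lambda x: lambda y: lambda z: x(z)(y(z))   # Sxyz -> xz(yz)
K = lambda x: lambda y: x                      # Kxy  -> x
I = lambda x: x                                # Ix   -> x
B = lambda x: lambda y: lambda z: x(y(z))      # Bxyz -> x(yz)
C = lambda x: lambda y: lambda z: x(z)(y)      # Cxyz -> xzy

# Hypothetical curried primitives, used only for this illustration.
add = lambda a: lambda b: a + b
sub = lambda a: lambda b: a - b

# I is derivable from S and K: SKKx -> Kx(Kx) -> x
skk_result = S(K)(K)(42)

# B composes functions: B f g x -> f(g x)
b_result = B(add(1))(lambda n: 2 * n)(5)       # (2 * 5) + 1

# C swaps the last two arguments: C sub 1 10 -> sub 10 1
c_result = C(sub)(1)(10)                       # 10 - 1

print(skk_result, b_result, c_result)
```

Because every combinator takes one argument at a time, application in Python reads left to right just as it does in the rules.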
4. VARIABLE ABSTRACTION
• D. A. Turner showed that removing bound variables using
combinators could produce an efficient computing system
(Turner 1979, A New Implementation Technique for Applicative
Languages, Software–Practice and Experience, vol. 9, 31-49)
• Essentially, combinators let us write expressions that are
variable free, and this can be used to build a highly efficient
computer system (Clarke 1980)
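Variable removal can be illustrated with the textbook three-rule abstraction algorithm. This is a sketch under assumptions: terms are nested Python pairs `(function, argument)`, atoms are strings, and the names `abstract` and `show` are hypothetical. It is not Turner's full algorithm, which adds optimizations to keep the output compact.

```python
def abstract(var, term):
    """Return a variable-free term T such that (T var) reduces to `term`.

    Basic rules:
      [x] x     = I
      [x] c     = K c                  (c is any other atom)
      [x] (a b) = S ([x] a) ([x] b)
    """
    if term == var:
        return "I"
    if not isinstance(term, tuple):
        return ("K", term)
    func, arg = term
    return (("S", abstract(var, func)), abstract(var, arg))


def show(term):
    """Print a term with left-associative application."""
    if not isinstance(term, tuple):
        return term
    func, arg = term
    arg_text = show(arg)
    if isinstance(arg, tuple):
        arg_text = "(" + arg_text + ")"
    return show(func) + " " + arg_text


# The body "+ 1 x" is the application ((+ 1) x).
body = (("+", "1"), "x")
print(show(abstract("x", body)))   # S (S (K +) (K 1)) I
```

Abstracting x from `+ 1 x` yields the add-one expression S(S(K +)(K 1))I that the talk evaluates on a later slide.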
5. THE SKGP
• Implements programs as graphs using combinators with
GP to produce pure functional (combinator) expressions
• Uses strong typing similar to (Yu 1997, 1998)
6. EVALUATING COMBINATOR EXPRESSION
Example:
‘S(S(K +)(K 1))I’ is a curried function that adds 1 to what it is applied to,
so S(S(K +)(K 1))I applied to 3 is:
S(S(K +)(K 1))I 3
S(K +)(K 1) 3 (I 3)
K + 3 (K 1 3) (I 3)
+ (K 1 3) (I 3)
+ 1 (I 3)
+ 1 3
4
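The reduction above can be replayed mechanically. The following is a minimal sketch, assuming terms are nested pairs `(function, argument)` and that `+` is a strict integer primitive; the helper names (`spine`, `build`, `step`, `show`, `trace`) are hypothetical. It rewrites trees and copies the argument of S, whereas a graph-based implementation would share it.

```python
def spine(term):
    """Split an application into its head atom and its argument list."""
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]


def build(head, args):
    """Apply head to args, left to right."""
    for arg in args:
        head = (head, arg)
    return head


def step(term):
    """Perform one leftmost reduction step, or return None if none applies."""
    head, args = spine(term)
    if head == "I" and len(args) >= 1:
        return build(args[0], args[1:])
    if head == "K" and len(args) >= 2:
        return build(args[0], args[2:])
    if head == "S" and len(args) >= 3:
        x, y, z = args[:3]
        return build(((x, z), (y, z)), args[3:])
    if head == "+" and len(args) >= 2:
        # "+" is strict: reduce its operands to integers first.
        for i in (0, 1):
            if not isinstance(args[i], int):
                reduced = step(args[i])
                if reduced is None:
                    return None
                return build(head, args[:i] + [reduced] + args[i + 1:])
        return build(args[0] + args[1], args[2:])
    return None


def show(term):
    head, args = spine(term)
    parts = [str(head)]
    for arg in args:
        parts.append("(%s)" % show(arg) if isinstance(arg, tuple) else str(arg))
    return " ".join(parts)


def trace(term):
    """Return the printed form of every term visited during reduction."""
    lines = [show(term)]
    while True:
        term = step(term)
        if term is None:
            return lines
        lines.append(show(term))


add_one = (("S", (("S", ("K", "+")), ("K", 1))), "I")
for line in trace((add_one, 3)):
    print(line)
```

Each printed line corresponds to one rewrite, ending in the value 4.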
7. COMBINATOR FUNCTIONS QUICKLY BECOME COMPLEX
Here is the function for factorial:
def fac = S(S(S(K cond)(S(S(K =)(K 0))I))(K 1))(S(S(K *)I)(S(K fac)(S(S(K -)I)(K 1))))
Evaluation is left as an “exercise to the reader.”
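For readers who would rather check the exercise by machine, here is a sketch of a normal-order evaluator, assuming `cond` selects its second argument when the test is true and its third otherwise, and that `=`, `*`, `-` are strict integer primitives. The expression is written with balanced parentheses; the parser and the helper names are hypothetical and not part of the SKGP.

```python
import operator
import re

PRIMS = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "=": operator.eq}


def parse(src):
    """Parse juxtaposition syntax such as 'S(K +)(K 1)' into nested pairs."""
    tokens = re.findall(r"[()]|[^\s()]+", src)
    pos = 0

    def expr():
        nonlocal pos
        term = None
        while pos < len(tokens) and tokens[pos] != ")":
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                item = expr()
                pos += 1  # skip the matching ")"
            else:
                item = int(tok) if tok.isdigit() else tok
            term = item if term is None else (term, item)
        return term

    return expr()


def spine(term):
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]


def build(head, args):
    for arg in args:
        head = (head, arg)
    return head


def evaluate(term, defs):
    """Reduce leftmost-first; arguments are evaluated only on demand."""
    while True:
        head, args = spine(term)
        if head == "I" and len(args) >= 1:
            term = build(args[0], args[1:])
        elif head == "K" and len(args) >= 2:
            term = build(args[0], args[2:])
        elif head == "S" and len(args) >= 3:
            x, y, z = args[:3]
            term = build(((x, z), (y, z)), args[3:])
        elif head == "cond" and len(args) >= 3:
            # Only the selected branch is kept, so the recursion terminates.
            chosen = args[1] if evaluate(args[0], defs) else args[2]
            term = build(chosen, args[3:])
        elif head in PRIMS and len(args) >= 2:
            value = PRIMS[head](evaluate(args[0], defs), evaluate(args[1], defs))
            term = build(value, args[2:])
        elif head in defs:
            term = build(defs[head], args)  # unfold a named definition
        else:
            return term


FAC = ("S(S(S(K cond)(S(S(K =)(K 0))I))(K 1))"
       "(S(S(K *)I)(S(K fac)(S(S(K -)I)(K 1))))")
defs = {"fac": parse(FAC)}
print(evaluate(("fac", 5), defs))  # 120
```

Evaluating arguments on demand matters here: an eager evaluator would expand the recursive branch before `cond` could discard it.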
9. TYPING GP
• The SKGP is strongly typed so that it is always “type coherent”
• Based on Hindley/Milner type system as described in (Yu
1997) but for combinators instead of lambda expressions
• Type is checked during graph creation and resolved at time of
mutation and crossover - static typing
• If a type cannot be resolved, back out and try again by
creating a new subtree
• Strongly typed system will always terminate with same type
(halting problem?)
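A rough sketch of what a type check over combinator expressions can look like is given below. It assumes the standard polymorphic types for S, K and I, a single base type `int`, and unification-based inference in the Hindley/Milner style; it is not the SKGP's implementation, and all names are hypothetical. A failed check is where a GP system would back out and generate a new subtree.

```python
import itertools

_ids = itertools.count()


def fresh():
    """Return a new type variable; variables are strings starting with '?'."""
    return "?%d" % next(_ids)


def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def fn(*ts):
    """fn(a, b, c) builds the curried type a -> (b -> c) as nested pairs."""
    result = ts[-1]
    for t in reversed(ts[:-1]):
        result = (t, result)
    return result


def atom_type(atom):
    a, b, c = fresh(), fresh(), fresh()
    if atom == "S":
        return fn(fn(a, b, c), fn(a, b), a, c)
    if atom == "K":
        return fn(a, b, a)
    if atom == "I":
        return fn(a, a)
    if atom == "+":
        return fn("int", "int", "int")
    if isinstance(atom, int):
        return "int"
    raise TypeError("unknown atom: %r" % (atom,))


def resolve(t, subst):
    """Apply the substitution to a type until no bound variable remains."""
    while is_var(t) and t in subst:
        t = subst[t]
    if isinstance(t, tuple):
        return (resolve(t[0], subst), resolve(t[1], subst))
    return t


def occurs(var, t):
    if isinstance(t, tuple):
        return occurs(var, t[0]) or occurs(var, t[1])
    return t == var


def unify(a, b, subst):
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if is_var(a) or is_var(b):
        var, other = (a, b) if is_var(a) else (b, a)
        if occurs(var, other):
            raise TypeError("infinite type")
        subst[var] = other
    elif isinstance(a, tuple) and isinstance(b, tuple):
        unify(a[0], b[0], subst)
        unify(a[1], b[1], subst)
    else:
        raise TypeError("cannot unify %r with %r" % (a, b))


def infer(term, subst):
    """Infer the type of an atom or of an application (function, argument)."""
    if not isinstance(term, tuple):
        return atom_type(term)
    func, arg = term
    func_type, arg_type = infer(func, subst), infer(arg, subst)
    result = fresh()
    unify(func_type, (arg_type, result), subst)
    return resolve(result, subst)


def well_typed(term):
    try:
        infer(term, {})
        return True
    except TypeError:
        return False


add_one = (("S", (("S", ("K", "+")), ("K", 1))), "I")
print(infer(add_one, {}))           # ('int', 'int'), i.e. int -> int
print(well_typed((("+", "I"), 1)))  # False: "+" cannot take I as an operand
```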
10. ESCAPING THE BOTTLE
• (Daida 2003) describes limitation
of standard GP in how trees grow
• Presents evidence that GP can be
limited in its search ability without
structure altering operators
• Combinators have the property
of being ‘structure altering
operators’
Daida, unpublished, based on Daida 2004,
Demonstrating Constraints to Diversity with a
Tunably Difficult Problem for Genetic Programming
11. CHURCH-ROSSER THEOREM
• The Church-Rosser Theorem says pure function evaluation
can be order independent: regardless of the order of evaluation,
any result that is reached will be the same
• Because of this, each functional piece, when evaluated, can be
stored for re-use since order of evaluation does not matter
• Normally this is inefficient since some pieces are not used
• Because GP shares pieces across generations, reuse gives
super-linear speed up: you don’t have to recompute each
component
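The reuse argument can be illustrated with a small cache keyed by subexpression. This sketch makes several assumptions: it uses plain arithmetic trees `(op, left, right)` rather than combinator graphs, and the names `evaluate`, `run` and `shared` are hypothetical. It only counts how much work the cache saves; it does not measure parallel speed-up.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, cache, stats):
    """Evaluate an int or (op, left, right) tree, reusing cached results."""
    if isinstance(expr, int):
        return expr
    if cache is not None and expr in cache:
        return cache[expr]
    stats["evaluations"] += 1
    op, left, right = expr
    value = OPS[op](evaluate(left, cache, stats),
                    evaluate(right, cache, stats))
    if cache is not None:
        # Safe because a pure expression always has the same value.
        cache[expr] = value
    return value


def run(population, use_cache):
    cache = {} if use_cache else None
    stats = {"evaluations": 0}
    values = [evaluate(ind, cache, stats) for ind in population]
    return values, stats["evaluations"]


# A subtree that crossover has spread through the population.
shared = ("*", ("+", 2, 3), ("-", 7, 4))
population = [
    ("+", shared, 1),
    ("*", shared, shared),
    ("-", shared, ("+", 2, 3)),
]

print(run(population, use_cache=False))  # ([16, 225, 10], 16)
print(run(population, use_cache=True))   # ([16, 225, 10], 6)
```

Both runs return the same values, but the cached run evaluates 6 operator nodes instead of 16 because the shared subtree is computed once.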
12. REAL WORLD GP USING SKGP
• (Briggs 2006) shows such a system benchmarks well
(regression, parity, stack & queue evolution)
• What about real world problems?
• Process control program for manufacturing process
• Modeling chemical kinetics for NASA
• Bladder cancer analysis differentiating nodal metastatic
cancer from non-metastatic cancer
• Colon cancer prognostic
13. FUTURE DIRECTIONS
• Reuse of combinator
expressions across
generations via caching
mechanism
• Solving the ‘Y problem’
• Adding chromosome
structure