In this talk I will give an overview of ten years of research on applying evolutionary computation ideas in the natural sciences. The talk will take us on a tour covering problems in nanoscience (e.g. controlling self-organizing systems and optimizing scanning probe microscopy), problems arising in bioinformatics (such as predicting protein structures and their features), and challenges emerging in systems and synthetic biology. Although the algorithmic solutions to these problems differ from each other, at their core they retain Darwin’s wonderful insights. I will conclude by giving a personal view on why EC has been so successful and where, in my mind, the future lies.
Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology
1. Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics, Systems & Synthetic Biology Prof. Natalio Krasnogor Automated Scheduling, Optimisation and Planning Research Group School of Computer Science, University of Nottingham www.cs.nott.ac.uk/~nxk twitter.com/NKrasnogor Page 1 of 86 IEEE Congress on Evolutionary Computation - New Orleans, USA, 2011
2. Outline: Darwin’s Magic and Algorithmic Beauty; Evolutionary Computation in the Natural Sciences; Self-Assembly and Scanning Probe Microscopy Optimisation; Structural Bioinformatics; Systems Biology & Synthetic Biology; On Invariants, Decorations and the Future; Conclusions
4. Darwin’s Magic. Thank you, YouTube.
5. Algorithmic Beauty: Inheritable Instruction Set; Limited Resources; Imperfect Replication. A Powerful Secondary Effect: Selection. An awe-inspiring product: Evolution by Natural Selection.
6. Outline: Darwin’s Magic and Algorithmic Beauty; Evolutionary Computation in the Natural Sciences; Self-Assembly and Scanning Probe Microscopy Optimisation; Structural Bioinformatics; Systems Biology & Synthetic Biology; On Invariants, Decorations and the Future; Conclusions
7. Evolutionary Computation in the Natural Sciences. Algorithmic and Artificial Living Matter (ALMA): A Research Vision. Programmable algorithmic entry to the vast world of nanoscale physical, chemical & biological systems and processes. How (?) do you gain algorithmic entry into it? Computer Science: Embedded behavior; Information & Algorithms; Complexity; Robustness; Tradeoffs. What does “The Logistics of Small Things” look like?
8. The Spatial Scales Involved
9. ALMA & The Logistics of Small Things. How do you program complex nano/micro scale processes: through billions of tiny & simple distributed programs/processors? When there is no clear distinction between hardware and software? When the wetware is not simply a stochastic program? When the wetware is poorly characterised and is likely to evolve, etc.? Pseudocode shown on the slide:
function f1(p1,p2,p3,p4) { if (p1<p2) and (rand<0.5) print p3 else print p4 }
function f1(p1,p2,p3,p4) { if (p1<p2) RND print p3 RND else RND print p4 RND }
function f1(p1,p2,p3,p4) { if (p1<p2) RND print p3 RND else RND print p4 RND }
function f1(p1,p2,p3,p4) { if (p1<p2) RND incr p3 RND else RND decr p4 RND }
10. Outline: Darwin’s Magic and Algorithmic Beauty; Evolutionary Computation in the Natural Sciences; Self-Assembly and Scanning Probe Microscopy Optimisation; Structural Bioinformatics; Systems Biology & Synthetic Biology; On Invariants, Decorations and the Future; Conclusions
11. The Spatial Scales Involved
12. Molecular Tiles & Programmable Self-Assembly. P.W.K. Rothemund, N. Papadakis, E. Winfree. Algorithmic Self-Assembly of DNA Sierpinski Triangles. PLoS Biology 2:12 (2004)
14. Tiles with deterministic assembly (Model 1); tiles with probabilistic assembly (Model 2)
15. Evolutionary Design Approach. Genotype: variable-length individuals (randomly created Wang tiles), with one-point crossover and bitwise mutation. A Genotype-Phenotype Mapping yields the phenotype; a Phenotype-Fitness Mapping scores it using Minkowski functionals (A, P, X), e.g. A = 12, P = 24, X = 0 vs A = 100, P = 40, X = 1. Parameters: population size = 100, individual length in [1, 10], generations = 300, P_crossover = 0.7, P_mutation = 0.1/0.05/0.01.
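The loop behind these parameters can be sketched roughly as follows. This is a minimal illustration, not the code used in the work: the bit-level tile encoding (`TILE_BITS`), the tournament selection and `toy_fitness` are all assumptions made here. The real fitness compares Minkowski functionals of the simulated assembly against a target, which is not reproduced.

```python
import random

TILE_BITS = 8                      # assumption: 4 sides x 2-bit glue label per tile
MIN_TILES, MAX_TILES = 1, 10       # individual length in [1, 10] (from the slide)
P_CROSSOVER, P_MUTATION = 0.7, 0.05

def random_individual(rng):
    """A genotype is a variable-length list of tiles, each tile a list of bits."""
    n_tiles = rng.randint(MIN_TILES, MAX_TILES)
    return [[rng.randint(0, 1) for _ in range(TILE_BITS)] for _ in range(n_tiles)]

def one_point_crossover(a, b, rng):
    """Cut each parent at a tile boundary, swap tails, clamp to the length limits."""
    cut_a, cut_b = rng.randint(0, len(a)), rng.randint(0, len(b))
    child1 = (a[:cut_a] + b[cut_b:])[:MAX_TILES] or a[:1]
    child2 = (b[:cut_b] + a[cut_a:])[:MAX_TILES] or b[:1]
    return child1, child2

def bitwise_mutation(ind, rng, rate=P_MUTATION):
    """Flip each bit independently with probability `rate`; returns fresh lists."""
    return [[bit ^ 1 if rng.random() < rate else bit for bit in tile] for tile in ind]

def toy_fitness(ind):
    """Placeholder only: stands in for the Minkowski-functional comparison."""
    return -abs(sum(map(sum, ind)) - 12)

def evolve(fitness, rng, pop_size=100, generations=300):
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # Binary tournament: an assumption, the slide does not name the selection scheme.
            return max(rng.sample(pop, 2), key=fitness)
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < P_CROSSOVER:
                c1, c2 = one_point_crossover(p1, p2, rng)
            else:
                c1, c2 = p1, p2
            nxt.extend([bitwise_mutation(c1, rng), bitwise_mutation(c2, rng)])
        pop = nxt[:pop_size]
    return max(pop, key=fitness)
```

Calling `evolve(toy_fitness, random.Random(0))` runs the full 100 x 300 budget; swapping `toy_fitness` for a function that simulates the assembly and measures (A, P, X) is where the real work lies.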
16. Probabilistic Assembly + No Rotation; Probabilistic Assembly + Rotation; Deterministic Assembly + Rotation; Deterministic Assembly + No Rotation
17. How Does Self-Assembly Get Programmed? Two-tile, three-tile, four-tile and five-tile self-assembly. We calculated the equivalence classes of binding pockets defined by “bp1 R bp2 iff NAFE(bp1) = NAFE(bp2)” for the best tile set. We observed that equivalence classes with NAFE smaller than T are highly likely to participate in the self-assembly process, as these are more populous. More “assemblable” binding pockets = Generalised Secondary Structures
20. Motion without interaction: diffusion across terraces on the substrate. Intramolecule strength: energy between two non-functionalised porphyrins. Molecule-substrate strength: energy of a porphyrin to the substrate. Rotational strength: molecule-substrate strength for spinning.
21. How Do You Image and Manipulate at This Scale? D. M. Eigler & E. K. Schweizer, Nature 344, 524-526 (1990). C60. Y. Sugimoto et al., Nature 446, 64 (2007). Hla et al., Phys. Rev. Lett. 85, 2777-2780 (2000). D.L. Keeling et al., Phys. Rev. Lett. 94, 146104 (2005).
22. Even 3 Variable Problems are Difficult: Optimising a Scanning Probe Microscope. [Diagram: scanning tip above the sample surface; X, Y and Z axes under direct (piezo) control; labels V, i and G. Source: http://www2.fz-juelich.de/ibn/index.php?index=1021] i_t ∝ exp(−2kd). The tunnel current i_t is highly dependent on the tip-sample distance, d. This current can be maintained with a feedback loop, G, that actively controls the tip-sample distance.
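To get a feel for the exponential dependence and the role of the loop gain G, here is a small sketch. The decay constant `KAPPA`, the prefactor `I0` and the choice of acting on the logarithm of the current are assumptions for illustration; they are not taken from the instrument used in this work.

```python
import math

KAPPA = 1.0   # decay constant k in 1/angstrom (assumed order-of-magnitude value)
I0 = 1.0      # current at d = 0, arbitrary units (assumed)

def tunnel_current(d):
    """i_t = I0 * exp(-2 k d): the current falls exponentially with distance d."""
    return I0 * math.exp(-2.0 * KAPPA * d)

def feedback_step(d, i_set, gain):
    """Move the tip by a fraction `gain` of the distance error implied by the current."""
    error = math.log(i_set) - math.log(tunnel_current(d))
    # Current too low (error > 0) means the tip is too far away, so d must shrink.
    return d - gain * error / (2.0 * KAPPA)

def settle(d, i_set, gain=0.5, steps=50):
    """Iterate the feedback loop; with 0 < gain < 2 the distance error shrinks each step."""
    for _ in range(steps):
        d = feedback_step(d, i_set, gain)
    return d
```

Under these assumed values, moving the tip by one angstrom changes the current by a factor of e^2 (about 7.4), which is why small changes in the tip state have such a large effect on the image and why the gain G is one of the parameters worth optimising.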
23. Understanding the image. J. H. A. Hagelaar et al., PRB 78, 161405(R) (2008). L. Gross et al., Science 325, 1110 (2009).
24. (Un)Stable and (Un)defined Tip States. Imaging problems, spontaneous tip changes.
25. Two Stage Automation Process. Cellular GA with Smart Initialisation. In-situ; ex-situ. Voltage pulsing (deliberate crash); fine tuning (changing scan parameters). R. Woolley, J. Sterling, A. Radocea, N. Krasnogor and P. Moriarty. Automated probe microscopy via evolutionary optimisation at the atomic scale. Applied Physics Letters (to appear)
26. Stage 1: Smart Initialisation (coarsely) Conditions the Probe. A deterministic approach. Log output: “Streaky Image. Executing cleaning pulse”; “Cloudy Image. Executing cleaning pulse”; “Flat Surface. Zooming in to 50nm”; “Flat Surface. Zooming in to 20nm”; “Constant Atomic resolution. Zooming in to 4nm”; “Poor Atomic resolution. Rescanning”; “Consistent fair atomic resolution. Stage 1 complete. Time elapsed: 1010.1902 (~17mins)”
27. Stage 2: Fine adjustment with CGA. [Figure: starting image, a cellular GA grid labelled with G, V and i, and the machine-optimised image]
30. How Does It Compare to an Expert Operator?
31. Outline: Darwin’s Magic and Algorithmic Beauty; Evolutionary Computation in the Natural Sciences; Self-Assembly and Scanning Probe Microscopy Optimisation; Structural Bioinformatics; Systems Biology & Synthetic Biology; On Invariants, Decorations and the Future; Conclusions
32. The Spatial Scales Involved
33. Protein Folding & Structure Prediction. Anfinsen’s thermodynamic hypothesis [Anfinsen 1973, Dill and Chan 1997]: Primary Sequence → 3D Structure. Protein Structure Prediction (PSP) aims to predict the 3D structure of a protein based on its primary sequence (perhaps disregarding the folding process).
34. Defining and Predicting Useful Features. M. Stout, J. Bacardit, J. Hirst & N. Krasnogor. Bioinformatics 24(7):916-923, 2008. Contact. M. Stout, J. Bacardit, J.D. Hirst, R.E. Smith, and N. Krasnogor. Prediction of topological contacts in proteins using learning classifier systems. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 13(3):245-258, 2008.
38. Coordination Number. Integration of all these predictions plus other sources of information. Final CM prediction (using BioHEL).
39. The BioHEL GBML System. BIOinformatics-oriented Hierarchical Evolutionary Learning - BioHEL (Bacardit & Krasnogor, 2009). BioHEL is a rule-based evolutionary learning system that employs the Iterative Rule Learning (IRL) paradigm, first used in EC in Venturini’s SIA system (Venturini, 1993) and widely used for both fuzzy and non-fuzzy evolutionary learning. J. Bacardit, M. Stout, J.D. Hirst, K. Sastry, X. Llora, and N. Krasnogor. Automated alphabet reduction method with evolutionary algorithms for protein structure prediction. Proceedings of the 2007 Genetic and Evolutionary Computation Conference, ACM Press, 2007. J. Bacardit, M. Stout, J.D. Hirst, A. Valencia, R.E. Smith, and N. Krasnogor. Automated alphabet reduction for protein datasets. BMC Bioinformatics, 10(6), 2009. Bronze Medal in the 2007 “HUMIES” Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation.
41. We predict them from the closest neighbours in the chain. [Figure: a chain of residues R(i-5) ... R(i+5) with their secondary-structure states SS(i-5) ... SS(i+5); sliding windows R(i-1), R(i), R(i+1) → SS(i); R(i), R(i+1), R(i+2) → SS(i+1); R(i+1), R(i+2), R(i+3) → SS(i+2)]
51. x50 samples, x25 rule sets, consensus predictions. The whole training process takes about 289 CPU days (~5.5 h per rule set).
55. 140 server groups
56. Contact Map prediction in CASP 7. Accuracy for groups that predicted a common subset of targets. Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209
57. Contact Map prediction in CASP 7. Xd results. Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209
59. 67% accuracy. Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209
61. Set of 3262 proteins for training all the 1D predictors
62. A subset of 2413 proteins used for CM prediction
67. 25K CPU hours were employed just to train the CM ensemble
68. In terms of performance: these two groups derived contact predictions from 3D models
73. 10 000 CPU days for 10 μs of folding [Dill and Chan 1997]. P. Widera, J.M. Garibaldi, and N. Krasnogor. Evolutionary design of the energy function for protein structure prediction. Proceedings of the IEEE Congress on Evolutionary Computation, 2009. P. Widera, J. Garibaldi, and N. Krasnogor. GP challenge: evolving the energy function for protein structure prediction. Genetic Programming and Evolvable Machines, 11(1):61-88, 2010. Gold Medal in the 2010 “HUMIES” Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation.
88. Total of 150 CPU days
89. Outline: Darwin’s Magic and Algorithmic Beauty; Evolutionary Computation in the Natural Sciences; Self-Assembly and Scanning Probe Microscopy Optimisation; Structural Bioinformatics; Systems Biology & Synthetic Biology; On Invariants, Decorations and the Future; Conclusions
90. The Spatial Scales Involved
91. The Cell as an Information Processing Device. LeDuc et al. Towards an in vivo biologically inspired nanofactory. Nature (2007)
92. Transcription Networks. [Diagram labels: Environment; Signal1, Signal2, Signal3, Signal4, Signal5, ..., Signaln; Transcription Factors; Genome; Gene1, Gene2, Gene3, ..., Genek]
93. Network Motifs: Evolution’s Preferred Circuits. Biological networks are complex and vast. “Patterns that occur in the real network significantly more often than in randomized networks are called network motifs.” Moreover, these patterns are organised in non-trivial/non-random hierarchies. Shai S. Shen-Orr et al., Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genetics 31, 64-68 (2002). Radu Dobrin et al., Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network. BMC Bioinformatics 2004; 5:10. Each network motif carries out a specific information-processing function: the C1-FFL is a ‘sign-sensitive delay’ element and a persistence detector; the I1-FFL is a pulse generator and response accelerator.
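The definition quoted above suggests a simple test: count a pattern in the real network, then count it again in randomized versions of the same network. The sketch below does this for the feed-forward loop (X→Y, Y→Z, X→Z). The degree-preserving edge-swap randomization and the z-score summary are common choices assumed here for illustration; the function names are hypothetical and the edge signs that distinguish C1-FFL from I1-FFL are ignored.

```python
import random

def count_ffl(edges):
    """Count feed-forward loops X->Y, Y->Z, X->Z over distinct nodes X, Y, Z."""
    succ = {}
    for u, v in set(edges):
        succ.setdefault(u, set()).add(v)
    count = 0
    for x, ys in succ.items():
        for y in ys:
            if y == x:
                continue  # skip self-regulation
            for z in succ.get(y, ()):
                if z != x and z != y and z in ys:
                    count += 1
    return count

def randomize(edges, rng, swaps_per_edge=10):
    """Shuffle edge targets pairwise, keeping every node's in- and out-degree."""
    edges = list(set(edges))
    present = set(edges)
    for _ in range(swaps_per_edge * len(edges)):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Reject swaps that would create self-loops or duplicate edges.
        if i == j or a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges

def motif_zscore(edges, rng, n_random=100):
    """Return (real count, mean random count, z-score) for the feed-forward loop."""
    real = count_ffl(edges)
    counts = [count_ffl(randomize(edges, rng)) for _ in range(n_random)]
    mean = sum(counts) / n_random
    std = (sum((c - mean) ** 2 for c in counts) / n_random) ** 0.5
    if std == 0:
        return real, mean, 0.0 if real == mean else float("inf")
    return real, mean, (real - mean) / std
```

A large positive z-score is what "significantly more often than in randomized networks" means in practice; the threshold to use is a separate statistical choice.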
95. Nested EA for Model Synthesis. F. Romero-Campero, H. Cao, M. Camara, and N. Krasnogor. Structure and parameter estimation for cell systems biology models. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2008), pages 331-338. ACM, 2008. Best Paper Award. H. Cao, F.J. Romero-Campero, S. Heeb, M. Camara, and N. Krasnogor. Evolving cell models for systems and synthetic biology. Systems and Synthetic Biology, 2009
97. Different time series have very different profiles, e.g., response time or maxima occur at different times/places
100. Sometimes the time series is qualitative or microarray data. H. Cao, F.J. Romero-Campero, S. Heeb, M. Camara, and N. Krasnogor. Evolving cell models for systems and synthetic biology. Systems and Synthetic Biology, 2009
102. Target
103. A Signal Translator for Pattern Formation. [Circuit diagram labels: FP1, FP2; act1, act2; rep1, rep2, rep3, rep4; I1, I2; Pact1, Pact2, Prep1, Prep2, Prep3, Prep4]
104. Uniform Spatial Distribution of Signal Translators for Pattern Formation. pBR322; pACYC184; E. coli DH5α; ∆sdiA/∆lacI (2∆)
105. Pattern Formation in synthetic bacterial colonies
106. pAYCP (1-3); pBR322 (4-6). Starting OD = 10. 2∆; DH5α. Magnification: 100X
107. pUC6S (1-6). Starting OD = 10. Magnification: 40X. 2∆; DH5α
109. Algorithms are Tiny. Factoring: Let n be the number to be factored. 1. Let Δ be a negative integer with Δ = -dn where d is a multiplier and Δ is the negative discriminant of some quadratic form. 2. Take the t first primes , for some . 3. Let fq be a random prime form of GΔ with . 4. Find a generating set X of GΔ. 5. Collect a sequence of relations between set X and {fq : q ∈ PΔ} satisfying: 6. Construct an ambiguous form (a, b, c) which is an element f ∈ GΔ of order dividing 2 to obtain a coprime factorization of the largest odd divisor of Δ in which Δ = -4a.c or a(a - 4c) or (b - 2a).(b + 2a). 7. If the ambiguous form provides a factorization of n then stop, otherwise find another ambiguous form until the factorization of n is found. To prevent useless ambiguous forms from being generated, build up the 2-Sylow group S2(Δ) of G(Δ). Calculating Pi
111. They are not short pieces of code, but large systems
112. What are Evolutionary Algorithms? Research Paradigms for Problem Solving: T.S. Kuhn. The Structure of Scientific Revolutions, 1962. Design Patterns and Pattern Languages: C. Alexander, S. Ishikawa, M. Silverstein, M. Jacobson, I. Fiksdahl-King, S. Angel. A Pattern Language - Towns, Buildings, Construction. Oxford University Press (1977). N. Krasnogor and J.E. Smith. IEEE Transactions on Evolutionary Computation, 9(5):474-488, 2005.
113. Invariants and Decorations: A Compact “Memetic” Algorithm by Merz (2003)
114. Invariants and Decorations: A “Memetic” Particle Swarm Optimisation by Petalas et al. (2007)
115. Invariants and Decorations: A “Memetic” Artificial Immune System by Yanga et al. (2008)
116. Invariants and Decorations: A “Memetic” Learning Classifier System by Bacardit & Krasnogor (2009)
117. Invariants and Decorations. Many others based on Ant Colony Optimisation, NN, Tabu Search, SA, DE, etc. Key Invariants: global search mode; local search mode. Many Decorations, e.g.: crossover/mutations (EA-based MAs); pheromone updates (ACO-based MAs); clonal selection/hypermutations (AIS-based MAs); etc.
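One way to read the invariant/decoration split is as a skeleton with two pluggable slots. The sketch below is an illustration on bit strings, not any of the published algorithms listed above: `memetic` holds the invariant alternation of global and local search, while `ea_variation` and `hill_climb` are one possible (assumed) decoration for each slot.

```python
import random

def hill_climb(x, fitness, rng, tries=20):
    """Local search mode: keep single-bit flips that do not worsen fitness."""
    best, best_f = list(x), fitness(x)
    for _ in range(tries):
        cand = best[:]
        cand[rng.randrange(len(cand))] ^= 1
        f = fitness(cand)
        if f >= best_f:
            best, best_f = cand, f
    return best

def ea_variation(pop, fitness, rng, p_mut=0.02):
    """Global search mode, EA decoration: tournament, uniform crossover, mutation."""
    def pick():
        return max(rng.sample(pop, 2), key=fitness)
    children = []
    for _ in pop:
        a, b = pick(), pick()
        child = [rng.choice(pair) for pair in zip(a, b)]
        children.append([bit ^ 1 if rng.random() < p_mut else bit for bit in child])
    return children

def memetic(fitness, n_bits, rng, variation=ea_variation, local=hill_climb,
            pop_size=20, generations=30):
    """The invariant part: alternate a global step and a local step."""
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop = variation(pop, fitness, rng)            # decoration-specific global mode
        pop = [local(x, fitness, rng) for x in pop]   # local mode
    return max(pop, key=fitness)
```

For example, `memetic(sum, 16, random.Random(0))` maximises the number of ones in a 16-bit string. Replacing `ea_variation` with a pheromone-based or clonal-selection step would change the decoration while leaving `memetic` untouched.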
118. A Pattern Language for Memetic Algorithms. Memetic Algorithms by N. Krasnogor. Chapter in the Handbook of Natural Computing. Springer Berlin / Heidelberg, 2009. www.cs.nott.ac.uk/~nxk/publications.html
119. A General Trend: moving away from closed-loop optimisation towards open-ended and embodied optimisation. [Figure labels. Problem scopes: solving 1 problem, single instances; solving 1 problem, several instances; solving a few problems, several classes of instances; solving multiple unrelated problems, several classes of instances. Ingredients: Programming; Reuse; Feedback; (self) adaptive; Self-generating; Self-Engineering. Axis: Effort (e.g. time, $$$, etc.)]
120. The Future of EAs: Software Nurseries. Fundamental change of temporal scales. Rethink: software will be “seeded” and grown, very much like a plant or animal (including humans). Software will start in an “embryonic” state and develop when situated in a production environment. What would a software “incubation” machine look like? What would a software “nursery” look like?
121. [Diagram labels: DNA/RNA; Cells; Tissue; Organs; Individual; Potential to develop into multiple different types of cells; Commitment; Specialised Function; Ultimate Solver]
122. [Diagram labels: Production Environment; Input; Software Cell (SC); Pluripotential Solver “DNA”; TSP Organ; Euclidean TSP Organ; Graphical TSP Organ; TSP Solver Software Organism]
123. An Ecosystem of solvers. Software organisms: TSP Solver; Protein Structure Prediction Solver; Vehicle Routing Solver; Graph Isomorphism Solver; SAT Solver; Bin Packing Solver; Graph Coloring Solver; Network Interdiction Solver; Quadratic Assignment Solver.
124. As, e.g., biologists and physicists have done through a ubiquitous, worldwide informatics infrastructure, we should be focusing on building an online worldwide computational problem-solving infrastructure
126. Conclusions: New types of executable structures. In Nanotechnology: DNA tiles, DNA origami, etc.; non-DNA-based tiles. Some have very definite programmable features; others require the program to be “distributed” and exploit noise and randomness. In Synthetic Biology: How to orchestrate activities at multiple temporal-spatial-energetic scales? How to cope with noise in the background that executes a program and in the program itself?! How to cope with programs that will evolve?
127. Conclusions: New types of benchmarks. Structural Biology (PSP and GP4PSP): many of these problems can be modelled either as regression or as classification problems; low/high number of classes; balanced/unbalanced classes; adjustable number of attributes. Ideal benchmarks!! http://www.infobiotic.net/ Scanning Probe Microscopy: even a few dimensions are hard; “chameleons” as it is sampled.
129. Requires strong links with data mining, ALIFE and, of course, AI (beyond existing trends in constraint satisfaction), search based software engineering (beyond current trends on testing/debugging)
130. Requires on-line, computer-friendly ontologies of code (e.g. the pattern language on the left), self-describing source code, protocols for autonomic code reuse, etc.
131. Conclusions: Learn From Physics, Chemistry & Biology. The Invariants & Patterns; the Decorations are superfluous. Evolution; Self-Assembly & Self-Organisation; Developmental systems: depend on a core genome coding for essential functionality; epigenomics canalises development; hierarchical control systems that modify programs, including susceptibility to horizontal gene (program libraries) transfer. Infrastructure. Missing Components.
141. Colleagues at ASAP: J. Chaplin, J. Blakes, E. Glaab, M. Franco
Editor's notes
The Mother of all Reverse Engineering Stories. Darwin (and Wallace) got to the core of the issue by clearly separating the important invariants from the accidental decorations.
Embedded behaviour (sensors, processors, actuators, symbolic carriers, universal architectures). Information and Algorithms (combinatorial specification, data structures, programming languages, compilers). Complexity (Moore's law, abstraction and hierarchies). Robustness (digital signal restoration, fault tolerance & error correction, standardised interfaces, protocols, composition, rigorous proofs). Tradeoffs: optimised performance vs scalability vs cost vs designability.
AFM Images of DAO-E Crystals. (A) A large templated crystal in a 5-tile reaction (no R-11). A single ‘1’ in the input row (asterisk) initiates a Sierpinski triangle, which subsequently devolves due to errors. Mismatch errors within ‘0’ domains initiate isolated Sierpinski patterns terminated by additional errors at their corners. (B) A large untemplated fragment in a 5-tile reaction (no S-11). Large triangles of ‘0’s can be seen. Crystals similar to this are also seen in samples lacking the nucleating structure. (C) Several large crystals in a 6-tile reaction, some with more zeros than ones, some with more ones than zeros. It is difficult to determine whether these crystals are templated or not. (D) An average of several scans of the boxed region from (C), containing roughly 1,000 tiles and 45 errors. (E) An average of several scans of a Sierpinski triangle that was initiated by a single error in a sea of zeros and terminated by three further errors (a 1% error rate for the 400 tiles here). Red crosses in (D) and (E) indicate tiles that have been identified (by eye) to be incorrect with respect to the two tiles from which they receive their input. Scale bars are 100 nm. DOI: 10.1371/journal.pbio.0020424.g006
Lesson 1: Evolution can work around stochasticity and noise. Not only that, these enhance evolution! This has also been found to be true in living systems, namely, noise and stochasticity provide robustness and tactical maneuvering (i.e. populations do not commit deterministically to a course of action, hence they hedge their bets). Lesson 2: These results were robust across a large range of glue strength matrices! That is, evolution was able to find which building blocks were useful, i.e., you do not need a specific ideal starting condition.
Lesson 3: Thus evolution “tunes” the degree of cooperativity (i.e. NAFE) and how many of these to use. Lesson 4: GSS poised at the “edge” of the freezing threshold can build stuff but also correct errors!
Porphyrin (NO2) molecules on the Au(110) surface. Molecular structures along the step edges of Au(110). Close-packed islands and one-dimensional structures. Two Optimisation Problems: [1] optimal imaging; [2] reverse engineering simulation parameters from images.
Two residues of a chain are said to be in contact if their distance is less than a certain threshold. The contacts of a protein can be represented by a binary matrix: 1 = contact, 0 = no contact. Plotting this matrix reveals many characteristics of the protein structure. CM prediction is used in many 3D PSP methods (e.g. I-Tasser). We model a protein as a series of nested layers, assigning each residue to a layer. Strictly speaking, each layer is a convex hull of points. The convex hull of a point set is simple and fast to compute, and parameter-less. The Recursive Convex Hull is computed by iteratively identifying the layers (hulls) of a protein. Gabriel Graph (GG): remove edges from the DT (Delaunay triangulation) if a sphere drawn between two vertices contains another vertex. Relative Neighbourhood Graph (RNG): remove edges from the GG if a spherical lune contains another vertex.
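The contact-map definition can be written down directly. In this sketch the 8 Å default threshold and the use of one coordinate per residue are assumptions (the notes only say "a certain threshold"), and the function name is illustrative.

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary matrix: 1 if two residues are closer than `threshold`, else 0.

    coords: one (x, y, z) point per residue, in angstroms.
    The diagonal is left at 0, since a residue's contact with itself is not informative.
    """
    n = len(coords)
    cm = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < threshold:
                cm[i][j] = cm[j][i] = 1
    return cm
```

For four residues placed on a straight line 3.8 Å apart, residues 0 and 2 (7.6 Å) are in contact under this threshold while residues 0 and 3 (11.4 Å) are not.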
BIOinformatics-oriented Hierarchical Evolutionary Learning - BioHEL (Bacardit et al., 2009). BioHEL is a rule-based evolutionary learning system that employs the Iterative Rule Learning (IRL) paradigm, first used in EC in Venturini’s SIA system (Venturini, 1993) and widely used for both fuzzy and non-fuzzy evolutionary learning. BioHEL inherits most of its components from GAssist [Bacardit, 04], a Pittsburgh evolutionary learning system.
We selected a set of 2811 protein chains from PDB-REPRDB with: a resolution of less than 2 Å; less than 30% sequence identity; no chain breaks or non-standard residues. 90% of this set was used for training (~490000 residues) and 10% for testing. All three features were predicted based on a window of ±4 residues around the target. Evolutionary information (as a Position-Specific Scoring Matrix) is the basis of this local information. Each residue is characterised by a vector of 180 values. The domain of all three features was partitioned into 5 states.
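The figure of 180 values is consistent with a window of 9 positions (the target ±4) times 20 PSSM columns. The sketch below builds such a vector under that reading; the zero padding at the chain ends and the function name are assumptions, not details given in the notes.

```python
def window_features(pssm, i, half_width=4):
    """Concatenate the PSSM rows of residues i-half_width .. i+half_width.

    pssm: one row of scores per residue (20 columns for a standard PSSM).
    Positions that fall outside the chain are padded with zeros (an assumption).
    """
    n_cols = len(pssm[0])
    features = []
    for j in range(i - half_width, i + half_width + 1):
        if 0 <= j < len(pssm):
            features.extend(pssm[j])
        else:
            features.extend([0.0] * n_cols)
    return features
```

With a 20-column PSSM, `len(window_features(pssm, i))` is 9 × 20 = 180 for every residue, including those near either end of the chain.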
The Contact Map is assessed using the 11 CASP targets in the Free Modelling category. Also, only long-range contacts (with a minimum chain separation of 24 residues) are evaluated. Predictor groups are asked to submit a list of predicted contacts and a confidence level for each prediction. The assessors then rank the predictions for each protein and look at the top L/x ones, where L is the length of the protein and x = {5, 10}. From these L/x top-ranked contacts two measures are computed. Accuracy: TP/(TP+FP). Xd: difference between the distribution of predicted distances and a random distribution. 22 groups participated in CASP8, but not all of them sent enough predictions for L/10 or L/5.
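The accuracy measure described above can be sketched as follows. This is an illustration of the description in these notes rather than the assessors' own script; the function name and the input layout (triples of residue indices and confidence) are assumptions, and the Xd measure is not covered.

```python
def top_lx_accuracy(predictions, true_contacts, length, x=5, min_sep=24):
    """Accuracy TP/(TP+FP) over the top L/x long-range predictions.

    predictions: iterable of (i, j, confidence) triples.
    true_contacts: iterable of (i, j) pairs observed in the native structure.
    Only pairs at least `min_sep` residues apart along the chain are evaluated.
    """
    long_range = [(i, j, c) for i, j, c in predictions if abs(i - j) >= min_sep]
    long_range.sort(key=lambda p: p[2], reverse=True)   # most confident first
    top = long_range[: max(1, length // x)]
    if not top:
        return 0.0
    truth = {frozenset(pair) for pair in true_contacts}  # (i, j) and (j, i) are the same contact
    tp = sum(1 for i, j, _ in top if frozenset((i, j)) in truth)
    return tp / len(top)   # every selected prediction is either a TP or an FP
```

Note that if a group submits fewer than L/x long-range predictions, this sketch scores whatever it has; the last sentence of the note above suggests the assessment itself handled such groups separately.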
Basic goal: modelling helps to highlight gaps in understanding. Intermediate goal: a detailed model would allow modellers to verify that their understanding is consistent with the available evidence. Advanced goal: once you have successfully done the two above, what you really want is to be able to use your models to go beyond what you currently know, both theoretically and biologically, by conducting “in silico biology”, thus saving time, money and ethical considerations (as you can kill as many virtual mice as you want) and allowing you to have unprecedented control over the experimental (virtual) conditions. That is, in silico you can do “what if?” testing. Predictive modelling might allow you to uncover unsuspected interactions and effects between model components, which perhaps are difficult to obtain by other routes. Dream goal of Synthetic Biology: to combinatorially combine in silico well-understood components/models for the design and generation of novel experiments and hypotheses, and ultimately to design, program, optimise & control (new) biological systems and to compile the design into biological matter.
To understand their functionality in a scalable way one must choose the correct abstraction. Cellular functions arise from orchestrated interactions between motifs consisting of many interacting molecular species.
A P system model is a set of rules representing molecular interaction motifs that appear in many cellular systems. The main idea is to use a nested evolutionary algorithm where the outer layer evolves model structures while the inner layer acts as a local search for the parameters of the model. It uses stochastic P systems as a computational, modular and discrete-stochastic modelling framework. It adopts an incremental methodology: starting from very simple P system modules specifying basic molecular interactions, more complicated modules are produced to model more complex molecular systems. Successfully validated evolved models can then be added to the models library.
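The nested idea can be illustrated with a deliberately small toy. Everything below is an assumption made for illustration: the "modules" are simple deterministic curves rather than stochastic P system rules, the module library `LIBRARY` is hypothetical, and the inner search is a plain stochastic hill climber. Only the two-layer shape (structure outside, parameters inside) follows the description above.

```python
import math
import random

LIBRARY = {   # hypothetical module library: name -> f(t, k)
    "decay":  lambda t, k: math.exp(-k * t),
    "growth": lambda t, k: 1.0 - math.exp(-k * t),
    "linear": lambda t, k: k * t,
}
MAX_MODULES = 4

def simulate(structure, params, times):
    """A model's output is the sum of its modules' outputs at each time point."""
    return [sum(LIBRARY[m](t, k) for m, k in zip(structure, params)) for t in times]

def error(structure, params, times, target):
    """Squared error between the simulated and the target time series."""
    return sum((y - z) ** 2 for y, z in zip(simulate(structure, params, times), target))

def fit_parameters(structure, times, target, rng, steps=200):
    """Inner layer: local search over the rate constants of a fixed structure."""
    params = [1.0] * len(structure)
    best = error(structure, params, times, target)
    for _ in range(steps):
        cand = [max(1e-3, k + rng.gauss(0.0, 0.2)) for k in params]
        e = error(structure, cand, times, target)
        if e < best:
            params, best = cand, e
    return params, best

def mutate_structure(structure, rng):
    """Outer-layer variation: add or remove one module."""
    s = list(structure)
    if len(s) >= MAX_MODULES or (len(s) > 1 and rng.random() < 0.5):
        s.pop(rng.randrange(len(s)))
    else:
        s.append(rng.choice(sorted(LIBRARY)))
    return s

def nested_ea(times, target, rng, generations=10, pop_size=4):
    """Outer layer: evolve structures, scoring each by its best fitted parameters."""
    def evaluate(s):
        params, err = fit_parameters(s, times, target, rng)
        return err, s, params
    pop = [evaluate([rng.choice(sorted(LIBRARY))]) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [evaluate(mutate_structure(s, rng)) for _, s, _ in pop]
        pop = sorted(pop + offspring, key=lambda r: r[0])[:pop_size]
    return pop[0]   # (error, structure, parameters) of the best model found
```

The design point is that a structure is never judged with arbitrary parameters: each candidate gets its own parameter search before it competes, which is what makes the outer comparison between structures meaningful.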
Key missing link in all work in SB: in silico simulations do not take evolutionary activity into account in any realistic way! No accounting of evolution. The ultimate automated programming challenge.
They are a Research Paradigm, as they provide a framework from which to ask and answer research questions.
There is an obsession with algorithms, but what about systems?!