Algorithmic Information Theory and
       Computational Biology

                Hector Zenil

        Unit of Computational Medicine
              Karolinska Institutet
                    Sweden




              Hector Zenil   AIT Tools for Biology and Medicine
Complex Adaptive Systems (CAS)




Complexity is hard to quantify in biology

  Mapping quantitative stimuli to qualitative behaviour




Information Theory in Biology




      Sequence alignment
      Pattern recognition
      Sequence logos
      Binding site detection
      Motif detection
      Consensus sequences
      Biological significance


             [based on Claude Shannon’s Information Theory, 1948]
Algorithmic Information Theory

                  Which sequence looks more random?
       (a) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
          (b) AGGTCGTGAAGTGCGATGGCCTTACGTAGC
            (c) GCGCGCGCGCGCGCGCGCGCGCGCGCGC
        Classical probability theory vs. Kolmogorov Complexity

  Definition
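                    $K_U(s) = \min\{|p| : U(p) = s\}$                              (1)

  where U is a universal Turing machine and |p| is the length of program p.

  Compressibility
  A sequence s is c-compressible if its shortest program p satisfies
  $|p| + c \le |s|$, that is, $K(s) \le |s| - c$. A sequence is random if
  $K(s) \approx |s|$.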

                                  [Kolmogorov (1965); Chaitin (1966)]
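
  As a rough illustration of compressibility as a test of non-randomness, the sketch below compresses the three sequences above with zlib (Deflate). The compressed length is only a computable upper bound on K, not K itself, and on strings this short the fixed zlib overhead is a large part of the result. The choice of zlib and the helper name `compressed_length` are assumptions made for illustration.

```python
import zlib

# The three example sequences (a), (b) and (c) from the slide.
SEQ_A = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
SEQ_B = "AGGTCGTGAAGTGCGATGGCCTTACGTAGC"
SEQ_C = "GCGCGCGCGCGCGCGCGCGCGCGCGCGC"


def compressed_length(s: str) -> int:
    """Length in bytes of the zlib-compressed string: a crude upper bound on K(s)."""
    return len(zlib.compress(s.encode("ascii"), 9))


if __name__ == "__main__":
    for name, seq in (("a", SEQ_A), ("b", SEQ_B), ("c", SEQ_C)):
        # Print the label, the raw length and the compressed length.
        print(name, len(seq), compressed_length(seq))
```

  The regular sequences (a) and (c) should compress to fewer bytes than (b), which is the sense in which compressibility certifies non-randomness.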
Examples

  Example 1
  Sequences like (a) have low algorithmic complexity because they
  allow a short description, for example “20 times A”. No matter
  how long (a) grows, the description “k times A” grows only by
  about log2(k) bits, the cost of writing down k.
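
  The logarithmic growth can be checked directly. The sketch below is an illustration rather than part of the original slides: it counts the binary digits needed to write the repeat count k, which is the only part of a “k times A” description that changes with the length of the sequence. The function name is hypothetical.

```python
def bits_to_write(k: int) -> int:
    """Number of binary digits needed to write the repeat count k (k >= 1)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return k.bit_length()


if __name__ == "__main__":
    for k in (33, 1_000, 1_000_000):
        # The sequence itself has k symbols, but its description needs only
        # a constant part ("times A") plus about log2(k) bits for k.
        print(k, bits_to_write(k))
```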


  Example 2
  The sequence (b) is algorithmically random because it doesn’t seem to
  allow a description (much) shorter than (b) itself.

  For a sequence such as (a), a proof of non-randomness amounts to
  exhibiting a short program that produces it. Compressibility is
  therefore a sufficient test of non-randomness.

Example of an evaluation of K


  The sequence (c) GCGCGC...GC is not algorithmically random (or has
  low K complexity) because it can be produced by the following
  program (take G=0 and C=1):

  Program A(i):
  1: n:= 0
  2: Print n mod 2
  3: n:= n+1
  4: If n=i Goto 6
  5: Goto 2
  6: End

  The length of A (in bits), plus the roughly log2(i) bits needed to
  specify i, is an upper bound of K(GCGCGC...GC) up to an additive
  constant.
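
  For readers who want to run it, Program A(i) can be transcribed into Python as below. This is a sketch: the function name and the guard for i < 1 are additions (the numbered program as written never reaches its End line when i < 1).

```python
def program_a(i: int) -> str:
    """Python transcription of Program A(i), with G = 0 and C = 1."""
    if i < 1:
        # Added guard: the original loop would not terminate for i < 1.
        raise ValueError("i must be at least 1")
    symbols = "GC"
    out = []
    n = 0                               # line 1
    while True:
        out.append(symbols[n % 2])      # line 2: print n mod 2
        n += 1                          # line 3
        if n == i:                      # line 4: stop after i symbols
            break
    return "".join(out)


if __name__ == "__main__":
    print(program_a(28))
```

  The program text stays the same size however large i is; only the bits needed to write i grow.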



The ultimate measure of pattern detection and optimal
prediction

      Kolmogorov and Chaitin, Schnorr, and Martin-Löf
      independently provided 3 different approaches to randomness
      (compression, predictability and typicality).
      They proved (for infinite sequences):
           incompressibility ⇐⇒ unpredictability ⇐⇒ typicality

  When this happens in mathematics, a concept (here, randomness) has
  been objectively captured.
  This is why prediction in biology is hard. AIT tells us that no effective
  statistical test can recognise all patterns and no
  computable technique can fully predict all outcomes. The problem
  is deeply connected to computability and algorithmic information
  theory.
            [Solomonoff (1964); Kolmogorov (1965); Chaitin (1969)]
Information distances and similarity metrics


  Measures waiting to be introduced in bioinformatics
      Information Distance: $ID(x, y) = \max\{K(x|y), K(y|x)\}$
      Universal Similarity Metric:
      $USM(x, y) = \max\{K(x|y), K(y|x)\} / \max\{K(x), K(y)\}$
      Normalised Information Distance and Normalised Compression
      Distance (NCD):
      $NCD(x, y) = (K(xy) - \min\{K(x), K(y)\}) / \max\{K(x), K(y)\}$
      Normalized Compression Measure (NCM): $NC(s) = K(s)/|s|$
      (asymptotic behaviour)
      Bennett’s Logical Depth:
      $LD_d(s) = \min\{t(p) : (|p| - |p^*| < d) \text{ and } (U(p) = s)\}$
      (for an example application see Zenil, Complexity 2011)
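
  In practice K is uncomputable, so the NCD is evaluated with a real compressor in place of K. The sketch below assumes zlib as that compressor and uses synthetic byte strings rather than real genomic data; the function names are hypothetical.

```python
import random
import zlib


def c(data: bytes) -> int:
    """Compressed size in bytes, used here as a stand-in for K."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised Compression Distance with zlib standing in for K."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    periodic = b"GC" * 400
    scrambled = bytes(rng.choice(b"ACGT") for _ in range(800))
    # A sequence is close to itself and far from an unrelated one.
    print(ncd(periodic, periodic), ncd(periodic, scrambled))
```

  Because real compressors are imperfect, the value for identical inputs is small but usually not exactly zero.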


Non-systematic but successful attempts in biology
      GenCompress is a compression algorithm for DNA sequences,
      used with the distance $d(x, y) = 1 - (K(x) - K(x|y))/K(xy)$




      NCD applied to genetic similarity:




  AIT looks at the genome as information, not as data (letters).
  Counting: traditional Shannon-entropy style sequencing.
  Interpreting: AIT. The full power of the theory hasn’t yet been
  unleashed.
To be or not to be...

  Borel’s “Infinite Monkey” theorem




  [Figure: diagram labelled “Input” with sample items 0, 1, 1024, π, √2, ∞, CH3, “Syntax error” and “To be or not to be, that is the question.”]
Algorithmic probability




Producing π

  This C-language code produces the first 1000 digits of π (Gjerrit
  Meinsma). It is deliberately terse and relies on implicit int and the
  GNU C "?:" extension:

  #include <stdio.h>
  long k = 4e3, p, a[337], q, t = 1e3;
  main(j){for (; a[j = q = 0] += 2, --k; )
  for (p = 1 + 2 * k; j < 337; q = a[j] * k + q % p * t, a[j++] = q / p)
  k != j > 2 ? : printf("%.3d", a[j - 2] % t + q / p / t); }



  Producing non-random sequences:
  If an object has low Kolmogorov complexity, it has a short description
  and therefore a greater probability of being produced by a random
  program. The less random a string, the more likely it is to be produced
  by a short program.
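
  This relationship is usually made precise through algorithmic probability and the Coding Theorem (Solomonoff; Levin). The block below states it in the notation of equation (1). The symbol m and the restriction to a prefix-free universal machine U are standard in the literature but are not spelled out on the slides, so treat them as assumptions here.

```latex
% Algorithmic probability of s: the chance that a universal prefix-free
% machine U outputs s when the bits of its program p are chosen by coin flips.
m(s) = \sum_{p \,:\, U(p) = s} 2^{-|p|}

% Coding Theorem: strings with high algorithmic probability have low
% Kolmogorov complexity, up to an additive constant.
K(s) = -\log_2 m(s) + O(1)
```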




Biological Big Data Analysis

  The information bottleneck:




  Small Data matters: local measurements of information content
  are a good indication of the global information content of an
  object. Evidence: BDM image classification. Compression works at
  large scales, looking for long regularities, while BDM is very local.
  Yet both yield astonishingly similar results for these object sizes.


Complementary methods for different sequence lengths
  The methods to approximate K coexist and complement each
  other for different sequence lengths.

                                  short strings   long strings   scalability
                                   (< 100 bits)   (> 100 bits)
     Lossless compression method        ×              √              √
     Coding Theorem method              √              ×              ×
     Block Decomposition method         √              √              √



          [Zenil, Soler, Delahaye, Gauvrit, Two-Dimensional Kolmogorov
           Complexity and Validation of the Coding Theorem Method by
                                                 Compressibility (2012)]
Coding Theorem method and lossless compression

  The transition between one method and the other. What is complex for
  the Coding Theorem method is less compressible.




     [Soler, Zenil, Delahaye, Gauvrit, Correspondence and Independence of
       Numerical Evaluations of Algorithmic Information Measures (2012)]

Online Algorithmic Complexity Calculator




      Provides: Shannon’s entropy, lossless compression (Deflate) values,
      Kolmogorov complexity approximations and relative frequency order
      (algorithmic probability).
      A Mathematica API and an R module.
      Datasets available online at the Dataverse Network.
      Basic data analysis tool for short sequence comparison.
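
  The first two of these quantities are easy to reproduce locally. The sketch below is not the calculator's code or API; it is a minimal stand-alone illustration using Python's standard library, with hypothetical function names, that computes the per-symbol Shannon entropy and the Deflate-compressed length of a short sequence.

```python
import zlib
from collections import Counter
from math import log2


def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per symbol of the symbol frequencies in s."""
    n = len(s)
    return sum((k / n) * log2(n / k) for k in Counter(s).values())


def deflate_length(s: str) -> int:
    """Length in bytes of s after zlib (Deflate) compression."""
    return len(zlib.compress(s.encode("utf-8"), 9))


if __name__ == "__main__":
    for seq in ("AAAAAAAA", "GCGCGCGC", "GGCCGCCG"):
        print(seq, shannon_entropy(seq), deflate_length(seq))
```

  Entropy computed this way depends only on symbol counts, so the periodic GCGCGCGC and the scrambled GGCCGCCG get the same value.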

                             [http://www.complexitycalculator.com]

Online Algorithmic Complexity Calculator 2




                      [http://www.complexitycalculator.com]


Simulation of natural systems w/complex symbolic systems

  An elementary cellular automaton (ECA) is defined by a local
  function $f : \{0, 1\}^3 \to \{0, 1\}$.

  f maps the state of a cell and its two immediate neighbours (range
  = 1) to a new cell state: $f_t : (r_{-1}, r_0, r_{+1}) \mapsto r_0$. Cells are
  updated synchronously according to f over all cells in a row.
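
  The synchronous update can be sketched in a few lines of Python. The code assumes Wolfram's rule numbering (the rule's binary digits list the outputs of f for the eight neighbourhoods) and periodic boundaries; the slide does not fix a boundary condition, so that choice and the function name are assumptions.

```python
def eca_step(cells: list[int], rule: int) -> list[int]:
    """One synchronous update of an elementary cellular automaton.

    The neighbourhood (left, centre, right) is read as a 3-bit number that
    selects one bit of the rule number. Boundaries wrap around.
    """
    n = len(cells)
    return [
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


if __name__ == "__main__":
    row = [0] * 15 + [1] + [0] * 15
    for _ in range(8):
        print("".join(".#"[c] for c in row))
        row = eca_step(row, 110)
```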

                                                               [Wolfram, (1994)]

Behavioural classes of CA



  Wolfram’s classes of behaviour:

      Class I: Systems evolve into a stable state.
      Class II: Systems evolve into periodic or nested (e.g. fractal) states.
      Class III: Systems evolve into random-looking states.
      Class IV: Systems evolve into localised complex structures.
      e.g. Rule 110 or the Game of Life.




                                                              [Wolfram, (1994)]

Block Decomposition method (BDM)
  The Block Decomposition method uses the Coding Theorem
  method. Formally, we will say that an object c has complexity:

  $K\log_{m,2D_{d \times d}}(c) = \sum_{(r_u, n_u) \in c_{d \times d}} \left[ (n_u - 1) \log_2(K_{m,2D}(r_u)) + K_{m,2D}(r_u) \right]$          (2)

  where $c_{d \times d}$ represents the set with elements $(r_u, n_u)$, obtained
  from decomposing the object into blocks of $d \times d$ with boundary
  conditions. In each $(r_u, n_u)$ pair, $r_u$ is one such square and $n_u$
  its multiplicity.
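
  To make the bookkeeping in equation (2) concrete, the sketch below evaluates the same sum for a one-dimensional string cut into blocks of length d (the slide's objects are two-dimensional). The per-block complexity is supplied by the caller, since real Coding Theorem method values are not reproduced here: the lookup table `TOY_K` holds made-up numbers, and dropping an incomplete trailing block is an assumed boundary condition.

```python
from collections import Counter
from math import log2
from typing import Callable


def bdm_1d(s: str, d: int, block_k: Callable[[str], float]) -> float:
    """Evaluate the sum in equation (2) for a string cut into blocks of length d.

    block_k is a caller-supplied estimate of each block's complexity.
    Trailing symbols that do not fill a whole block are dropped.
    """
    blocks = [s[i:i + d] for i in range(0, len(s) - d + 1, d)]
    counts = Counter(blocks)  # distinct block r_u -> multiplicity n_u
    return sum((n - 1) * log2(block_k(r)) + block_k(r) for r, n in counts.items())


# Made-up complexity values for illustration only; these are NOT CTM results.
TOY_K = {"AA": 2.0, "AC": 4.0}

if __name__ == "__main__":
    # Blocks: AA (twice) and AC (once).
    print(bdm_1d("AAAAAC", 2, TOY_K.__getitem__))
```

  Repeated blocks add only a logarithmic term per extra copy, so regular objects score lower than objects made of many distinct blocks.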




        [H. Zenil, F. Soler-Toscano, J.-P. Delahaye and N. Gauvrit, (2012)]
Classification of ECA by BDM versus lossless compression




     Compressors have limitations (small sequences, time
     complexity)
     Applications to machine learning
     Problems of classification and clustering
     BDM is computationally efficient (runs in $O(n^d)$ time, hence
     linear time ($d = 1$) for sequences)
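
  The compression side of this comparison can be sketched as follows: run an ECA rule from a random initial row and measure how well zlib compresses the resulting space-time diagram. This is only an illustration of the idea. The width, number of steps, seed, periodic boundaries and the use of zlib are all assumptions, and the function names are hypothetical.

```python
import random
import zlib


def eca_step(cells, rule):
    """One synchronous ECA update with periodic boundaries (an assumption)."""
    n = len(cells)
    return [
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def compressed_evolution_size(rule, width=64, steps=64, seed=0):
    """Bytes needed by zlib to store the space-time diagram of an ECA rule."""
    rng = random.Random(seed)  # fixed seed: same initial row for every rule
    row = [rng.randint(0, 1) for _ in range(width)]
    rows = []
    for _ in range(steps):
        rows.append("".join(map(str, row)))
        row = eca_step(row, rule)
    return len(zlib.compress("\n".join(rows).encode("ascii"), 9))


if __name__ == "__main__":
    for rule in (4, 110, 30):
        print(rule, compressed_evolution_size(rule))
```

  Rule 4 freezes after one step and compresses very well, while the random-looking Rule 30 does not. At these small sizes the compressor's fixed overhead is noticeable, which is one of the limitations listed above.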

      [H. Zenil, F. Soler-Toscano, J.-P. Delahaye and N. Gauvrit, (2012)]
Asymptotic behaviour of complex systems




                                  [Zenil, Complex Systems (2010)]
Rule space of 3-symbol 1D CA




                                  [Zenil, Complex Systems (2011)]
Phase transition detection




  Definition
          $c_t^n = \frac{|C(M_t(i_1)) - C(M_t(i_2))| + \dots + |C(M_t(i_{n-1})) - C(M_t(i_n))|}{t(n-1)}$
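
  Computationally, the definition is a sum of absolute differences between consecutive complexity values, divided by t(n−1). A minimal sketch follows, assuming the values C(M_t(i_j)) have already been approximated (for instance by compressed length) and are passed in as a list; the function name is hypothetical.

```python
def transition_coefficient(complexities, t):
    """Sum of absolute changes between consecutive complexities, over t(n-1).

    complexities[j] is assumed to hold C(M_t(i_j)): the approximated
    complexity of system M run for t steps on the j-th initial condition.
    """
    n = len(complexities)
    if n < 2 or t <= 0:
        raise ValueError("need at least two values and a positive t")
    total = sum(abs(a - b) for a, b in zip(complexities, complexities[1:]))
    return total / (t * (n - 1))


if __name__ == "__main__":
    # Differences are 2, 3 and 0, so the result is 5 / (2 * 3).
    print(transition_coefficient([10, 12, 9, 9], t=2))
```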


                                              [Zenil, Complex Systems (2011)]
A measure of programmability


                    $C_t^n(M) = \frac{\partial f(c_t^n)}{\partial t}$                    (3)




                                  [Zenil, Complex Systems (2011)]


Examples




  Figure: ECA Rule 4 has a low $C_t^n$ for randomly chosen n and t (it doesn’t
  react much to external stimuli). $\lim_{n,t \to \infty} C_t^n(R4) = 0$


                          [H. Zenil, Philosophy & Technology, (2013)]
Examples (cont.)




  Figure: ECA R110 has a large $C_t^n$ value for sensible choices of t
  and n, which is compatible with the fact that it has been proven to be
  capable of universal computation (for particular semi-periodic initial
  configurations). $\lim_{n,t \to \infty} C_t^n(R110) = 1$

Classification of graphs




      [Zenil, Soler, Dingle, Graph Automorphism Estimation and Complex
       Network Topological Characterization by Algorithmic Randomness]
Characterisation of complex networks




   Complex networks built with preferential attachment algorithms preserve
  properties that are invariant under network size (connectedness,
  robustness) at a low cost, unlike random networks, which are costly in
  the number of links.

      [Zenil, Soler, Dingle, Graph Automorphism Estimation and Complex
       Network Topological Characterization by Algorithmic Randomness]
Biological case study: Programmable Porphyrin molecules




  Much is known about the dynamics of these molecules, so one can perform
  Monte Carlo simulations based on these mathematical models and
  establish a correspondence between Wang tiles and simple molecules.

   [joint work with ICOS, U. of Nottingham] [G. Terrazas, H. Zenil and N.
     Krasnogor, Exploring Programmable Self-Assembly in Non DNA-based
                                                   Molecular Computing]
Quantitative dynamics of living systems
  Aggregations with similar Kolmogorov complexity cluster in similar
  configurations.




         [G. Terrazas, H. Zenil and N. Krasnogor, Exploring Programmable
                   Self-Assembly in Non DNA-based Molecular Computing]

Mapping output behaviour to external stimuli: Parameter
discovery

                Parameter Space P → Target Space T




    Target space T: set a configuration from P that triggers the
                      desired behaviour in T.
  To investigate:
       Reduction of the parameter space
       Characterisation of the target space
        [G. Terrazas, H. Zenil and N. Krasnogor, Exploring Programmable
                  Self-Assembly in Non DNA-based Molecular Computing]
Robustness and pervasiveness
  Concentration changes preserving behaviour:




   Output parameters that have the highest impact can be tested in
               silico before experiments in materio.

        [G. Terrazas, H. Zenil and N. Krasnogor, Exploring Programmable
                  Self-Assembly in Non DNA-based Molecular Computing]
Orthogonality

  Specific concentrations producing certain behaviour using the
  mathematical model to be tested against empirical data.




Highlights and goals
  Ultimate goal (a few years’ time): An information-theoretical
  toolbox for systems and synthetic biology




                [Complex3D Proteins Database (graph representation) &
          Z Chen et al. Lung cancer pathways in response to treatments.]

      Pushing boundaries.
      A cutting-edge mathematical approach
      Tools from Complexity theory.

New Generation Sequence data analysis



  Heavily driven by:
      Explosion of experimental data
      Difficulties in data interpretation
      New paradigms for knowledge extraction
      Data mining the behaviour of natural systems
      Towards an AIT tool-kit for systems biology, a functional
      library of programmable biological modules with an SBML
      interface.




J.-P. Delahaye and H. Zenil, On the Kolmogorov-Chaitin complexity
for short sequences, in Cristian Calude (ed.), Complexity and
Randomness: From Leibniz to Chaitin, World Scientific, 2007.
J.-P. Delahaye and H. Zenil, Numerical Evaluation of the Complexity
of Short Strings, Applied Mathematics and Computation, 2011.
H. Zenil, F. Soler-Toscano, J.-P. Delahaye and N. Gauvrit,
Two-Dimensional Kolmogorov Complexity and Validation of the
Coding Theorem Method by Compressibility, arXiv:1212.6745 [cs.CC].
F. Soler-Toscano, H. Zenil, J.-P. Delahaye and N. Gauvrit,
Correspondence and Independence of Numerical Evaluations of
Algorithmic Information Measures, Numerical Algorithms (in 2nd
revision).
F. Soler-Toscano, H. Zenil, J.-P. Delahaye and N. Gauvrit,
Calculating Kolmogorov Complexity from the Frequency Output
Distributions of Small Turing Machines, arXiv:1211.1302 [cs.IT].
H. Zenil, Compression-based Investigation of the Dynamical
Properties of Cellular Automata and Other Systems, Complex
Systems, Vol. 19, No. 1, pages 1-28, 2010.
H. Zenil and J.A.R. Marshall, Some Aspects of Computation
Essential to Evolution and Life, Ubiquity, 2012.
H. Zenil, What is Nature-like Computation? A Behavioural Approach
and a Notion of Programmability, Philosophy & Technology (special
issue on History and Philosophy of Computing), 2013.
H. Zenil, On the Dynamic Qualitative Behavior of Universal
Computation, Complex Systems, Vol. 20, No. 3, pages 265-278, 2012.
H. Zenil, A Turing Test-Inspired Approach to Natural Computation,
in G. Primiero and L. De Mol (eds.), Turing in Context II (Brussels,
10-12 October 2012), Historical and Contemporary Research in
Logic, Computing Machinery and Artificial Intelligence, Proceedings
published by the Royal Flemish Academy of Belgium for Science and
Arts, 2013.
G.J. Chaitin, A Theory of Program Size Formally Identical to
Information Theory, J. Assoc. Comput. Mach. 22, 329-340, 1975.
A.N. Kolmogorov, Three approaches to the quantitative definition
of information, Problems of Information Transmission, 1(1):1–7,
1965.
L. Levin, Laws of information conservation (non-growth) and aspects
of the foundation of probability theory, Problems of Information
Transmission, 10(3):206–210, 1974.
M. Li and P. Vitányi, An Introduction to Kolmogorov Complexity and Its
Applications, Springer, 3rd ed., 2008.
R.J. Solomonoff, A formal theory of inductive inference: Parts 1 and
2, Information and Control, 7:1–22 and 224–254, 1964.
S. Wolfram, A New Kind of Science, Wolfram Media, 2002.

 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 

Algorithmic Information Theory and Computational Biology

  • 1. Algorithmic Information Theory and Computational Biology Hector Zenil Unit of Computational Medicine Karolinska Institutet Sweden Hector Zenil AIT Tools for Biology and Medicine
  • 2. Complex Adaptive Systems (CAS)
  • 3. Complexity is hard to quantify in biology. Mapping quantitative stimuli to qualitative behaviour.
  • 4. Information Theory in Biology: sequence alignment; pattern recognition; sequence logos; binding site detection; motif detection; consensus sequences; biological significance. [based on Claude Shannon’s Information Theory, 1948]
  • 5. Algorithmic Information Theory. Which sequence looks more random? (a) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (b) AGGTCGTGAAGTGCGATGGCCTTACGTAGC (c) GCGCGCGCGCGCGCGCGCGCGCGCGCGC. Classical probability theory vs. Kolmogorov complexity. Definition: $K_U(s) = \min\{|p| : U(p) = s\}$ (1), where $U$ is a universal Turing machine and $p$ is a program. Compressibility: a sequence $s$ is $c$-compressible if there is a program $p$ with $U(p) = s$ and $|p| + c \le |s|$. A sequence is random if $K(s) \approx |s|$. [Kolmogorov (1965); Chaitin (1966)]
  • 6. Examples. Example 1: Sequences like (a) have low algorithmic complexity because they allow a short description, for example “20 times A”. No matter how long (a) grows, the description (“k times A”) increases only by about $\log_2(k)$. Example 2: The sequence (b) is algorithmically random because it doesn’t seem to allow a description (much) shorter than (b) itself. By contrast, a proof of non-randomness, as for sequence (a), only requires exhibiting a short program. Compressibility is therefore a sufficient test of non-randomness.
  • 7. Example of an evaluation of K. The sequence (c) GCGCGC...GC is not algorithmically random (it has low K complexity) because it can be produced by the following program (take G=0 and C=1):
      Program A(i):
      1: n := 0
      2: Print n mod 2
      3: n := n+1
      4: If n=i Goto 6
      5: Goto 2
      6: End
    The length of A (in bits) is an upper bound of K(GCGCGC...GC).
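
Program A can be sketched in Python as follows. This is only an illustrative translation: the function name `program_a` and the choice to return a string rather than print each symbol are assumptions, not part of the slides. It shows that a few fixed lines of code plus the number i are enough to generate a GC-repeat of any length.

```python
def program_a(i: int) -> str:
    """Return the first i symbols of GCGC..., mirroring Program A (G=0, C=1).

    Hypothetical helper: the slide's program prints symbols one by one,
    whereas this sketch collects them into a string. Intended for i >= 1.
    """
    symbols = "GC"  # symbols[0] encodes 0 (G), symbols[1] encodes 1 (C)
    return "".join(symbols[n % 2] for n in range(i))


if __name__ == "__main__":
    print(program_a(28))
```

The source of `program_a` stays the same size however large `i` is; only the encoding of `i` grows, by roughly log2(i) bits, which is the sense in which the program length bounds K from above.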
  • 8. The ultimate measure of pattern detection and optimal prediction. Kolmogorov and Chaitin, Schnorr, and Martin-Löf independently provided 3 different approaches to randomness (compression, predictability and typicality). They proved (for infinite sequences): incompressibility ⇐⇒ unpredictability ⇐⇒ typicality. When this happens in mathematics, a concept (here, randomness) has been objectively captured. This is why prediction in biology is hard. AIT tells us that no effective statistical test will succeed in recognising all patterns and no computable technique can fully predict all outcomes. The problem is deeply connected to computability and algorithmic information theory. [Solomonoff (1964); Kolmogorov (1965); Chaitin (1969)]
  • 9. Information distances and similarity metrics. Measures waiting to be introduced in bioinformatics:
      Information Distance: $ID(x, y) = \max\{K(x|y), K(y|x)\}$
      Universal Similarity Metric: $USM(x, y) = \max\{K(x|y), K(y|x)\} / \max\{K(x), K(y)\}$
      Normalised Information Distance, as approximated by the NCD: $NCD(x, y) = (K(xy) - \min\{K(x), K(y)\}) / \max\{K(x), K(y)\}$
      Normalized Compression Measure (NCM): $NC(s) = K(s)/|s|$ (asymptotic behaviour)
      Bennett’s Logical Depth: $LD_d(s) = \min\{t(p) : (|p| - |p^*| < d) \text{ and } (U(p) = s)\}$ (for an example application see Zenil, Complexity 2011)
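
Since K is uncomputable, the NCD is in practice evaluated with a real compressor standing in for K. The sketch below assumes zlib (Deflate) as that stand-in; the helper names `compressed_len` and `ncd` are illustrative, and the values it returns depend on the compressor chosen.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (a computable proxy for K)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance, with compressed length replacing K."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is reproducible
    periodic = b"GC" * 500
    shuffled = "".join(rng.choice("ACGT") for _ in range(1000)).encode()
    print("NCD(periodic, periodic):", ncd(periodic, periodic))
    print("NCD(periodic, shuffled):", ncd(periodic, shuffled))
```

A sequence compared with itself should score close to 0, while two unrelated sequences should score close to 1. Real compressors only approximate this, so small deviations are expected.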
  • 10. Non-systematic but successful attempts in biology. GenCompress is a compression algorithm for DNA sequences: $d(x, y) = 1 - (K(x) - K(x|y))/K(xy)$. NCD applied to genetic similarity: AIT looks at the genome as information, not as data (letters). Counting: traditional Shannon-entropy style sequencing. Interpreting: AIT. The full power of the theory hasn’t yet been unleashed.
  • 11. To be or not to be... Borel’s “Infinite Monkey” theorem. [Figure labels: Input; 1; 0; 1024; π; Syntax error; √2; ∞; CH3; ∞; “To be or not to be, that is the question.”]
  • 12. Algorithmic probability
  • 13. Producing π. This C-language code produces the first 1000 digits of π (Gjerrit Meinsma): long k=4e3,p,a[337],q,t=1e3; main(j){for(;a[j=q=0]+=2,--k;) for(p=1+2*k;j<337;q=a[j]*k+q%p*t,a[j++]=q/p) k!=j>2?:printf("%.3d",a[j-2]%t+q/p/t);} Producing non-random sequences: if an object has low Kolmogorov complexity then it has a short description and a greater probability of being produced by a random program. The less random a string, the more likely it is to be produced by a short program.
  • 14. Biological Big Data Analysis. The information bottleneck: Small Data matters. Local measurements of information content are a good indication of the global information content of an object. Evidence: BDM image classification. Compression works at large scales, looking for long regularities, while BDM is very local. Yet both yield astonishingly similar results for these object sizes.
  • 15. Complementary methods for different sequence lengths. The methods to approximate K coexist and complement each other for different sequence lengths.
      Method                      | short strings (< 100 bits) | long strings (> 100 bits) | scalability
      Lossless compression method | ×                          | ✓                         | ✓
      Coding Theorem method       | ✓                          | ×                         | ×
      Block Decomposition method  | ✓                          | ✓                         | ✓
    [Zenil, Soler, Delahaye, Gauvrit, Two-Dimensional Kolmogorov Complexity and Validation of the Coding Theorem Method by Compressibility (2012)]
  • 16. Coding Theorem method and lossless compression. The transition between one method and the other. What is complex for the Coding Theorem method is less compressible. [Soler, Zenil, Delahaye, Gauvrit, Correspondence and Independence of Numerical Evaluations of Algorithmic Information Measures (2012)]
  • 17. Online Algorithmic Complexity Calculator. Provides: Shannon’s entropy, lossless compression (Deflate) values, Kolmogorov complexity approximations and relative frequency order (algorithmic probability). A Mathematica API and an R module. Datasets available online at the Dataverse Network. Basic data analysis tool for short sequence comparison. [http://www.complexitycalculator.com]
  • 18. Online Algorithmic Complexity Calculator 2 [http://www.complexitycalculator.com]
  • 19. Simulation of natural systems w/complex symbolic systems. An elementary cellular automaton (ECA) is defined by a local function $f : \{0, 1\}^3 \to \{0, 1\}$. $f$ maps the state of a cell and its two immediate neighbours (range = 1) to a new cell state: $f_t : r_{-1}, r_0, r_{+1} \to r_0$. Cells are updated synchronously according to $f$ over all cells in a row. [Wolfram, (1994)]
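
The synchronous update described above can be sketched as follows. The function name `eca_step`, the use of Wolfram's rule numbering to encode f, and the periodic (wrap-around) boundary are assumptions made for illustration; the slides do not specify the boundary condition.

```python
from typing import List


def eca_step(cells: List[int], rule: int) -> List[int]:
    """Apply one synchronous ECA update to a row of 0/1 cells.

    The neighbourhood (left, centre, right) is read as a 3-bit number and
    used to pick the corresponding bit of the rule number (Wolfram numbering).
    Boundaries wrap around, which is an assumption of this sketch.
    """
    n = len(cells)
    new_row = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        new_row.append((rule >> index) & 1)
    return new_row


if __name__ == "__main__":
    row = [0] * 15 + [1] + [0] * 15
    for _ in range(10):
        print("".join("#" if c else "." for c in row))
        row = eca_step(row, 110)
```

Every new cell is computed from the old row only, so the update is synchronous as the definition requires.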
  • 20. Behavioural classes of CA. Wolfram’s classes of behaviour: Class I: Systems evolve into a stable state. Class II: Systems evolve in a periodic (e.g. fractal) state. Class III: Systems evolve into random-looking states. Class IV: Systems evolve into localised complex structures, e.g. Rule 110 or the Game of Life. [Wolfram, (1994)]
  • 21. Block Decomposition method (BDM). The Block Decomposition method uses the Coding Theorem method. Formally, we will say that an object $c$ has complexity: $K\log_{m,2D_{d\times d}}(c) = \sum_{(r_u, n_u) \in c_{d\times d}} \left[(n_u - 1)\log_2(K_{m,2D}(r_u)) + K_{m,2D}(r_u)\right]$ (2), where $c_{d\times d}$ represents the set with elements $(r_u, n_u)$, obtained from decomposing the object into blocks of $d \times d$ with boundary conditions. In each $(r_u, n_u)$ pair, $r_u$ is one such square and $n_u$ its multiplicity. [H. Zenil, F. Soler-Toscano, J.-P. Delahaye and N. Gauvrit, (2012)]
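
A one-dimensional version of the decomposition in (2) can be sketched as below. Everything here is an illustrative assumption: the function `bdm_1d`, the non-overlapping partition that drops any trailing remainder, and above all `toy_block_complexity`, which is a made-up stand-in and not the Coding Theorem method values the slides refer to. A real application would substitute a lookup table of CTM estimates for the per-block complexity.

```python
from collections import Counter
from math import log2
from typing import Callable


def bdm_1d(seq: str, d: int, block_complexity: Callable[[str], float]) -> float:
    """Sum (n_u - 1) * log2(K(r_u)) + K(r_u) over the distinct blocks r_u.

    seq is split into non-overlapping blocks of length d; a trailing
    remainder shorter than d is ignored (an assumption of this sketch).
    """
    blocks = [seq[i:i + d] for i in range(0, len(seq) - d + 1, d)]
    multiplicities = Counter(blocks)  # maps each block r_u to its count n_u
    total = 0.0
    for block, count in multiplicities.items():
        k = block_complexity(block)
        total += (count - 1) * log2(k) + k
    return total


def toy_block_complexity(block: str) -> float:
    """Hypothetical placeholder for a CTM lookup: more distinct symbols, higher value."""
    return 2.0 + len(set(block))


if __name__ == "__main__":
    print(bdm_1d("GCGCGCGC", 2, toy_block_complexity))
    print(bdm_1d("GCATGCAT", 2, toy_block_complexity))
```

Repeated blocks contribute only a logarithmic term each time they recur, so a highly repetitive sequence receives a lower total than one made of many distinct blocks.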
  • 22. Classification of ECA by BDM versus lossless compression. Compressors have limitations (small sequences, time complexity). Applications to machine learning. Problems of classification and clustering. BDM is computationally efficient: it runs in $O(n^d)$ time, hence in linear time for sequences ($d = 1$). [H. Zenil, F. Soler-Toscano, J.-P. Delahaye and N. Gauvrit, (2012)]
  • 23. Asymptotic behaviour of complex systems [Zenil, Complex Systems (2010)]
  • 24. Rule space of 3-symbol 1D CA [Zenil, Complex Systems (2011)]
  • 25. Phase transition detection. Definition: $c_t^n = \dfrac{|C(M_t(i_1)) - C(M_t(i_2))| + \dots + |C(M_t(i_{n-1})) - C(M_t(i_n))|}{t(n-1)}$ [Zenil, Complex Systems (2011)]
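
The coefficient above can be sketched for elementary cellular automata as follows. Several choices are assumptions rather than the author's setup: zlib plays the role of the compressor C, the space-time diagram is flattened to a string of 0s and 1s before compression, boundaries wrap around, and the initial configurations i_1, ..., i_n are supplied by the caller. The names `eca_run`, `compressed_size` and `transition_coefficient` are illustrative.

```python
import zlib
from typing import List, Sequence


def eca_run(cells: Sequence[int], rule: int, t: int) -> List[List[int]]:
    """Return the initial row plus t successive rows of an ECA (wrap-around boundary)."""
    rows = [list(cells)]
    n = len(cells)
    for _ in range(t):
        prev = rows[-1]
        rows.append([
            (rule >> ((prev[(i - 1) % n] << 2) | (prev[i] << 1) | prev[(i + 1) % n])) & 1
            for i in range(n)
        ])
    return rows


def compressed_size(rows: List[List[int]]) -> int:
    """C(.): zlib-compressed length of the flattened space-time diagram."""
    data = "".join(str(bit) for row in rows for bit in row).encode()
    return len(zlib.compress(data, 9))


def transition_coefficient(initial_rows: Sequence[Sequence[int]], rule: int, t: int) -> float:
    """Sum of |C(M_t(i_j)) - C(M_t(i_{j+1}))| over consecutive inputs, divided by t(n-1)."""
    sizes = [compressed_size(eca_run(row, rule, t)) for row in initial_rows]
    n = len(sizes)
    return sum(abs(a - b) for a, b in zip(sizes, sizes[1:])) / (t * (n - 1))


if __name__ == "__main__":
    inputs = [[(k >> b) & 1 for b in range(16)] for k in range(1, 9)]
    for rule in (4, 110):
        print(rule, transition_coefficient(inputs, rule, 50))
```

A system whose compressed output barely changes across initial conditions gets a coefficient near 0, whereas one that reacts strongly to its input gets a larger value.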
  • 26. A measure of programmability: $C_t^n(M) = \dfrac{\partial f(c_t^n)}{\partial t}$ (3) [Zenil, Complex Systems (2011)]
  • 27. Examples. Figure: ECA Rule 4 has a low $C_t^n$ for randomly chosen $n$ and $t$ (it doesn’t react much to external stimuli). $\lim_{n,t\to\infty} C_t^n(R4) = 0$ [H. Zenil, Philosophy & Technology, (2013)]
  • 28. Examples (cont.). Figure: ECA R110 has a large $C_t^n$ coefficient for sensible choices of $t$ and $n$, which is compatible with the fact that it has been proven capable of universal computation (for particular semi-periodic initial configurations). $\lim_{n,t\to\infty} C_t^n(R110) = 1$
  • 29. Classification of graphs [Zenil, Soler, Dingle, Graph Automorphism Estimation and Complex Network Topological Characterization by Algorithmic Randomness]
  • 30. Characterisation of complex networks. Complex Networks w/preferential attachment algorithms preserve properties invariant under network size (connectedness, robustness) at a low cost (unlike costly random nets in the number of links). [Zenil, Soler, Dingle, Graph Automorphism Estimation and Complex Network Topological Characterization by Algorithmic Randomness]
  • 31. Biological case study: Programmable Porphyrin molecules. Much is known about the dynamics of these molecules, so one can perform Monte-Carlo simulations based on these mathematical models and establish a correspondence between Wang tiles and simple molecules. [joint work with ICOS, U. of Nottingham] [G. Terrazas, H. Zenil and N. Krasnogor, Exploring Programmable Self-Assembly in Non DNA-based Molecular Computing]
  • 32. Quantitative dynamics of living systems. Aggregations with similar Kolmogorov complexity cluster in similar configurations. [G. Terrazas, H. Zenil and N. Krasnogor, Exploring Programmable Self-Assembly in Non DNA-based Molecular Computing]
  • 33. Mapping output behaviour to external stimuli: Parameter discovery. Parameter Space P → Target Space T. Target space T: set a configuration from P that triggers the desired behaviour in T. To investigate: reduction of the parameter space; characterisation of the target space. [G. Terrazas, H. Zenil and N. Krasnogor, Exploring Programmable Self-Assembly in Non DNA-based Molecular Computing]
  • 34. Robustness and pervasiveness. Concentration changes preserving behaviour: Output parameters that have the highest impact can be tested in silico before experiments in materio. [G. Terrazas, H. Zenil and N. Krasnogor, Exploring Programmable Self-Assembly in Non DNA-based Molecular Computing]
  • 35. Orthogonality. Specific concentrations producing certain behaviour using the mathematical model to be tested against empirical data.
  • 36. Highlights and goals. Ultimate goal (a few years’ time): an information-theoretical toolbox for systems and synthetic biology [Complex3D Proteins Database (graph representation) & Z Chen et al. Lung cancer pathways in response to treatments.] Pushing boundaries. A cutting-edge mathematical approach. Tools from Complexity theory.
  • 37. New Generation Sequence data analysis. Heavily driven by: explosion of experimental data; difficulties in data interpretation; new paradigms for knowledge extraction; data mining the behaviour of natural systems. Towards an AIT tool-kit for systems biology, a functional library of programmable biological modules with an SBML interface.
  • 38. J.-P. Delahaye and H. Zenil, On the Kolmogorov-Chaitin complexity for short sequences, in Cristian Calude (ed.), Complexity and Randomness: From Leibniz to Chaitin, World Scientific, 2007.
    J.-P. Delahaye and H. Zenil, Numerical Evaluation of the Complexity of Short Strings, Applied Mathematics and Computation, 2011.
    H. Zenil, F. Soler-Toscano, J.-P. Delahaye and N. Gauvrit, Two-Dimensional Kolmogorov Complexity and Validation of the Coding Theorem Method by Compressibility, arXiv:1212.6745 [cs.CC].
    F. Soler-Toscano, H. Zenil, J.-P. Delahaye and N. Gauvrit, Correspondence and Independence of Numerical Evaluations of Algorithmic Information Measures, Numerical Algorithms (in 2nd revision).
    F. Soler-Toscano, H. Zenil, J.-P. Delahaye and N. Gauvrit, Calculating Kolmogorov Complexity from the Frequency Output Distributions of Small Turing Machines, arXiv:1211.1302 [cs.IT].
    H. Zenil, Compression-based Investigation of the Dynamical Properties of Cellular Automata and Other Systems, Complex Systems, Vol. 19, No. 1, pages 1-28, 2010.
  • 39. H. Zenil and J.A.R. Marshall, Some Aspects of Computation Essential to Evolution and Life, Ubiquity, 2012.
    H. Zenil, What is Nature-like Computation? A Behavioural Approach and a Notion of Programmability, Philosophy & Technology (special issue on History and Philosophy of Computing), 2013.
    H. Zenil, On the Dynamic Qualitative Behavior of Universal Computation, Complex Systems, Vol. 20, No. 3, pages 265-278, 2012.
    H. Zenil, A Turing Test-Inspired Approach to Natural Computation, in G. Primiero and L. De Mol (eds.), Turing in Context II (Brussels, 10-12 October 2012), Historical and Contemporary Research in Logic, Computing Machinery and Artificial Intelligence, Proceedings published by the Royal Flemish Academy of Belgium for Science and Arts, 2013.
    G.J. Chaitin, A Theory of Program Size Formally Identical to Information Theory, J. Assoc. Comput. Mach. 22, 329-340, 1975.
    A.N. Kolmogorov, Three approaches to the quantitative definition of information, Problems of Information Transmission, 1(1):1–7, 1965.
  • 40. L. Levin, Laws of information conservation (non-growth) and aspects of the foundation of probability theory, Problems of Information Transmission, 10(3):206–210, 1974.
    M. Li and P. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, Springer, 3rd ed., 2008.
    R.J. Solomonoff, A formal theory of inductive inference: Parts 1 and 2, Information and Control, 7:1–22 and 224–254, 1964.
    S. Wolfram, A New Kind of Science, Wolfram Media, 2002.