SlideShare ist ein Scribd-Unternehmen logo
1 von 34
Areejit Samal
International Centre for Theoretical
Physics
Trieste, ITALY
asamal@ictp.it
Phenotypic constraints drive the
architecture of biological networks
Work done @
CNRS, LPTMS, Univ Paris-Sud 11, Orsay, France
&
Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
Boehringer Mannheim
Biochemical Pathway Chart
Metabolic network
Metabolic network consists the set of biochemical reactions that convert nutrient
molecules into key molecules required for growth and maintenance of the cell.
Large-scale structure of metabolic networks is similar to
other real world networks
Ma and Zeng (2003);
Csete and Doyle (2004)
Wagner & Fell (2001) Jeong et al (2000) Ravasz et al (2002)
Small-world, Scale-free and Hierarchical Organization
Bow-tie architecture
Bow-tie architecture of the metabolism is similar to that
found in the WWW by Broder et al.
Small Average Path Length, High Local Clustering, Power-law degree distribution
Directed graph with a giant component of strongly
connected nodes along with associated IN and OUT
component.
 Structure of real networks is very different from random
(Erdős–Rényi) networks.
 Large-scale structure of biological networks is similar to
other kinds of real networks (social and technological
networks).
Reference: Shen-Orr et al (2002); Milo et al (2002,2004)
Local structure of biological networks: Network motifs
Sub-graphs that are over-represented in real
networks compared to randomized networks.
Certain network motifs were shown to perform
important information processing tasks.
E.g., the coherent Feed Forward Loop (FFL)
can be used to detect persistent signals.
The preponderance of certain sub-graphs in the real network may reflect evolutionary
design principles favored by selection.
Are the observed structural features of a metabolic
network ‘unusual’ or ‘atypical’?
 Extremely popular to study network measures such as degree
distribution, clustering coefficient, assortativity, motifs, etc. in biological
networks.
 But mere observation of scale-free, small-world or other striking
structural features in biological networks do not mean these properties
are non-obvious or atypical features of cellular networks.
 A distinction needs to be made between observation of these structural
properties in real networks and our understanding of the generative
principles that may have led to these properties in real networks.
Questions of Interest
 Are the observed structural properties of biological networks frozen
accidents?
 What are the adaptive features of a network? Which features of a
biological network are under selection and which are mere byproducts of
selection on other traits?
To answer these questions adequately, one needs proper controls or null
models of biological networks.
 Current controls are not well posed for biological
networks, especially, metabolic networks.
 Furthermore, proper controls should account for biological function.
What are the non-obvious or atypical properties of metabolic networks once
biological function is accounted in the control or null model?
Is the large-scale structure of E. coli metabolic network
atypical?
• Metabolite Degree distribution
• Clustering Coefficient
• Average Path Length
• Pc: Probability that a path exists between two nodes in
the directed graph
• Largest strongly component (LSC) and the union of
LSC, IN and OUT components
We decided to compare the following structural properties of E. coli
metabolic network with those in randomized metabolic networks:
Bow-tie architecture of the
Internet and metabolism
Ref: Broder et al (1999); Ma and Zeng (2003)
Scale Free and Small World
Ref: Jeong et al (2000); Wagner & Fell (2001)
The widely used null model to generate
randomized metabolic networks is not well posed
to answer this question!
Chosen Property
0.60 0.62 0.64 0.66 0.68 0.70
Frequency
0
50
100
150
200
250
300
Edge-randomization: Widely-used null model that does not
account for biological function
1) Measure the ‘chosen property’ in the
investigated real network.
2) Generate randomized networks with
structure similar to the investigated real
network using edge-randomization.
3) Use the distribution of the ‘chosen
property’ for the randomized networks to
estimate a p-value.
Investigated
network
p-value
Investigated Network After 1 exchange After 2 exchanges
Reference:
Milo et al
(2002,2004);
Maslov &
Sneppen (2002)
BUT this null model does not account for phenotypic or functional
constraints central to cellular networks!
Edge-randomization: Does not preserve mass balance constraint
and unsuitable for metabolic networks
ASPT CITL
fum
asp-L cit
ac oaanh4
ASPT* CITL*
fum
asp-L cit
ac oaanh4
ASPT: asp-L → fum + nh4
CITL: cit → oaa + ac
ASPT*: asp-L → ac + nh4
CITL*: cit → oaa + fum
Preserves degree of each node in the network but generates fictitious reactions
that violate mass, charge and atomic balance satisfied by real chemical
reactions!!
Note that fum has 4 carbon atoms while ac has 2 carbon atoms in the example shown.
Null-model used in
many studies
including: Guimera &
Amaral Nature (2005)
Biochemically meaningless randomization inappropriate for metabolic networks
We decided to develop a proper null model for metabolic networks that accounts
for basic biochemical and functional constraints.
Global reaction set ensures mass balance constraint
Limitation 1: Edge-randomization generates
randomized reactions which violate mass and atomic
balance.
Solution: We decided to overcome this problem by
limiting the set of reactions in random metabolic
networks to those within KEGG database.
KEGG Database + E. coli iJR904
We have use a curated
database of 5870 mass
balanced reactions derived
from KEGG by Rodrigues and
Wagner (2009).
Bit string representation of metabolic network
6000
900C
KEGG Database E. coli
R1: 3pg + nad → 3php + h + nadh
R2: 6pgc + nadp → co2 + nadph + ru5p-D
R3: 6pgc → 2ddg6p + h2o
.
.
.
RN: ---------------------------------
1
0
1
.
.
.
.
The E. coli metabolic network or any random network of reactions within KEGG
can be represented as a bit string of length N = 5870 reactions with exactly n
entries equal to 1 where n is the number of reactions in E. coli.
Present - 1
Absent - 0
N = 5870 reactions within KEGG
(or global reaction set)
Contains n reactions
(1‟s in the bit string)
Constraint-based Flux Balance Analysis (FBA): Determine the
functional capability of metabolic networks
List of metabolic reactions with
stoichiometric coefficients
Biomass composition
Set of nutrients in the
environment
Flux Balance
Analysis (FBA)
Growth rate for
the given medium
‘Biomass Yield’
Fluxes of all
reactions
Advantages
FBA does not require enzyme kinetic information which is not known for most reactions.
Disadvantages
FBA cannot predict internal metabolite concentrations and is restricted to steady states.
Basic models do not account for metabolic regulation.
Reference: Varma and Palsson, Biotechnology (1994); Price et al (2004)
Input
Genotype: Set of Metabolic Reactions
Environment: Set of nutrients available for uptake
Output
Phenotype: Ability to grow
Metabolic Genotype to Phenotype Map
0
1
1
.
.
.
.
0
0
1
.
.
.
.
GA
GB
x
Deleted
reaction
Flux
Balance
Analysis
(FBA)
Viable
genotype
Unviable
genotype
Bit string and equivalent network
representation of genotypes
Metabolic Genotypes Ability to synthesize Viability
biomass components
Growth
No Growth
Limitation 2: Edge-randomization does not account for biological function.
Solution: Flux Balance Analysis can be used to determine functionality of metabolic
networks.
Fraction of networks satisfying functional constraints
Number of reactions in genotype
2000 2200 2400 2600 2800
Fractionofgenotypesthatareviable
100
10-5
10-10
10-15
10-20
Glucose
Acetate
Succinate
Analytic prefactor
Answer:
To estimate this fraction:
• Generate random networks within KEGG
with exactly n reactions.
• Determine the fraction of networks that
can grow under glucose minimal media
using FBA.
Question: What fraction of possible metabolic networks within KEGG with exactly n
reactions can grow under glucose minimal media like E. coli?
Only a tiny fraction of possible networks satisfy functional constraints yet the
number of functional networks is huge !
Reference: Samal et al., BMC Systems Biology (2010)
Space of possible networks with number of reactions equal to E. coli ~ 101113
Space of possible networks with number of reactions equal to E. coli and desired
phenotype ~ 101050
Necessity for Markov Chain Monte Carlo (MCMC) Sampling
Space of possible networks ~ 101113
Space of networks with desired
phenotype ~ 101050
We wish to uniformly sample
the space of viable metabolic
networks within KEGG having
the same number of reactions
as E. coli.
But one cannot sample the
viable space by generating
random bit strings followed by a
test for phenotype.
We resort to Markov Chain
Monte Carlo (MCMC) method
to sample the space of
networks satisfying functional
constraints.
MCMC sampling of metabolic networks with growth
phenotype
Accept/Reject Criterion of reaction swap:
Accept if
(a) No. of metabolites in the new network is less than or equal to E. coli
(b) New network is able to grow on the specified environment(s)
KEGG Database
R1: 3pg + nad → 3php + h + nadh
R2: 6pgc + nadp → co2 + nadph + ru5p-D
R3: 6pgc → 2ddg6p + h2o
.
.
.
RN: ---------------------------------
1
0
1
0
1
.
.
N = 5870 reactions within KEGG
reaction universe
1
1
0
0
1
.
.
1
0
1
1
0
.
.
1
1
1
0
0
.
.
0
0
1
0
1
.
.
E. coli
Accept
Reject
Step 1
Swap Swap
Reject
105 Swaps
Step 2
103 Swaps
store store
Erase
memory of
initial
network
Saving
Frequency
A reaction swap can be decomposed into two mutations:
a) First mutation leading to deletion of reaction
b) Second mutation leading to addition of reaction
A reaction swap preserves the number of reactions in the randomized network.
Randomizing metabolic networks
We have developed a new method using Markov Chain Monte Carlo (MCMC)
sampling and Flux Balance Analysis (FBA) to generate meaningful randomized
ensembles for metabolic networks by successively imposing constraints.
Ensemble R
Given no. of valid
biochemical reactions
Ensemble RM
Fix the no. of metabolites
Ensemble uRM
Exclude Blocked Reactions
Ensemble uRM-V1
Functional constraint of
growth on specified environments
Increasinglevelofconstraints
Constraint I
Constraint II
Constraint III
Constraint IV
+
+
+
Reference: Samal et al., BMC Systems Biology (2010); Samal and Martin, PLoS ONE (2011)
Metabolite degree distribution
k
1 10 100 1000
P(k)
10-6
10-5
10-4
10-3
10-2
10-1
100
E. coli
Random
Sampled random networks are scale-free like E. coli !
Degree of a metabolite is the number of reactions in which the metabolite
participates in the network.
Clustering coefficient and Average Path Length
Sampled random networks have small-world
and hierarchical architecture!
R: Fixed no. of reactions
RM: Fixed no. of reactions and
metabolites
uRM: Fixed no. of unblocked reactions
and metabolites
uRM-V1: Fixed no. of unblocked reactions
and metabolites and viable in one
environment
uRM-V5: Fixed no. of unblocked reactions
and metabolites and viable in five
environments
uRM-V10: Fixed no. of unblocked reactions
and metabolites and viable in ten
environments
Probability that a path exists between two nodes and
Size of largest strong component
In a directed network, the may exist path from
node a to f but lack a path back from node f to a.
Probability that a path exists between two nodes is
an important quantity characterizing directed
networks.
A strongly connected component is a maximal set
of nodes such that for any pair of nodes a and b in
the set there is a directed path from a to b and
from b to a.
The size of largest strong component is an
important characteristic of directed networks.
Sampled random networks have a bow-tie architecture similar to real network!
Global structural properties of real metabolic networks are a
consequence of simple biochemical and functional constraints
Increasinglevelofconstraints
Reference: Samal and Martin, PLoS ONE (2011)
Conjectured by:
A. Wagner (2007) and
B. Papp et al. (2008)
Biological function is the main
driver of large-scale structure in
metabolic networks.
Additional constraints reduce the space of possible
networks
Reference: Samal et al, BMC Systems Biology (2010)
Ensemble R
Ensemble Rm
Ensemble uRm
Ensemble uRm-V1
Ensemble uRm-V5
Ensemble uRm-V10
Embedded sets like Russian dolls
Including the viability constraint on the first chemical environment leads to a
reduction by at least a factor 1050
High level of genetic diversity in our randomized ensembles
Any two random networks in our most constrained ensemble uRM-v10 differ in
~ 60% of their reactions.
1
0
1
.
.
.
.
1
0
0
.
.
.
.
E. coli Random network in
ensemble uRM-V10
versus
Hamming distance between the two networks is ~ 60% of the maximum
possible between two bit strings.
1
0
1
.
.
.
.
1
0
0
.
.
.
.
Random network in
ensemble uRM-V10
versus
Random network in
ensemble uRM-V10
Origins of Power law degree distribution: gene
duplication and divergence
Reference: Barabasi and Oltvai (2004)
In Barabasi and Albert model, scale-free networks
emerge through two basic mechanisms: growth
and preferential attachment.
It has been suggested that a combination of gene
duplication and divergence can explain power
laws observed in protein interaction networks and
this line of argument has been extended to explain
power laws in metabolic networks.
Reaction Degree Distribution
Number of substrates in a reaction
0 2 4 6 8 10 12
Frequency
0
500
1000
1500
2000
2500
3000
0 Currency
1 Currency
2 Currency
3 Currency
4 Currency
5 Currency
6 Currency
7 Currency
In each reaction, we have counted the number
of currency and other metabolites.
Most reactions involve 4 metabolites of which 2
are currency metabolites and 2 are other
metabolites.
The nature of reaction degree distribution is very different from the metabolite
degree distribution which follows a power law.
The degree of a reaction is the number of metabolites that participate in it.
The reaction degree distribution is bell shaped with typical reaction in the network
involving 4 metabolites.
R
atp
A
adp
B
Scale-free versus Scale-rich network:
Origin of Power laws in metabolic networks
Tanaka and Doyle have suggested a classification of metabolites into three
categories based on their biochemical roles
(a) „Carriers‟ (very high degree)
(b) „Precursors‟ (intermediate degree)
(c) „Others‟ (low degree)
Gene duplication and divergence at the
level of protein interaction networks has
been suggested as an explanation for
observed power laws in metabolic
networks.
However, the presence of ubiquitous
(high degree) „currency‟ metabolites
along with low degree „other‟
metabolites in each reaction can
alternatively explain the observed power
laws (or at least fat tail) in the
metabolite degree distribution.
Reference: Tanaka (2005); Tanaka, Csete, and Doyle (2008)
R
atp
A
adp
B
MCMC sampling method: A computational framework to
address questions in evolutionary systems biology
Reference: Papp et al (2008)
Genotype networks: A many-to-one genotype to phenotype map
Genotype
Phenotype
RNA Sequence
Secondary structure
of pairings
folding
Reference: Fontana & Schuster (1998) Lipman & Wilbur (1991) Ciliberti, Martin & Wagner (2007)
In the context of RNA, genotype networks are commonly referred to as Neutral networks.
We have studied the properties of the metabolic genotype-phenotype map using our
framework in detail.
Protein Sequence
Three dimensional
structure
Gene network
Gene expression
levels
Reference: Samal et al, BMC Systems Biology (2010)
Implications: Origins of Evolutionary Innovations
 “How do organisms maintain existing phenotype
while exploring for new and better adapted
phenotypes?”
 Darwin‟s theory explains the survival of the fittest
but does not explain the arrival of the fittest.
 Our MCMC simulations show that: "Starting with the
reference genotype, one can evolve to very different
genotypes with gradual small changes while
maintaining the existing phenotype“.
 Neighborhoods of different genotypes with a given
phenotype can have access to very different novel
phenotypes.
Properties of Metabolic Genotype to Phenotype map
 Any one phenotype is adopted by a vast number of genotypes.
 These genotypes form large connected sets in genotype space and it is
possible to reach any genotype in such a connected set from any other
genotype by a series of small genotypic changes.
 Any one genotype network typically extends far through genotype space.
 Individual genotypes on any one genotype network typically have
multiple neighbors with the same phenotype. Put differently, phenotypes
are to some extent robust to small changes in genotypes (such as
mutations).
 Different neighborhoods of the same genotype network contain very
different novel phenotypes.
Floral Organ Specification (FOS) gene regulatory network
We have developed a Markov Chain Monte Carlo (MCMC) method similar to the
case of metabolic networks to sample the space of functional gene regulatory
networks.
Phenotype: 10 steady-state attractors of
the network whose expression states
specify different cell types (shoot apical
meristem, sepal, petal, stamen and
carpel).
Genotype: Boolean Gene Regulatory
Network model for Arabidopsis Floral
Organ Specification network containing
15 genes connected by 46 edges.
Reference: Alvarez-Buylla et al, The Arabidopsis Book (2010)
Edge usage in real and sampled networks with functional
constraints
Reference: Henry, Moneger, Samal* and Martin*, Mol. Biosystems (2013)
Arabidopsis network Sampled networks with phenotype
constraints
Summary
 We have proposed a null model based on Markov Chain Monte Carlo (MCMC)
sampling to generate benchmark ensembles with desired phenotype for
metabolic networks.
 Our realistic benchmark ensembles can be used to distinguish between 'typical'
and 'atypical' properties of a network.
 We show that many large-scale structural properties of metabolic networks are
by-products of functional or phenotypic constraints.
 Modularity in metabolic networks can arise due to phenotypic constraints of
growth in many different environments.
 Our framework can be used to address many questions in evolutionary systems
biology.
Acknowledgements
Andreas Wagner João Rodrigues
Jürgen Jost
Olivier Martin Adrien Henry
Françoise Monéger

Weitere ähnliche Inhalte

Was ist angesagt?

Computational Analysis with ICM
Computational Analysis with ICMComputational Analysis with ICM
Computational Analysis with ICM
Vernon D Dutch Jr
 
Internship Report
Internship ReportInternship Report
Internship Report
Neha Gupta
 
Protein interaction networks
Protein interaction networksProtein interaction networks
Protein interaction networks
Lars Juhl Jensen
 

Was ist angesagt? (20)

Proteomics and protein-protein interaction
Proteomics  and protein-protein interactionProteomics  and protein-protein interaction
Proteomics and protein-protein interaction
 
Protein function prediction
Protein function predictionProtein function prediction
Protein function prediction
 
Protein protein interactions
Protein protein interactionsProtein protein interactions
Protein protein interactions
 
Integrating Markov Process with Structural Causal Modeling Enables Counterfac...
Integrating Markov Process with Structural Causal Modeling Enables Counterfac...Integrating Markov Process with Structural Causal Modeling Enables Counterfac...
Integrating Markov Process with Structural Causal Modeling Enables Counterfac...
 
Ppi
PpiPpi
Ppi
 
Digital Cells
Digital CellsDigital Cells
Digital Cells
 
Computational Analysis with ICM
Computational Analysis with ICMComputational Analysis with ICM
Computational Analysis with ICM
 
00003 Jc Silva 2006 Mcp V5n4p589
00003 Jc Silva 2006 Mcp V5n4p58900003 Jc Silva 2006 Mcp V5n4p589
00003 Jc Silva 2006 Mcp V5n4p589
 
Protein protein interaction basic
Protein protein interaction basicProtein protein interaction basic
Protein protein interaction basic
 
Introduction to systems biology – How systems work?
Introduction to systems biology – How systems work?Introduction to systems biology – How systems work?
Introduction to systems biology – How systems work?
 
NetBioSIG2013-Talk Robin Haw
NetBioSIG2013-Talk Robin Haw NetBioSIG2013-Talk Robin Haw
NetBioSIG2013-Talk Robin Haw
 
A Systems Biology Approach to Natural Products Research
A Systems Biology Approach to Natural Products ResearchA Systems Biology Approach to Natural Products Research
A Systems Biology Approach to Natural Products Research
 
Prediction of protein function
Prediction of protein functionPrediction of protein function
Prediction of protein function
 
protein-protein interaction
protein-protein  interactionprotein-protein  interaction
protein-protein interaction
 
Internship Report
Internship ReportInternship Report
Internship Report
 
Gene regulatory networks
Gene regulatory networksGene regulatory networks
Gene regulatory networks
 
Introduction to Proteogenomics
Introduction to Proteogenomics Introduction to Proteogenomics
Introduction to Proteogenomics
 
Slides 0
Slides 0Slides 0
Slides 0
 
Protein interaction networks
Protein interaction networksProtein interaction networks
Protein interaction networks
 
Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676
 

Andere mochten auch

Product optimization by scheduling formulation & optimal storage strategy...
Product optimization by scheduling formulation & optimal storage strategy...Product optimization by scheduling formulation & optimal storage strategy...
Product optimization by scheduling formulation & optimal storage strategy...
eSAT Journals
 
Bukovec_Melanie_ResearchPaper
Bukovec_Melanie_ResearchPaperBukovec_Melanie_ResearchPaper
Bukovec_Melanie_ResearchPaper
Melanie Bukovec
 
GAMS_Flyer(2)
GAMS_Flyer(2)GAMS_Flyer(2)
GAMS_Flyer(2)
J Milani
 
Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...
Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...
Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...
Neelima Sharma
 
Bioprocess development for enhanced spore production in shake flask and pilot...
Bioprocess development for enhanced spore production in shake flask and pilot...Bioprocess development for enhanced spore production in shake flask and pilot...
Bioprocess development for enhanced spore production in shake flask and pilot...
iosrjce
 
A comparison of the cell walls gram-positive and gram-negative medical imag...
A comparison of the cell walls   gram-positive and gram-negative medical imag...A comparison of the cell walls   gram-positive and gram-negative medical imag...
A comparison of the cell walls gram-positive and gram-negative medical imag...
Medical_PPT_Images
 
Lab 6 isolation of antibiotic producer from soil
Lab 6 isolation of antibiotic producer from soilLab 6 isolation of antibiotic producer from soil
Lab 6 isolation of antibiotic producer from soil
Hama Nabaz
 

Andere mochten auch (20)

Areejit samal fba_tutorial
Areejit samal fba_tutorialAreejit samal fba_tutorial
Areejit samal fba_tutorial
 
Product optimization by scheduling formulation & optimal storage strategy...
Product optimization by scheduling formulation & optimal storage strategy...Product optimization by scheduling formulation & optimal storage strategy...
Product optimization by scheduling formulation & optimal storage strategy...
 
Bukovec_Melanie_ResearchPaper
Bukovec_Melanie_ResearchPaperBukovec_Melanie_ResearchPaper
Bukovec_Melanie_ResearchPaper
 
GAMS_Flyer(2)
GAMS_Flyer(2)GAMS_Flyer(2)
GAMS_Flyer(2)
 
Improving strain design for biotechnology with constraint-based modelling.
Improving strain design for biotechnology with constraint-based modelling.Improving strain design for biotechnology with constraint-based modelling.
Improving strain design for biotechnology with constraint-based modelling.
 
Signal Processing Approach for Recognizing Identical Reads From DNA Sequencin...
Signal Processing Approach for Recognizing Identical Reads From DNA Sequencin...Signal Processing Approach for Recognizing Identical Reads From DNA Sequencin...
Signal Processing Approach for Recognizing Identical Reads From DNA Sequencin...
 
Introduction of GAMS Software
Introduction of GAMS SoftwareIntroduction of GAMS Software
Introduction of GAMS Software
 
Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...
Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...
Cloning in gram positive bacteria by neelima sharma,neelima.sharma60@gmail.co...
 
Bioprocess development for enhanced spore production in shake flask and pilot...
Bioprocess development for enhanced spore production in shake flask and pilot...Bioprocess development for enhanced spore production in shake flask and pilot...
Bioprocess development for enhanced spore production in shake flask and pilot...
 
A comparison of the cell walls gram-positive and gram-negative medical imag...
A comparison of the cell walls   gram-positive and gram-negative medical imag...A comparison of the cell walls   gram-positive and gram-negative medical imag...
A comparison of the cell walls gram-positive and gram-negative medical imag...
 
Materials Flux Analysis @myassignmenthelp.net
Materials Flux Analysis  @myassignmenthelp.netMaterials Flux Analysis  @myassignmenthelp.net
Materials Flux Analysis @myassignmenthelp.net
 
Detection of Non-Halal Ingredients_2014
Detection of Non-Halal Ingredients_2014Detection of Non-Halal Ingredients_2014
Detection of Non-Halal Ingredients_2014
 
Introduction to plant design economics
Introduction to plant design economicsIntroduction to plant design economics
Introduction to plant design economics
 
MOdelling CGE in GAMS
MOdelling CGE in GAMSMOdelling CGE in GAMS
MOdelling CGE in GAMS
 
Analysis of E. coli ancestral metabolic networks reveals past adaption - Tin ...
Analysis of E. coli ancestral metabolic networks reveals past adaption - Tin ...Analysis of E. coli ancestral metabolic networks reveals past adaption - Tin ...
Analysis of E. coli ancestral metabolic networks reveals past adaption - Tin ...
 
Global optimization
Global optimizationGlobal optimization
Global optimization
 
Ungs2050: ETHICS AND FIQH FOR EVERYDAY LIFE
Ungs2050: ETHICS AND FIQH FOR EVERYDAY LIFEUngs2050: ETHICS AND FIQH FOR EVERYDAY LIFE
Ungs2050: ETHICS AND FIQH FOR EVERYDAY LIFE
 
Lab soil
Lab soilLab soil
Lab soil
 
GAMS 2015 Vienna
GAMS 2015 ViennaGAMS 2015 Vienna
GAMS 2015 Vienna
 
Lab 6 isolation of antibiotic producer from soil
Lab 6 isolation of antibiotic producer from soilLab 6 isolation of antibiotic producer from soil
Lab 6 isolation of antibiotic producer from soil
 

Ähnlich wie Areejit Samal Emergence Alaska 2013

Bio process
Bio processBio process
Bio process
sun777
 
Bio process
Bio processBio process
Bio process
sun777
 
Knowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learningKnowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learning
jaumebp
 
Network motifs in integrated cellular networks of transcription–regulation an...
Network motifs in integrated cellular networks of transcription–regulation an...Network motifs in integrated cellular networks of transcription–regulation an...
Network motifs in integrated cellular networks of transcription–regulation an...
Samuel Sattath
 

Ähnlich wie Areejit Samal Emergence Alaska 2013 (20)

Homology modeling
Homology modelingHomology modeling
Homology modeling
 
Introduction to biocomputing
 Introduction to biocomputing Introduction to biocomputing
Introduction to biocomputing
 
Curveball Algorithm for Random Sampling of Protein Networks
Curveball Algorithm for Random Sampling of Protein NetworksCurveball Algorithm for Random Sampling of Protein Networks
Curveball Algorithm for Random Sampling of Protein Networks
 
Huwang-2-7.ppt
Huwang-2-7.pptHuwang-2-7.ppt
Huwang-2-7.ppt
 
A tutorial in Connectome Analysis (1) - Marcus Kaiser
A tutorial in Connectome Analysis (1) - Marcus KaiserA tutorial in Connectome Analysis (1) - Marcus Kaiser
A tutorial in Connectome Analysis (1) - Marcus Kaiser
 
Introduction to Network Medicine
Introduction to Network MedicineIntroduction to Network Medicine
Introduction to Network Medicine
 
Systems biology & Approaches of genomics and proteomics
 Systems biology & Approaches of genomics and proteomics Systems biology & Approaches of genomics and proteomics
Systems biology & Approaches of genomics and proteomics
 
NMR Chemical Shift Prediction by Atomic Increment-Based Algorithms
NMR Chemical Shift Prediction by Atomic Increment-Based AlgorithmsNMR Chemical Shift Prediction by Atomic Increment-Based Algorithms
NMR Chemical Shift Prediction by Atomic Increment-Based Algorithms
 
Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Bi...
Evolutionary Symbolic Discovery for Bioinformatics,  Systems and Synthetic Bi...Evolutionary Symbolic Discovery for Bioinformatics,  Systems and Synthetic Bi...
Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Bi...
 
Flux balance analysis
Flux balance analysisFlux balance analysis
Flux balance analysis
 
Modelling Functional Motions of Biological Systems by Customised Natural Moves.
Modelling Functional Motions of Biological Systems by Customised Natural Moves.Modelling Functional Motions of Biological Systems by Customised Natural Moves.
Modelling Functional Motions of Biological Systems by Customised Natural Moves.
 
Cornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 NetsCornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 Nets
 
Introduction to systems biology
Introduction to systems biologyIntroduction to systems biology
Introduction to systems biology
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
 
Bio process
Bio processBio process
Bio process
 
Bio process
Bio processBio process
Bio process
 
Dynamic complex formation during the yeast cell cycle
Dynamic complex formation during the yeast cell cycleDynamic complex formation during the yeast cell cycle
Dynamic complex formation during the yeast cell cycle
 
Knowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learningKnowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learning
 
Network motifs in integrated cellular networks of transcription–regulation an...
Network motifs in integrated cellular networks of transcription–regulation an...Network motifs in integrated cellular networks of transcription–regulation an...
Network motifs in integrated cellular networks of transcription–regulation an...
 
Protein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on RosettaProtein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on Rosetta
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Kürzlich hochgeladen (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Areejit Samal Emergence Alaska 2013

  • 1. Areejit Samal International Centre for Theoretical Physics Trieste, ITALY asamal@ictp.it Phenotypic constraints drive the architecture of biological networks Work done @ CNRS, LPTMS, Univ Paris-Sud 11, Orsay, France & Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  • 2. Boehringer Mannheim Biochemical Pathway Chart Metabolic network Metabolic network consists the set of biochemical reactions that convert nutrient molecules into key molecules required for growth and maintenance of the cell.
  • 3. Large-scale structure of metabolic networks is similar to other real world networks Ma and Zeng (2003); Csete and Doyle (2004) Wagner & Fell (2001) Jeong et al (2000) Ravasz et al (2002) Small-world, Scale-free and Hierarchical Organization Bow-tie architecture Bow-tie architecture of the metabolism is similar to that found in the WWW by Broder et al. Small Average Path Length, High Local Clustering, Power-law degree distribution Directed graph with a giant component of strongly connected nodes along with associated IN and OUT component.  Structure of real networks is very different from random (Erdős–Rényi) networks.  Large-scale structure of biological networks is similar to other kinds of real networks (social and technological networks).
  • 4. Reference: Shen-Orr et al (2002); Milo et al (2002,2004) Local structure of biological networks: Network motifs Sub-graphs that are over-represented in real networks compared to randomized networks. Certain network motifs were shown to perform important information processing tasks. E.g., the coherent Feed Forward Loop (FFL) can be used to detect persistent signals. The preponderance of certain sub-graphs in the real network may reflect evolutionary design principles favored by selection.
  • 5. Are the observed structural features of a metabolic network ‘unusual’ or ‘atypical’?  Extremely popular to study network measures such as degree distribution, clustering coefficient, assortativity, motifs, etc. in biological networks.  But mere observation of scale-free, small-world or other striking structural features in biological networks do not mean these properties are non-obvious or atypical features of cellular networks.  A distinction needs to be made between observation of these structural properties in real networks and our understanding of the generative principles that may have led to these properties in real networks.
  • 6. Questions of Interest  Are the observed structural properties of biological networks frozen accidents?  What are the adaptive features of a network? Which features of a biological network are under selection and which are mere byproducts of selection on other traits? To answer these questions adequately, one needs proper controls or null models of biological networks.  Current controls are not well posed for biological networks, especially, metabolic networks.  Furthermore, proper controls should account for biological function. What are the non-obvious or atypical properties of metabolic networks once biological function is accounted in the control or null model?
  • 7. Is the large-scale structure of E. coli metabolic network atypical? • Metabolite Degree distribution • Clustering Coefficient • Average Path Length • Pc: Probability that a path exists between two nodes in the directed graph • Largest strongly component (LSC) and the union of LSC, IN and OUT components We decided to compare the following structural properties of E. coli metabolic network with those in randomized metabolic networks: Bow-tie architecture of the Internet and metabolism Ref: Broder et al (1999); Ma and Zeng (2003) Scale Free and Small World Ref: Jeong et al (2000); Wagner & Fell (2001) The widely used null model to generate randomized metabolic networks is not well posed to answer this question!
  • 8. Chosen Property 0.60 0.62 0.64 0.66 0.68 0.70 Frequency 0 50 100 150 200 250 300 Edge-randomization: Widely-used null model that does not account for biological function 1) Measure the ‘chosen property’ in the investigated real network. 2) Generate randomized networks with structure similar to the investigated real network using edge-randomization. 3) Use the distribution of the ‘chosen property’ for the randomized networks to estimate a p-value. Investigated network p-value Investigated Network After 1 exchange After 2 exchanges Reference: Milo et al (2002,2004); Maslov & Sneppen (2002) BUT this null model does not account for phenotypic or functional constraints central to cellular networks!
  • 9. Edge-randomization: Does not preserve mass balance constraint and unsuitable for metabolic networks ASPT CITL fum asp-L cit ac oaanh4 ASPT* CITL* fum asp-L cit ac oaanh4 ASPT: asp-L → fum + nh4 CITL: cit → oaa + ac ASPT*: asp-L → ac + nh4 CITL*: cit → oaa + fum Preserves degree of each node in the network but generates fictitious reactions that violate mass, charge and atomic balance satisfied by real chemical reactions!! Note that fum has 4 carbon atoms while ac has 2 carbon atoms in the example shown. Null-model used in many studies including: Guimera & Amaral Nature (2005) Biochemically meaningless randomization inappropriate for metabolic networks We decided to develop a proper null model for metabolic networks that accounts for basic biochemical and functional constraints.
  • 10. Global reaction set ensures mass balance constraint Limitation 1: Edge-randomization generates randomized reactions which violate mass and atomic balance. Solution: We decided to overcome this problem by limiting the set of reactions in random metabolic networks to those within KEGG database. KEGG Database + E. coli iJR904 We have use a curated database of 5870 mass balanced reactions derived from KEGG by Rodrigues and Wagner (2009).
  • 11. Bit string representation of metabolic network 6000 900C KEGG Database E. coli R1: 3pg + nad → 3php + h + nadh R2: 6pgc + nadp → co2 + nadph + ru5p-D R3: 6pgc → 2ddg6p + h2o . . . RN: --------------------------------- 1 0 1 . . . . The E. coli metabolic network or any random network of reactions within KEGG can be represented as a bit string of length N = 5870 reactions with exactly n entries equal to 1 where n is the number of reactions in E. coli. Present - 1 Absent - 0 N = 5870 reactions within KEGG (or global reaction set) Contains n reactions (1‟s in the bit string)
  • 12. Constraint-based Flux Balance Analysis (FBA): Determine the functional capability of metabolic networks List of metabolic reactions with stoichiometric coefficients Biomass composition Set of nutrients in the environment Flux Balance Analysis (FBA) Growth rate for the given medium ‘Biomass Yield’ Fluxes of all reactions Advantages FBA does not require enzyme kinetic information which is not known for most reactions. Disadvantages FBA cannot predict internal metabolite concentrations and is restricted to steady states. Basic models do not account for metabolic regulation. Reference: Varma and Palsson, Biotechnology (1994); Price et al (2004) Input Genotype: Set of Metabolic Reactions Environment: Set of nutrients available for uptake Output Phenotype: Ability to grow
  • 13. Metabolic Genotype to Phenotype Map 0 1 1 . . . . 0 0 1 . . . . GA GB x Deleted reaction Flux Balance Analysis (FBA) Viable genotype Unviable genotype Bit string and equivalent network representation of genotypes Metabolic Genotypes Ability to synthesize Viability biomass components Growth No Growth Limitation 2: Edge-randomization does not account for biological function. Solution: Flux Balance Analysis can be used to determine functionality of metabolic networks.
  • 14. Fraction of networks satisfying functional constraints Number of reactions in genotype 2000 2200 2400 2600 2800 Fractionofgenotypesthatareviable 100 10-5 10-10 10-15 10-20 Glucose Acetate Succinate Analytic prefactor Answer: To estimate this fraction: • Generate random networks within KEGG with exactly n reactions. • Determine the fraction of networks that can grow under glucose minimal media using FBA. Question: What fraction of possible metabolic networks within KEGG with exactly n reactions can grow under glucose minimal media like E. coli? Only a tiny fraction of possible networks satisfy functional constraints yet the number of functional networks is huge ! Reference: Samal et al., BMC Systems Biology (2010) Space of possible networks with number of reactions equal to E. coli ~ 101113 Space of possible networks with number of reactions equal to E. coli and desired phenotype ~ 101050
  • 15. Necessity for Markov Chain Monte Carlo (MCMC) Sampling Space of possible networks ~ 101113 Space of networks with desired phenotype ~ 101050 We wish to uniformly sample the space of viable metabolic networks within KEGG having the same number of reactions as E. coli. But one cannot sample the viable space by generating random bit strings followed by a test for phenotype. We resort to Markov Chain Monte Carlo (MCMC) method to sample the space of networks satisfying functional constraints.
  • 16. MCMC sampling of metabolic networks with growth phenotype Accept/Reject Criterion of reaction swap: Accept if (a) No. of metabolites in the new network is less than or equal to E. coli (b) New network is able to grow on the specified environment(s) KEGG Database R1: 3pg + nad → 3php + h + nadh R2: 6pgc + nadp → co2 + nadph + ru5p-D R3: 6pgc → 2ddg6p + h2o . . . RN: --------------------------------- 1 0 1 0 1 . . N = 5870 reactions within KEGG reaction universe 1 1 0 0 1 . . 1 0 1 1 0 . . 1 1 1 0 0 . . 0 0 1 0 1 . . E. coli Accept Reject Step 1 Swap Swap Reject 105 Swaps Step 2 103 Swaps store store Erase memory of initial network Saving Frequency A reaction swap can be decomposed into two mutations: a) First mutation leading to deletion of reaction b) Second mutation leading to addition of reaction A reaction swap preserves the number of reactions in the randomized network.
  • 17. Randomizing metabolic networks We have developed a new method using Markov Chain Monte Carlo (MCMC) sampling and Flux Balance Analysis (FBA) to generate meaningful randomized ensembles for metabolic networks by successively imposing constraints. Ensemble R Given no. of valid biochemical reactions Ensemble RM Fix the no. of metabolites Ensemble uRM Exclude Blocked Reactions Ensemble uRM-V1 Functional constraint of growth on specified environments Increasinglevelofconstraints Constraint I Constraint II Constraint III Constraint IV + + + Reference: Samal et al., BMC Systems Biology (2010); Samal and Martin, PLoS ONE (2011)
  • 18. Metabolite degree distribution k 1 10 100 1000 P(k) 10-6 10-5 10-4 10-3 10-2 10-1 100 E. coli Random Sampled random networks are scale-free like E. coli ! Degree of a metabolite is the number of reactions in which the metabolite participates in the network.
  • 19. Clustering coefficient and Average Path Length Sampled random networks have small-world and hierarchical architecture! R: Fixed no. of reactions RM: Fixed no. of reactions and metabolites uRM: Fixed no. of unblocked reactions and metabolites uRM-V1: Fixed no. of unblocked reactions and metabolites and viable in one environment uRM-V5: Fixed no. of unblocked reactions and metabolites and viable in five environments uRM-V10: Fixed no. of unblocked reactions and metabolites and viable in ten environments
  • 20. Probability that a path exists between two nodes and Size of largest strong component In a directed network, the may exist path from node a to f but lack a path back from node f to a. Probability that a path exists between two nodes is an important quantity characterizing directed networks. A strongly connected component is a maximal set of nodes such that for any pair of nodes a and b in the set there is a directed path from a to b and from b to a. The size of largest strong component is an important characteristic of directed networks. Sampled random networks have a bow-tie architecture similar to real network!
  • 21. Global structural properties of real metabolic networks are a consequence of simple biochemical and functional constraints Increasinglevelofconstraints Reference: Samal and Martin, PLoS ONE (2011) Conjectured by: A. Wagner (2007) and B. Papp et al. (2008) Biological function is the main driver of large-scale structure in metabolic networks.
  • 22. Additional constraints reduce the space of possible networks Reference: Samal et al, BMC Systems Biology (2010) Ensemble R Ensemble Rm Ensemble uRm Ensemble uRm-V1 Ensemble uRm-V5 Ensemble uRm-V10 Embedded sets like Russian dolls Including the viability constraint on the first chemical environment leads to a reduction by at least a factor 1050
  • 23. High level of genetic diversity in our randomized ensembles Any two random networks in our most constrained ensemble uRM-v10 differ in ~ 60% of their reactions. 1 0 1 . . . . 1 0 0 . . . . E. coli Random network in ensemble uRM-V10 versus Hamming distance between the two networks is ~ 60% of the maximum possible between two bit strings. 1 0 1 . . . . 1 0 0 . . . . Random network in ensemble uRM-V10 versus Random network in ensemble uRM-V10
  • 24. Origins of Power law degree distribution: gene duplication and divergence Reference: Barabasi and Oltvai (2004) In Barabasi and Albert model, scale-free networks emerge through two basic mechanisms: growth and preferential attachment. It has been suggested that a combination of gene duplication and divergence can explain power laws observed in protein interaction networks and this line of argument has been extended to explain power laws in metabolic networks.
  • 25. Reaction Degree Distribution Number of substrates in a reaction 0 2 4 6 8 10 12 Frequency 0 500 1000 1500 2000 2500 3000 0 Currency 1 Currency 2 Currency 3 Currency 4 Currency 5 Currency 6 Currency 7 Currency In each reaction, we have counted the number of currency and other metabolites. Most reactions involve 4 metabolites of which 2 are currency metabolites and 2 are other metabolites. The nature of reaction degree distribution is very different from the metabolite degree distribution which follows a power law. The degree of a reaction is the number of metabolites that participate in it. The reaction degree distribution is bell shaped with typical reaction in the network involving 4 metabolites. R atp A adp B
  • 26. Scale-free versus Scale-rich network: Origin of Power laws in metabolic networks Tanaka and Doyle have suggested a classification of metabolites into three categories based on their biochemical roles (a) „Carriers‟ (very high degree) (b) „Precursors‟ (intermediate degree) (c) „Others‟ (low degree) Gene duplication and divergence at the level of protein interaction networks has been suggested as an explanation for observed power laws in metabolic networks. However, the presence of ubiquitous (high degree) „currency‟ metabolites along with low degree „other‟ metabolites in each reaction can alternatively explain the observed power laws (or at least fat tail) in the metabolite degree distribution. Reference: Tanaka (2005); Tanaka, Csete, and Doyle (2008) R atp A adp B
  • 27. MCMC sampling method: A computational framework to address questions in evolutionary systems biology Reference: Papp et al (2008)
  • 28. Genotype networks: A many-to-one genotype to phenotype map Genotype Phenotype RNA Sequence Secondary structure of pairings folding Reference: Fontana & Schuster (1998) Lipman & Wilbur (1991) Ciliberti, Martin & Wagner (2007) In the context of RNA, genotype networks are commonly referred to as Neutral networks. We have studied the properties of the metabolic genotype-phenotype map using our framework in detail. Protein Sequence Three dimensional structure Gene network Gene expression levels Reference: Samal et al, BMC Systems Biology (2010)
  • 29. Implications: Origins of Evolutionary Innovations  “How do organisms maintain existing phenotype while exploring for new and better adapted phenotypes?”  Darwin‟s theory explains the survival of the fittest but does not explain the arrival of the fittest.  Our MCMC simulations show that: "Starting with the reference genotype, one can evolve to very different genotypes with gradual small changes while maintaining the existing phenotype“.  Neighborhoods of different genotypes with a given phenotype can have access to very different novel phenotypes.
  • 30. Properties of Metabolic Genotype to Phenotype map  Any one phenotype is adopted by a vast number of genotypes.  These genotypes form large connected sets in genotype space and it is possible to reach any genotype in such a connected set from any other genotype by a series of small genotypic changes.  Any one genotype network typically extends far through genotype space.  Individual genotypes on any one genotype network typically have multiple neighbors with the same phenotype. Put differently, phenotypes are to some extent robust to small changes in genotypes (such as mutations).  Different neighborhoods of the same genotype network contain very different novel phenotypes.
  • 31. Floral Organ Specification (FOS) gene regulatory network We have developed a Markov Chain Monte Carlo (MCMC) method similar to the case of metabolic networks to sample the space of functional gene regulatory networks. Phenotype: 10 steady-state attractors of the network whose expression states specify different cell types (shoot apical meristem, sepal, petal, stamen and carpel). Genotype: Boolean Gene Regulatory Network model for Arabidopsis Floral Organ Specification network containing 15 genes connected by 46 edges. Reference: Alvarez-Buylla et al, The Arabidopsis Book (2010)
  • 32. Edge usage in real and sampled networks with functional constraints Reference: Henry, Moneger, Samal* and Martin*, Mol. Biosystems (2013) Arabidopsis network Sampled networks with phenotype constraints
  • 33. Summary  We have proposed a null model based on Markov Chain Monte Carlo (MCMC) sampling to generate benchmark ensembles with desired phenotype for metabolic networks.  Our realistic benchmark ensembles can be used to distinguish between 'typical' and 'atypical' properties of a network.  We show that many large-scale structural properties of metabolic networks are by-products of functional or phenotypic constraints.  Modularity in metabolic networks can arise due to phenotypic constraints of growth in many different environments.  Our framework can be used to address many questions in evolutionary systems biology.
  • 34. Acknowledgements Andreas Wagner João Rodrigues Jürgen Jost Olivier Martin Adrien Henry Françoise Monéger