SlideShare ist ein Scribd-Unternehmen logo
1 von 59
Downloaden Sie, um offline zu lesen
Large-scale structure of metabolic networks:
    Role of Biochemical and Functional
                constraints




                                Areejit Samal
  Laboratoire de Physique Théorique et Modèles Statistiques (LPTMS), CNRS and Univ
                                Paris-Sud, Orsay, France
                                          and
        Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany


                                 Areejit Samal
Outline

 Architectural features of metabolic networks

 Issues with existing null models for metabolic networks

 Our method to generate realistic randomized metabolic
  networks using Markov Chain Monte Carlo (MCMC)
  sampling and Flux Balance Analysis (FBA)

 Biological function is a main driver of large-scale structure in
  metabolic networks




                           Areejit Samal
Architectural features of real-world networks

        Small-world                                    Scale-free




Watts and Strogatz, Nature (1998)           Barabasi and Albert, Science (1999)
Small Average Path Length                    Power law degree distribution
 & High Local Clustering




Scale-free graphs can be generated using a simple model incorporating:
Growth & Preferential attachment scheme

                                    Areejit Samal
Architectural features of real-world networks

        Small-world                                    Scale-free




Watts and Strogatz, Nature (1998)           Barabasi and Albert, Science (1999)
Small Average Path Length                    Power law degree distribution
 & High Local Clustering




Scale-free graphs can be generated using a simple model incorporating:
Growth & Preferential attachment scheme

                                    Areejit Samal
Large-scale structure of metabolic networks

         Small-world, Scale-free and Hierarchical Organization




     Wagner & Fell (2001)             Jeong et al (2000)        Ravasz et al (2002)
Small Average Path Length, High Local Clustering, Power-law degree distribution

                                Bow-tie architecture

                                     Directed graph with a giant component of
                                     strongly connected nodes along with associated
                                     IN and OUT component

                                     Bow-tie architecture of the metabolism is similar
                                     to that found for the WWW by Broder et al.

Ma and Zeng, Bioinformatics (2003)

                                     Areejit Samal
Large-scale structure of metabolic networks

         Small-world, Scale-free and Hierarchical Organization




     Wagner & Fell (2001)             Jeong et al (2000)        Ravasz et al (2002)
Small Average Path Length, High Local Clustering, Power-law degree distribution

                                Bow-tie architecture

                                     Directed graph with a giant component of
                                     strongly connected nodes along with associated
                                     IN and OUT component

                                     Bow-tie architecture of the metabolism is similar
                                     to that found for the WWW by Broder et al.

Ma and Zeng, Bioinformatics (2003)

                                     Areejit Samal
Network motifs and Evolutionary design


                                                 Sub-graphs that are over-
                                                 represented in real networks
                                                 compared to randomized
                                                 networks with same degree
                                                 sequence.



Certain network motifs were shown to
perform important information processing
tasks.


For example, the coherent Feed Forward
Loop (FFL) can be used to detect
persistent signals.

                          Shen-Orr et al Nature Genetics (2002); Milo et al Science (2002,2004)

                                Areejit Samal
Network motifs and Evolutionary design


                                                 Sub-graphs that are over-
                                                 represented in real networks
                                                 compared to randomized
                                                 networks with same degree
                                                 sequence.



Certain network motifs were shown to
perform important information processing
tasks.


For example, the coherent Feed Forward
Loop (FFL) can be used to detect
persistent signals.

                          Shen-Orr et al Nature Genetics (2002); Milo et al Science (2002,2004)

                                Areejit Samal
“Null hypothesis” based approach for identifying design
             principles in real-world networks

Statistically significant network properties are
deduced as follows:
                                                          300




1) Measure the ‘chosen property’ (E.g.,                   250
                                                                                    Investigated
   motif significance profile, average                                              network
                                                          200

   clustering coefficient, etc.) in the




                                              Frequency
                                                                                         p-value
   investigated real network.                             150



                                                          100


2) Generate randomized networks with          50

   structure similar to the investigated real
   network using an appropriate null           0
                                                0.60            0.62   0.64       0.66     0.68    0.70

   model.                                                              Chosen Property




3) Use the distribution of the ‘chosen
   property’ for the randomized networks
   to estimate a p-value.


                                     Areejit Samal
Edge-exchange algorithm: commonly-used null model


                                                                     Randomized networks
                                                                     preserve the degree at
                                                                   each node and the degree
                                                                           sequence.


   Investigated Network




                                  After 1 exchange




                                                           After 2 exchanges


Reference:
Shen-Orr et al Nature Genetics (2002); Milo et al Science (2002,2004); Maslov & Sneppen Science (2002)

                                          Areejit Samal
Carefully posed null models are required for deducing
          design principles with confidence


                                     C. elegans Neuronal Network:
                                     Close by neurons have a greater
                                     chance of forming a connection
                                     than distant neurons




                                     A null model incorporating this
                                     spatial constraint gives different
                                     results compared to generic edge-
                                     exchange algorithm.




                                     Artzy-Randrup et al Science (2004)

                     Areejit Samal
Carefully posed null models are required for deducing
          design principles with confidence


                                     C. elegans Neuronal Network:
                                     Close by neurons have a greater
                                     chance of forming a connection
                                     than distant neurons




                                     A null model incorporating this
                                     spatial constraint gives different
                                     results compared to generic edge-
                                     exchange algorithm.




                                     Artzy-Randrup et al Science (2004)

                     Areejit Samal
Edge-exchange algorithm: case of bipartite metabolic
                          networks

        asp-L      cit




        ASPT      CITL




nh4     fum        ac     oaa


ASPT:      asp-L → fum + nh4
CITL:      cit → oaa + ac




                                Areejit Samal
Edge-exchange algorithm: case of bipartite metabolic
                          networks
                   cit                        asp-L        cit      Null-model used in
        asp-L
                                                                    many studies
                                                                    including: Guimera &
                                                                    Amaral Nature (2005);
                  CITL                        ASPT*       CITL*     Samal et al BMC
        ASPT
                                                                    Bioinformatics (2006)



                   ac     oaa          nh4     fum         ac     oaa
nh4     fum


ASPT:      asp-L → fum + nh4             ASPT*:      asp-L → ac + nh4
CITL:      cit → oaa + ac                CITL*:      cit → oaa + fum

Preserves degree of each node in the network
But generates fictitious reactions that violate
mass, charge and atomic balance satisfied by
real chemical reactions!!
Note that fum has 4 carbon atoms while ac has
2 carbon atoms in the example shown.

                                   Areejit Samal
Edge-exchange algorithm: case of bipartite metabolic
                          networks
                   cit                        asp-L        cit      Null-model used in
        asp-L
                                                                    many studies
                                                                    including: Guimera &
                                                                    Amaral Nature (2005);
                  CITL                        ASPT*       CITL*     Samal et al BMC
        ASPT
                                                                    Bioinformatics (2006)



                   ac     oaa          nh4     fum         ac     oaa
nh4     fum


ASPT:      asp-L → fum + nh4             ASPT*:      asp-L → ac + nh4
CITL:      cit → oaa + ac                CITL*:      cit → oaa + fum

Preserves degree of each node in the network
But generates fictitious reactions that violate
mass, charge and atomic balance satisfied by
real chemical reactions!!
Note that fum has 4 carbon atoms while ac has
2 carbon atoms in the example shown.

                                   Areejit Samal
Edge-exchange algorithm: case of bipartite metabolic
                          networks
                   cit                        asp-L        cit      Null-model used in
        asp-L
                                                                    many studies
                                                                    including: Guimera &
                                                                    Amaral Nature (2005);
                  CITL                        ASPT*        CITL*    Samal et al BMC
        ASPT
                                                                    Bioinformatics (2006)



                   ac     oaa          nh4     fum         ac      oaa
nh4     fum


ASPT:      asp-L → fum + nh4             ASPT*:      asp-L → ac + nh4
CITL:      cit → oaa + ac                CITL*:      cit → oaa + fum

Preserves degree of each node in the network
But generates fictitious reactions that violate       A common generic null model
mass, charge and atomic balance satisfied by          (edge exchange algorithm) is
real chemical reactions!!                             unsuitable for different types
                                                      of real-world networks.
Note that fum has 4 carbon atoms while ac has
2 carbon atoms in the example shown.

                                   Areejit Samal
Null models for metabolic networks: Blind-watchmaker network




                       Areejit Samal
Null models for metabolic networks: Blind-watchmaker network




                       Areejit Samal
Randomizing metabolic networks

We propose a new method using Markov Chain Monte Carlo (MCMC) sampling and
Flux Balance Analysis (FBA) to generate meaningful randomized ensembles for
metabolic networks by successively imposing biochemical and functional constraints.
     Ensemble R




                                                                                      Increasing level of constraints
   Given no. of valid                                                Constraint I
 biochemical reactions

    +            Ensemble Rm                                         Constraint II
           Fix the no. of metabolites


             +            Ensemble uRm                               Constraint III
                     Exclude Blocked Reactions


                          +             Ensemble uRm-V1
                                      Functional constraint of
                                                                     Constraint IV

                                  growth on specified environments



                                        Areejit Samal
Comparison with E. coli: Large-scale structural properties
                       of interest
In this work, we will generate randomized metabolic networks
using only reactions in KEGG and compare the structural
properties of randomized networks with those of E. coli.

We study the following large-scale structural properties:

•   Metabolite Degree distribution

•   Clustering Coefficient

•   Average Path Length                                                             Scale Free and Small World
                                                                   Ref: Jeong et al (2000) Wagner & Fell (2001)

•   Pc: Probability that a path exists between two nodes in
    the directed graph

•   Largest strongly component (LSC) and the union of
    LSC, IN and OUT components

                                                                                    Bow-tie architecture of the
    E. coli metabolic network has m (~750) metabolites                            Internet and metabolism
    and r (~900) reactions.                          Ref: Broder et al; Ma and Zeng, Bioinformatics (2003)



                                           Areejit Samal
glc-D                          HEX1: atp + glc-D → adp + g6p + h
                                                                                      atp
                                                                                            PGI: g6p ↔ f6p
Different Graph-theoretic Representations of the
                                                                                            PFK: atp + f6p → adp + fdp + h

                                                            HEX1                                  Currency               Other
                                                                                                  Metabolite             Metabolite


                                                                                                  Irreversible           Reversible
                                                             g6p                                                         Reaction
               Metabolic Network




                                                                                                  Reaction



                                                             PGI
                                                                                                                 glc-D



                                                              f6p
                                                                                                                 g6p



                                                             PFK
                                                                                                                 f6p



                                                   h          fdp            adp
                                                                                                                 fdp

                                                       (a) Bipartite Representation                 (b) Unipartite Representation
glc-D                          HEX1: atp + glc-D → adp + g6p + h
                                                                                      atp
                                                                                            PGI: g6p ↔ f6p
Different Graph-theoretic Representations of the
                                                                                            PFK: atp + f6p → adp + fdp + h

                                                            HEX1                                  Currency             Other
                                                                                                  Metabolite           Metabolite

                                                       Degree Distribution
                                                                                                  Irreversible         Reversible
                                                             g6p                                                       Reaction
               Metabolic Network




                                                                                                  Reaction



                                                             PGI
                                                                                                                glc-D
                                                                                                        Clustering coefficient
                                                                                                        Average Path Length
                                                                                                     Largest Strong Component
                                                              f6p
                                                                                                                 g6p



                                                             PFK
                                                                                                                 f6p



                                                   h          fdp            adp
                                                                                                                 fdp

                                                       (a) Bipartite Representation                 (b) Unipartite Representation
glc-D                          HEX1: atp + glc-D → adp + g6p + h
                                                                                      atp
                                                                                            PGI: g6p ↔ f6p
Different Graph-theoretic Representations of the
                                                                                            PFK: atp + f6p → adp + fdp + h

                                                            HEX1                                  Currency             Other
                                                                                                  Metabolite           Metabolite

                                                       Degree Distribution
                                                                                                  Irreversible         Reversible
                                                             g6p                                                       Reaction
               Metabolic Network




                                                                                                  Reaction



                                                             PGI
                                                                                                                glc-D
                                                                                                        Clustering coefficient
                                                                                                        Average Path Length
                                                                                                     Largest Strong Component
                                                              f6p
                                                                                                                 g6p



                                                        The values obtained for different network measures depend on the list
                                                             PFK
                                                                                                            f6p
                                                        of currency metabolites used to generate the unipartite graph of
                                                        metabolites starting from the bipartite graph. In this work, we have
                                                   h
                                                        usedfdpwell defined biochemically sound list of currency metabolites to
                                                              a           adp
                                                                                                            fdp
                                                        construct unipartite graph corresponding to each metabolic network.
                                                       (a) Bipartite Representation                 (b) Unipartite Representation
Constraint I: Given number of valid biochemical
                    reactions

                 Instead of generating fictitious reactions using
                edge exchange mechanism, we can restrict to
                >6000 valid reactions contained within KEGG
                database.

                 We use the database of 5870 mass balanced
                reactions derived from KEGG by Rodrigues and
                Wagner. For simplicity, I will refer to this database
                in the talk as ‘KEGG’




                   Areejit Samal
Global Reaction Set


 Nutrient metabolites                                       Biomass composition


The set of external                                        The E. coli biomass
metabolites in iJR904                                      composition of iJR904
were allowed for                                           was used to determine
uptake/excretion in the               +                    viability of genotypes.
global reaction set.


                                                           We can implement Flux
                                                           Balance Analysis (FBA)
                                                            within this database.


                          KEGG Database + E. coli iJR904




                                 Areejit Samal
Genome-scale metabolic models: Flux Balance Analysis (FBA)



List of metabolic reactions with
stoichiometric coefficients
                                                                                       Growth rate for
                                                  Flux Balance                         the given medium
Biomass composition                              Analysis (FBA)                        ‘Biomass Yield’

                                                                                       Fluxes of all
                                                                                       reactions
Set of nutrients in the
environment



Advantages
FBA does not require enzyme kinetic information which is not known for most reactions.

Disadvantages
FBA cannot predict internal metabolite concentrations and is restricted to steady states.
Basic models do not account for metabolic regulation.

           Reference: Varma and Palsson, Biotechnology (1994); Price et al Nature Reviews Microbiology (2004)

                                             Areejit Samal
Bit string representation of metabolic networks within the global
                               reaction set


               KEGG Database                              E. coli

               R1: 3pg + nad → 3php + h + nadh              1
                                                                    Present - 1
               R2: 6pgc + nadp → co2 + nadph + ru5p-D       0
               R3: 6pgc → 2ddg6p + h2o                      1       Absent - 0
               .                                            .
               .                                            .
               .                                            .
                                                   6000
                                                 C900
               RN: ---------------------------------        .
                                                                       Contains r reactions
               N = 5870 reactions within KEGG                          (1’s in the bit string)
               (or global reaction set)



The E. coli metabolic network or any random network of reactions within KEGG can be
represented as a bit string of length N with exactly r entries equal to 1.
r is the number of reactions in the E. coli metabolic network.




                                         Areejit Samal
Ensemble R: Random networks within KEGG with same number of
                        reactions as E. coli
Random subsets or networks with r reactions

                                                   To compare with E. coli, we generate an
                                                   ensemble R of random networks as
                               KEGG with
                                 N=5870            follows:
                                reactions
                                                   Pick random subsets with exactly r (~900)
                                                   reactions within KEGG database of N
                                                   (~6000) reactions.
                                                   r is the no. of reactions in E. coli.
       1    0        1     1
       0    0        1     0
       0    1        1     1                       Huge no. of networks possible within
       .    .        .     .                       KEGG with exactly r reactions
       .    .        .     .                          6000
                                                   ~ C900
       .    .        .     .
       .    .        .     .
                                                   Fixing the no. of reactions is equivalent to
  Each random network or bit string                fixing genome size.
  contains exactly r reactions (1’s in
  the string) like in E. coli.



                                            Areejit Samal
Metabolite degree distribution

        A degree of a metabolite is the number of reactions in which the metabolite participates in the
        network.
       100


       10-1


       10-2
P(k)




       10-3


       10-4

                  E. coli
       10-5


       10-6
              1         10       100   1000
                             k


            E. coli: = 2.17;
       Most organisms:  is 2.1-2.3
                  Reference:
               Jeong et al (2000);
              Wagner and Fell (2001)




                                                       Areejit Samal
Metabolite degree distribution

        A degree of a metabolite is the number of reactions in which the metabolite participates in the
        network.
                                                         100
       100

                                                         10-1
       10-1

                                                         10-2
       10-2




                                                  P(k)
P(k)




                                                         10-3
       10-3

                                                         10-4
       10-4
                                                                     E. coli
                  E. coli
                                                         10-5        Random
       10-5

                                                         10-6
       10-6
                                                                1         10       100   1000
              1         10       100   1000
                                                                               k
                             k


            E. coli: = 2.17;                                  Random networks in
       Most organisms:  is 2.1-2.3                             ensemble R: = 2.51
                  Reference:
               Jeong et al (2000);
              Wagner and Fell (2001)




                                                                    Areejit Samal
Metabolite degree distribution

        A degree of a metabolite is the number of reactions in which the metabolite participates in the
        network.
                                                            100                                           100
       100
                                                                                                          10-1
            -1                                              10-1
       10
                                                                                                          10-2
                                                            10-2
       10-2                                                                                               10-3




                                                                                                   P(k)
                                                     P(k)
P(k)




                                                            10-3                                          10-4
       10-3
                                                                                                          10-5
                                                            10-4
       10-4
                                                                        E. coli
                                                                                                          10-6        E. coli
                     E. coli                                                                                          Random
                                                            10-5        Random
       10-5                                                                                               10-7        KEGG


                                                            10-6                                          10-8
       10-6
                                                                   1         10       100   1000                 1       10         100   1000
                 1         10       100   1000
                                                                                  k                                             k
                                k


            E. coli: = 2.17;                                     Random networks in                                Complete KEGG:
       Most organisms:  is 2.1-2.3                                ensemble R: = 2.51                                 = 2.31
                     Reference:
                  Jeong et al (2000);
                 Wagner and Fell (2001)

                 Any (large enough) random subset of reactions in KEGG has Power Law degree
                                               distribution !!

                                                                       Areejit Samal
Reaction Degree Distribution

        The degree of a reaction is the number of metabolites that participate in it.

            3000
                                                   0 Currency
                                                   1 Currency
            2500                                   2 Currency
                                                   3 Currency                atp                  adp
                                                   4 Currency
                                                   5 Currency
            2000
                                                   6 Currency                            R
Frequency




                                                   7 Currency

            1500                                                              A                    B

            1000                                                     In each reaction, we have counted the number of
                                                                     currency and other metabolites which is shown in
            500                                                      the figure.
                                                                     Most reactions involve 4 metabolites of which 2
              0
                   0     2      4      6      8      10         12
                                                                     are currency metabolites and 2 are other
                       Number of substrates in a reaction
                                                                     metabolites.

        The reaction degree distribution is bell shaped with typical reaction in the network
        involving 4 metabolites.
        The nature of reaction degree distribution is very different from the metabolite
        degree distribution which follows a power law.

                                                                     Areejit Samal
Scale-free versus Scale-rich network:
            Origin of Power laws in metabolic networks
Tanaka and Doyle have suggested a classification of metabolites into three
categories based on their biochemical roles
(a) ‘Carriers’ (very high degree)               atp                    adp
(b) ‘Precursors’ (intermediate degree)
                                                             R
(c) ‘Others’ (low degree)
                                                     A                        B


                                           It has been argued that gene duplication
                                           and divergence at the level of protein
                                           interaction networks can lead to
                                           observed power laws in metabolic
                                           networks. However, the presence of
                                           ubiquitous (high degree) ‘currency’
                                           metabolites along with low degree
                                           ‘other’ metabolites in each reaction can
                                           explain the observed fat tail in the
                                           metabolite distribution.

                                  Reference: Tanaka (2005); Tanaka, Csete, and Doyle (2008)

                                   Areejit Samal
Network Properties: Given number of valid biochemical
                      reactions
                                                                                                        10



        0.10
                                                                                                         8


        0.08
                                                                                                         6




                                                                                                  <L>
  <C>




        0.06

                                                                                                         4
        0.04

                                                                                                         2
        0.02


        0.00                                                                                             0
               R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                       Clustering Coefficient                                                                                           Path Length
        0.8
                                                                                                  1.0
                                                                                                                 LSC
                                                                                                                 LSC + IN + OUT


        0.6                                                                                       0.8




                                                                              Fraction of nodes
                                                                                                  0.6
  Pc




        0.4

                                                                                                  0.4

        0.2
                                                                                                  0.2



        0.0                                                                                       0.0
               R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                   Probability that a path exists                                                                     Largest Strong Component
                       between two nodes


                                                                      Areejit Samal
Network Properties: Given number of valid biochemical
                      reactions
                                                                                                        10



        0.10
                                                                                                         8


        0.08
                                                                                                         6




                                                                                                  <L>
  <C>




        0.06

                                                                                                         4
        0.04

                                                                                                         2
        0.02


        0.00                                                                                             0
               R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                       Clustering Coefficient                                                                                           Path Length
        0.8
                                                                                                  1.0
                                                                                                                 LSC
                                                                                                                 LSC + IN + OUT


        0.6                                                                                       0.8




                                                                              Fraction of nodes
                                                                                                  0.6
  Pc




        0.4

                                                                                                  0.4

        0.2
                                                                                                  0.2



        0.0                                                                                       0.0
               R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                   Probability that a path exists                                                                     Largest Strong Component
                       between two nodes


                                                                      Areejit Samal
Constraint II: Fix the number of metabolites

Direct sampling of randomized networks with same no. of reactions and metabolites as in
E. coli is not possible.

                 Markov Chain Monte Carlo (MCMC) sampling of randomized networks


KEGG Database                            E. coli            Step 1            Step 2                   Ensemble RM

R1: 3pg + nad → 3php + h + nadh            1                  1                 1                     1              0
                                           0       Swap       1      Swap       0      106   Swaps    1 10 Swaps
                                                                                                          3
                                                                                                                     0
R2: 6pgc + nadp → co2 + nadph + ru5p-D
R3: 6pgc → 2ddg6p + h2o                    1                  0                 1                     1              1
.                                          0                  0                 1                     0              0
                                                                     Accept
.                                          1                  1                 0                     0              1
                                           .                  .                 .        Erase        .   Saving     .
.                                                                                      memory of
RN: ---------------------------------      .                  .                 .                     . Frequency    .
                                                                                         initial
N = 5870 reactions within KEGG                                                          network
       reaction universe                           Reject            Reject
                                                                                                     store          store
Reaction swap keeps the number of reactions in network constant

Accept/Reject Criterion of reaction swap:
Accept if the no. of metabolites in the new network is less than or equal to E. coli.


                                               Areejit Samal
Network Properties: After fixing number of metabolites
                                                                                                        10



        0.10
                                                                                                         8


        0.08
                                                                                                         6




                                                                                                  <L>
  <C>




        0.06

                                                                                                         4
        0.04

                                                                                                         2
        0.02


        0.00                                                                                             0
               R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                       Clustering Coefficient                                                                                           Path Length
        0.8
                                                                                                  1.0
                                                                                                                 LSC
                                                                                                                 LSC + IN + OUT


        0.6                                                                                       0.8




                                                                              Fraction of nodes
                                                                                                  0.6
  Pc




        0.4

                                                                                                  0.4

        0.2
                                                                                                  0.2



        0.0                                                                                       0.0
               R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                   Probability that a path exists                                                                     Largest Strong Component
                       between two nodes


                                                                      Areejit Samal
Network Properties: After fixing number of metabolites
                                                                                                        10



        0.10
                                                                                                         8


        0.08
                                                                                                         6




                                                                                                  <L>
  <C>




        0.06

                                                                                                         4
        0.04

                                                                                                         2
        0.02


        0.00                                                                                             0
               R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                       Clustering Coefficient                                                                                           Path Length
        0.8
                                                                                                  1.0
                                                                                                                 LSC
                                                                                                                 LSC + IN + OUT


        0.6                                                                                       0.8




                                                                              Fraction of nodes
                                                                                                  0.6
  Pc




        0.4

                                                                                                  0.4

        0.2
                                                                                                  0.2



        0.0                                                                                       0.0
               R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                   Probability that a path exists                                                                     Largest Strong Component
                       between two nodes


                                                                      Areejit Samal
Constraint III: Exclude Blocked Reactions


                    Large-scale metabolic networks typically have certain
                   dead-end or blocked reactions which cannot contribute to
     KEGG
                   growth of the organism.
Unblocked
  KEGG              Blocked reactions have zero flux for every investigated
                   chemical environment under any steady-state condition and
                   do not contribute to biomass growth flux.

                    We next exclude blocked reactions from our biochemical
                   universe used for sampling randomized networks.




                          For Blocked reactions, see Burgard et al Genome Research (2004)

                              Areejit Samal
MCMC Sampling: Exclusion of blocked reactions

                                                                                                   Ensemble uRM
KEGG Database                            E. coli            Step 1            Step 2

R1: 3pg + nad → 3php + h + nadh            1                  1                 1                   1              0
                                           0       Swap       1      Swap       0      106 Swaps    1 10 Swaps
                                                                                                        3
                                                                                                                   0
R2: 6pgc + nadp → co2 + nadph + ru5p-D
R3: 6pgc → 2ddg6p + h2o                    1                  0                 1                   1              1
.                                          0                  0                 1                   0              0
                                                                     Accept              Erase          Saving
.                                          1                  1                 0                   0              1
                                           .                  .                 .      memory of    . Frequency    .
.                                                                                        initial
RN: ---------------------------------      .                  .                 .       network
                                                                                                    .              .
N = 5870 reactions within KEGG
       reaction universe                           Reject            Reject
                                                                                                   store          store



Accept/Reject Criterion of reaction swap:

Restrict the reaction swaps to the set of unblocked reactions within KEGG.

Accept if
the no. of metabolites in the new network is less than or equal to E. coli.



                                               Areejit Samal
Network Properties: After excluding blocked reactions
                                                                                                        10



        0.10
                                                                                                         8


        0.08
                                                                                                         6




                                                                                                  <L>
  <C>




        0.06

                                                                                                         4
        0.04

                                                                                                         2
        0.02


        0.00                                                                                             0
               R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                       Clustering Coefficient                                                                                           Path Length
        0.8
                                                                                                  1.0
                                                                                                                 LSC
                                                                                                                 LSC + IN + OUT


        0.6                                                                                       0.8




                                                                              Fraction of nodes
                                                                                                  0.6
  Pc




        0.4

                                                                                                  0.4

        0.2
                                                                                                  0.2



        0.0                                                                                       0.0
               R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                   Probability that a path exists                                                                     Largest Strong Component
                       between two nodes


                                                                      Areejit Samal
Network Properties: After excluding blocked reactions
                                                                                                        10



        0.10
                                                                                                         8


        0.08
                                                                                                         6




                                                                                                  <L>
  <C>




        0.06

                                                                                                         4
        0.04

                                                                                                         2
        0.02


        0.00                                                                                             0
               R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                       Clustering Coefficient                                                                                           Path Length
        0.8
                                                                                                  1.0
                                                                                                                 LSC
                                                                                                                 LSC + IN + OUT


        0.6                                                                                       0.8




                                                                              Fraction of nodes
                                                                                                  0.6
  Pc




        0.4

                                                                                                  0.4

        0.2
                                                                                                  0.2



        0.0                                                                                       0.0
               R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                   Probability that a path exists                                                                     Largest Strong Component
                       between two nodes


                                                                      Areejit Samal
Constraint IV: Growth Phenotype

                                 Definition of Phenotype
Genotypes                            Ability to synthesize        Viability
                                     biomass components
0
1
1
.                                                               Viable
                                                               genotype
.
.                                            Flux                          Growth
.      GA                                  Balance
0                                          Analysis
                                                                          No Growth
0                                           (FBA)
1                                                              Unviable
.                                                              genotype
                          x
.
.                     Deleted

.      GB             reaction



    Bit string and equivalent network
       representation of genotypes


                                                     For FBA, see Price et al (2004) for a review

                                     Areejit Samal
MCMC Sampling: Growth phenotype in one environment

                                                                                                   Ensemble uRM-V1
KEGG Database                            E. coli            Step 1            Step 2

R1: 3pg + nad → 3php + h + nadh            1                  1                 1                   1              0
                                           0       Swap       1      Swap       0      106 Swaps    1 10 Swaps
                                                                                                        3
                                                                                                                   0
R2: 6pgc + nadp → co2 + nadph + ru5p-D
R3: 6pgc → 2ddg6p + h2o                    1                  0                 1                   1              1
.                                          0                  0                 1                   0              0
                                                                     Accept
.                                          1                  1                 0        Erase      0   Saving     1
.                                          .                  .                 .      memory of    . Frequency    .
RN: ---------------------------------      .                  .                 .        initial    .              .
                                                                                        network
N = 5870 reactions within KEGG
       reaction universe                           Reject            Reject
                                                                                                   store          store



Accept/Reject Criterion of reaction swap:

Restrict the reaction swaps to the set of unblocked reactions within KEGG.

Accept if
(a) No. of metabolites in the new network is less than or equal to E. coli
(b) New network is able to grow on the specified environment


                                               Areejit Samal
Network Properties: Growth phenotype in one
                      environment
                                                                                                      10



      0.10
                                                                                                       8


      0.08
                                                                                                       6




                                                                                                <L>
<C>




      0.06

                                                                                                       4
      0.04

                                                                                                       2
      0.02


      0.00                                                                                             0
             R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                     Clustering Coefficient                                                                                           Path Length
      0.8
                                                                                                1.0
                                                                                                               LSC
                                                                                                               LSC + IN + OUT


      0.6                                                                                       0.8




                                                                            Fraction of nodes
                                                                                                0.6
Pc




      0.4

                                                                                                0.4

      0.2
                                                                                                0.2



      0.0                                                                                       0.0
             R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                 Probability that a path exists                                                                     Largest Strong Component
                     between two nodes


                                                                    Areejit Samal
Network Properties: Growth phenotype in one
                      environment
                                                                                                      10



      0.10
                                                                                                       8


      0.08
                                                                                                       6




                                                                                                <L>
<C>




      0.06

                                                                                                       4
      0.04

                                                                                                       2
      0.02


      0.00                                                                                             0
             R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                     Clustering Coefficient                                                                                           Path Length
      0.8
                                                                                                1.0
                                                                                                               LSC
                                                                                                               LSC + IN + OUT


      0.6                                                                                       0.8




                                                                            Fraction of nodes
                                                                                                0.6
Pc




      0.4

                                                                                                0.4

      0.2
                                                                                                0.2



      0.0                                                                                       0.0
             R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                 Probability that a path exists                                                                     Largest Strong Component
                     between two nodes


                                                                    Areejit Samal
Network Properties: Growth phenotype in multiple
                 environments
                                                                                                      10



      0.10
                                                                                                       8


      0.08
                                                                                                       6




                                                                                                <L>
<C>




      0.06

                                                                                                       4
      0.04

                                                                                                       2
      0.02


      0.00                                                                                             0
             R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                     Clustering Coefficient                                                                                           Path Length
      0.8
                                                                                                1.0
                                                                                                               LSC
                                                                                                               LSC + IN + OUT


      0.6                                                                                       0.8




                                                                            Fraction of nodes
                                                                                                0.6
Pc




      0.4

                                                                                                0.4

      0.2
                                                                                                0.2



      0.0                                                                                       0.0
             R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                 Probability that a path exists                                                                     Largest Strong Component
                     between two nodes


                                                                    Areejit Samal
Network Properties: Growth phenotype in multiple
                 environments
                                                                                                      10



      0.10
                                                                                                       8


      0.08
                                                                                                       6




                                                                                                <L>
<C>




      0.06

                                                                                                       4
      0.04

                                                                                                       2
      0.02


      0.00                                                                                             0
             R       RM   uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                              R      RM        uRM     uRM-V1 uRM-V5 uRM-V10      E. coli



                     Clustering Coefficient                                                                                           Path Length
      0.8
                                                                                                1.0
                                                                                                               LSC
                                                                                                               LSC + IN + OUT


      0.6                                                                                       0.8




                                                                            Fraction of nodes
                                                                                                0.6
Pc




      0.4

                                                                                                0.4

      0.2
                                                                                                0.2



      0.0                                                                                       0.0
             R      RM    uRM   uRM-V1   uRM-V5 uRM-V10   E. coli                                          R         RM         uRM    uRM-V1   uRM-V5   uRM-V10   E. coli


                 Probability that a path exists                                                                     Largest Strong Component
                     between two nodes


                                                                    Areejit Samal
Global structural properties of real metabolic networks:
Consequence of simple biochemical and functional constraints




                                                       Increasing level of constraints
                                                                                              Conjectured by:
                                                                                         A. Wagner (2007) and
                                                                                          B. Papp et al. (2008)

                                   Samal and Martin, PLoS ONE (2011) 6(7): e22295

                       Areejit Samal
Global structural properties of real metabolic networks:
Consequence of simple biochemical and functional constraints




                                                       Increasing level of constraints
                                                                                              Conjectured by:
                                                                                         A. Wagner (2007) and
                                                                                          B. Papp et al. (2008)

                                   Samal and Martin, PLoS ONE (2011) 6(7): e22295

                       Areejit Samal
High level of genetic diversity in our randomized ensembles

 Any two random networks in our most constrained ensemble uRM-v10 differ in
 ~ 60% of their reactions.

   1                  1                           1                        1
   0                  0                           0                        0
   1                  0                           1                        0
   .                  .                           .        versus          .
          versus
   .                  .                           .                        .
   .                  .                           .                        .
   .                  .                           .                        .

E. coli       Random network in        Random network in            Random network in
              ensemble uRM-V10         ensemble uRM-V10             ensemble uRM-V10




   Hamming distance between the two networks is ~ 60% of the maximum
                       possible between two bit strings.
   There is a set of ~ 100 reactions which are present in all of our random
                                  networks.

                                  Areejit Samal
Necessity of MCMC sampling

                                                                                                                   Ensemble uRm-V10

                                                                                                                   Ensemble uRm-V5
                                                                                                                   Ensemble uRm-V1
                                                                                                                   Ensemble uRm
                                                                                                                   Ensemble Rm


                                                                                                                   Ensemble R
                                                                                                           Embedded sets like Russian dolls
                                                                                                   100




                                                           Fraction of genotypes that are viable
                                                                                                   10-5



                                                                                                   10-10




Including the viability constraint on the first
                                                                                                   10-15

                                                                                                                                                Glucose

chemical environment leads to a reduction by at                                                    10-20
                                                                                                                                                Acetate
                                                                                                                                                Succinate
                                                                                                                                                Analytic prefactor
least a factor 1022
                                                                                                           2000    2200       2400       2600              2800

      Reference: Samal et al, BMC Systems Biology (2010)                                                          Number of reactions in genotype



                                        Areejit Samal
Summary


       We have proposed a new method based
        on MCMC sampling and FBA to
        generate randomized metabolic
        networks by successively imposing few
        macroscopic biochemical and functional
        constraints into our sampling criteria.

       By imposing a few macroscopic
        constraints into our sampling criteria, we
        were able to approach the properties of
        real metabolic networks.

       We conclude that biological function is a
        main driver of structure in metabolic
        networks.




Areejit Samal
Limitations and Future Outlook

 We can extend the study using the SEED database to many organisms.

                                            Automated reconstruction of
                                            metabolic models for >140 organisms
                                            on which FBA can be performed.


 The list of reactions in KEGG is incomplete and its use may reflect evolutionary
  constraints associated with natural organisms.
 From a synthetic biology context, there could be many more reactions that can
  occur which are not in KEGG.
 Alternate approach proposed by Basler et al, Bioinformatics, 2011 which
  generates new mass-balanced reactions but does not impose functionality
  constraint.
                                                         Generate new mass-balanced
                                                         reactions but no constraint of
                                                         biomass production.



                                   Areejit Samal
Genotype networks: A many-to-one genotype to phenotype map


                   RNA Sequence                      Protein Sequence              Gene network
Genotype
                              folding




Phenotype



               Secondary structure                   Three dimensional            Gene expression
                   of pairings                           structure                    levels

Reference:      Fontana & Schuster (1998)            Lipman & Wilbur (1991)   Ciliberti, Martin & Wagner (2007)


Genotype network refers to the set of (many) genotypes that have a given phenotype,
and their “organization” in genotype space. In the context of RNA, genotype networks
are commonly referred to as Neutral networks.


                                            Areejit Samal
Properties of the genotype-to-phenotype map

 Are there many genotypes that have a given
phenotype?

 Do the genotypes with a given phenotype form a
significant fraction of all possible genotypes?

 Are all the genotypes with a given phenotype rather
similar?

 Are biological genotypes atypically robust
compared to random genotypes with similar
phenotype?

 Are biological genotypes atypically innovative?

 Can genotypes change gradually in response to
selective pressures?
                                                    A. Wagner, Nat. Rev. Genet. (2008)

                                 Areejit Samal
Properties of the genotype-to-phenotype map

 Are there many genotypes that have a given
phenotype?

 Do the genotypes with a given phenotype form a
significant fraction of all possible genotypes?
        We have considered these questions in the context of metabolic
 Are all the genotypes with a given phenotype rather
similar?                             networks.

 Are biological genotypes atypically robust enzymes or reactions
              Genotype = lists of metabolic
compared to random genotypes with similar in a given environment
               Phenotype = growth or not
phenotype?

 Are biological genotypes atypically innovative?

 Can genotypes change gradually in response to
selective pressures?
                                                    A. Wagner, Nat. Rev. Genet. (2008)

                                 Areejit Samal
Computational tool for Evolutionary Systems Biology




                    Areejit Samal
Acknowledgement
                                                       Reference




   Olivier C. Martin
  Université Paris-Sud
                                                       Other Work




  Andreas Wagner      João Rodrigues                                Jürgen Jost
          University of Zurich                  Max Planck Institute for Mathematics in the Sciences
Genotype networks in metabolic reaction spaces
Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner*
BMC Systems Biology 2010, 4:30

Environmental versatility promotes modularity in genome-scale metabolic networks
Areejit Samal, Andreas Wagner* and Olivier C Martin*
BMC Systems Biology 2011, 5:135
                                                       Funding

                                   CNRS & Max Planck Society Joint Program in Systems Biology (CNRS MPG
                                   GDRE 513) for a Postdoctoral Fellowship.

                                                        Areejit Samal

Weitere ähnliche Inhalte

Was ist angesagt?

Brain Networks
Brain NetworksBrain Networks
Brain NetworksJimmy Lu
 
Evaluation of deep neural network architectures in the identification of bone...
Evaluation of deep neural network architectures in the identification of bone...Evaluation of deep neural network architectures in the identification of bone...
Evaluation of deep neural network architectures in the identification of bone...TELKOMNIKA JOURNAL
 
11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...
11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...
11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...Alexander Decker
 
5.a robust frame of wsn utilizing localization technique 36-46
5.a robust frame of wsn utilizing localization technique  36-465.a robust frame of wsn utilizing localization technique  36-46
5.a robust frame of wsn utilizing localization technique 36-46Alexander Decker
 
Final cnn shruthi gali
Final cnn shruthi galiFinal cnn shruthi gali
Final cnn shruthi galiSam Ram
 
Associative memory implementation with artificial neural networks
Associative memory implementation with artificial neural networksAssociative memory implementation with artificial neural networks
Associative memory implementation with artificial neural networkseSAT Publishing House
 
Auto Configuring Artificial Neural Paper Presentation
Auto Configuring Artificial Neural Paper PresentationAuto Configuring Artificial Neural Paper Presentation
Auto Configuring Artificial Neural Paper Presentationguestac67362
 
Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...
Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...
Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...cscpconf
 
Accurate and Energy-Efficient Range-Free Localization for Mobile Sensor Networks
Accurate and Energy-Efficient Range-Free Localization for Mobile Sensor NetworksAccurate and Energy-Efficient Range-Free Localization for Mobile Sensor Networks
Accurate and Energy-Efficient Range-Free Localization for Mobile Sensor Networksambitlick
 
Electron Microscopic Image Analysis
Electron Microscopic Image AnalysisElectron Microscopic Image Analysis
Electron Microscopic Image Analysisgumccomm
 
Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...
Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...
Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...IJECEIAES
 
Artifact Classification of fMRI Networks
Artifact Classification of fMRI NetworksArtifact Classification of fMRI Networks
Artifact Classification of fMRI NetworksVanessa S
 
A cell based clustering algorithm in large wireless sensor networks
A cell based clustering algorithm in large wireless sensor networksA cell based clustering algorithm in large wireless sensor networks
A cell based clustering algorithm in large wireless sensor networksambitlick
 
FACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRFFACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRFAras Masood
 
A new study of dss based on neural network and data mining
A new study of dss based on neural network and data miningA new study of dss based on neural network and data mining
A new study of dss based on neural network and data miningAttaporn Ninsuwan
 
WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...
WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...
WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...ijwmn
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 

Was ist angesagt? (20)

Brain Networks
Brain NetworksBrain Networks
Brain Networks
 
Evaluation of deep neural network architectures in the identification of bone...
Evaluation of deep neural network architectures in the identification of bone...Evaluation of deep neural network architectures in the identification of bone...
Evaluation of deep neural network architectures in the identification of bone...
 
11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...
11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...
11.0005www.iiste.org call for paper.a robust frame of wsn utilizing localizat...
 
5.a robust frame of wsn utilizing localization technique 36-46
5.a robust frame of wsn utilizing localization technique  36-465.a robust frame of wsn utilizing localization technique  36-46
5.a robust frame of wsn utilizing localization technique 36-46
 
Final cnn shruthi gali
Final cnn shruthi galiFinal cnn shruthi gali
Final cnn shruthi gali
 
Associative memory implementation with artificial neural networks
Associative memory implementation with artificial neural networksAssociative memory implementation with artificial neural networks
Associative memory implementation with artificial neural networks
 
Auto Configuring Artificial Neural Paper Presentation
Auto Configuring Artificial Neural Paper PresentationAuto Configuring Artificial Neural Paper Presentation
Auto Configuring Artificial Neural Paper Presentation
 
Functional Brain Networks - Javier M. Buldù
Functional Brain Networks - Javier M. BuldùFunctional Brain Networks - Javier M. Buldù
Functional Brain Networks - Javier M. Buldù
 
Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...
Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...
Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef...
 
Accurate and Energy-Efficient Range-Free Localization for Mobile Sensor Networks
Accurate and Energy-Efficient Range-Free Localization for Mobile Sensor NetworksAccurate and Energy-Efficient Range-Free Localization for Mobile Sensor Networks
Accurate and Energy-Efficient Range-Free Localization for Mobile Sensor Networks
 
Bc34333339
Bc34333339Bc34333339
Bc34333339
 
Electron Microscopic Image Analysis
Electron Microscopic Image AnalysisElectron Microscopic Image Analysis
Electron Microscopic Image Analysis
 
Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...
Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...
Reflectivity Parameter Extraction from RADAR Images Using Back Propagation Al...
 
Artifact Classification of fMRI Networks
Artifact Classification of fMRI NetworksArtifact Classification of fMRI Networks
Artifact Classification of fMRI Networks
 
A cell based clustering algorithm in large wireless sensor networks
A cell based clustering algorithm in large wireless sensor networksA cell based clustering algorithm in large wireless sensor networks
A cell based clustering algorithm in large wireless sensor networks
 
FACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRFFACE RECOGNITION USING ELM-LRF
FACE RECOGNITION USING ELM-LRF
 
A new study of dss based on neural network and data mining
A new study of dss based on neural network and data miningA new study of dss based on neural network and data mining
A new study of dss based on neural network and data mining
 
WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...
WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...
WEIGHTED DYNAMIC DISTRIBUTED CLUSTERING PROTOCOL FOR HETEROGENEOUS WIRELESS S...
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
2013 ictc toha_slide
2013 ictc toha_slide2013 ictc toha_slide
2013 ictc toha_slide
 

Ähnlich wie Areejit Samal Randomizing metabolic networks

Cornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 NetsCornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 NetsMark Gerstein
 
IRJET- Artificial Neural Network: Overview
IRJET-  	  Artificial Neural Network: OverviewIRJET-  	  Artificial Neural Network: Overview
IRJET- Artificial Neural Network: OverviewIRJET Journal
 
self operating maps
self operating mapsself operating maps
self operating mapsAltafSMT
 
Artificial Neural Network Seminar Report
Artificial Neural Network Seminar ReportArtificial Neural Network Seminar Report
Artificial Neural Network Seminar ReportTodd Turner
 
Exploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionExploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionYongsu Baek
 
Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...
Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...
Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...IRJET Journal
 
Exploring Architected Materials Using Machine Learning
Exploring Architected Materials Using Machine LearningExploring Architected Materials Using Machine Learning
Exploring Architected Materials Using Machine LearningAdvanced-Concepts-Team
 
=Acs07 tania experim= copy
=Acs07 tania experim= copy=Acs07 tania experim= copy
=Acs07 tania experim= copyLaura Cruz Reyes
 
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...Numenta
 
Analysis of Neocognitron of Neural Network Method in the String Recognition
Analysis of Neocognitron of Neural Network Method in the String RecognitionAnalysis of Neocognitron of Neural Network Method in the String Recognition
Analysis of Neocognitron of Neural Network Method in the String RecognitionIDES Editor
 

Ähnlich wie Areejit Samal Randomizing metabolic networks (20)

E04423133
E04423133E04423133
E04423133
 
Cornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 NetsCornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 Nets
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
IRJET- Artificial Neural Network: Overview
IRJET-  	  Artificial Neural Network: OverviewIRJET-  	  Artificial Neural Network: Overview
IRJET- Artificial Neural Network: Overview
 
International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI),International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI),
 
Kn2518431847
Kn2518431847Kn2518431847
Kn2518431847
 
Kn2518431847
Kn2518431847Kn2518431847
Kn2518431847
 
408 420
408 420408 420
408 420
 
As 7
As 7As 7
As 7
 
self operating maps
self operating mapsself operating maps
self operating maps
 
Artificial Neural Network Seminar Report
Artificial Neural Network Seminar ReportArtificial Neural Network Seminar Report
Artificial Neural Network Seminar Report
 
Exploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionExploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image Recognition
 
Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...
Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...
Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Ca...
 
Exploring Architected Materials Using Machine Learning
Exploring Architected Materials Using Machine LearningExploring Architected Materials Using Machine Learning
Exploring Architected Materials Using Machine Learning
 
=Acs07 tania experim= copy
=Acs07 tania experim= copy=Acs07 tania experim= copy
=Acs07 tania experim= copy
 
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...
Could A Model Of Predictive Voting Explain Many Long-Range Connections? by Su...
 
Analysis of Neocognitron of Neural Network Method in the String Recognition
Analysis of Neocognitron of Neural Network Method in the String RecognitionAnalysis of Neocognitron of Neural Network Method in the String Recognition
Analysis of Neocognitron of Neural Network Method in the String Recognition
 
G013124354
G013124354G013124354
G013124354
 
Matlab 2013 14 papers astract
Matlab 2013 14 papers astractMatlab 2013 14 papers astract
Matlab 2013 14 papers astract
 
PhD Defense
PhD DefensePhD Defense
PhD Defense
 

Mehr von Areejit Samal

Areejit Samal Emergence Alaska 2013
Areejit Samal Emergence Alaska 2013Areejit Samal Emergence Alaska 2013
Areejit Samal Emergence Alaska 2013Areejit Samal
 
Metabolic Network Analysis
Metabolic Network AnalysisMetabolic Network Analysis
Metabolic Network AnalysisAreejit Samal
 
Modularity in metabolic networks
Modularity in metabolic networksModularity in metabolic networks
Modularity in metabolic networksAreejit Samal
 
Areejit Samal Preferential Attachment in Catalytic Model
Areejit Samal Preferential Attachment in Catalytic ModelAreejit Samal Preferential Attachment in Catalytic Model
Areejit Samal Preferential Attachment in Catalytic ModelAreejit Samal
 
Areejit Samal Regulation
Areejit Samal RegulationAreejit Samal Regulation
Areejit Samal RegulationAreejit Samal
 
Randomizing genome-scale metabolic networks
Randomizing genome-scale metabolic networksRandomizing genome-scale metabolic networks
Randomizing genome-scale metabolic networksAreejit Samal
 
Areejit samal fba_tutorial
Areejit samal fba_tutorialAreejit samal fba_tutorial
Areejit samal fba_tutorialAreejit Samal
 

Mehr von Areejit Samal (7)

Areejit Samal Emergence Alaska 2013
Areejit Samal Emergence Alaska 2013Areejit Samal Emergence Alaska 2013
Areejit Samal Emergence Alaska 2013
 
Metabolic Network Analysis
Metabolic Network AnalysisMetabolic Network Analysis
Metabolic Network Analysis
 
Modularity in metabolic networks
Modularity in metabolic networksModularity in metabolic networks
Modularity in metabolic networks
 
Areejit Samal Preferential Attachment in Catalytic Model
Areejit Samal Preferential Attachment in Catalytic ModelAreejit Samal Preferential Attachment in Catalytic Model
Areejit Samal Preferential Attachment in Catalytic Model
 
Areejit Samal Regulation
Areejit Samal RegulationAreejit Samal Regulation
Areejit Samal Regulation
 
Randomizing genome-scale metabolic networks
Randomizing genome-scale metabolic networksRandomizing genome-scale metabolic networks
Randomizing genome-scale metabolic networks
 
Areejit samal fba_tutorial
Areejit samal fba_tutorialAreejit samal fba_tutorial
Areejit samal fba_tutorial
 

Kürzlich hochgeladen

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxAmanpreet Kaur
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 

Kürzlich hochgeladen (20)

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 

Areejit Samal Randomizing metabolic networks

  • 1. Large-scale structure of metabolic networks: Role of Biochemical and Functional constraints Areejit Samal Laboratoire de Physique Théorique et Modèles Statistiques (LPTMS), CNRS and Univ Paris-Sud, Orsay, France and Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany Areejit Samal
  • 2. Outline  Architectural features of metabolic networks  Issues with existing null models for metabolic networks  Our method to generate realistic randomized metabolic networks using Markov Chain Monte Carlo (MCMC) sampling and Flux Balance Analysis (FBA)  Biological function is a main driver of large-scale structure in metabolic networks Areejit Samal
  • 3. Architectural features of real-world networks Small-world Scale-free Watts and Strogatz, Nature (1998) Barabasi and Albert, Science (1999) Small Average Path Length Power law degree distribution & High Local Clustering Scale-free graphs can be generated using a simple model incorporating: Growth & Preferential attachment scheme Areejit Samal
  • 4. Architectural features of real-world networks Small-world Scale-free Watts and Strogatz, Nature (1998) Barabasi and Albert, Science (1999) Small Average Path Length Power law degree distribution & High Local Clustering Scale-free graphs can be generated using a simple model incorporating: Growth & Preferential attachment scheme Areejit Samal
  • 5. Large-scale structure of metabolic networks Small-world, Scale-free and Hierarchical Organization Wagner & Fell (2001) Jeong et al (2000) Ravasz et al (2002) Small Average Path Length, High Local Clustering, Power-law degree distribution Bow-tie architecture Directed graph with a giant component of strongly connected nodes along with associated IN and OUT component Bow-tie architecture of the metabolism is similar to that found for the WWW by Broder et al. Ma and Zeng, Bioinformatics (2003) Areejit Samal
  • 6. Large-scale structure of metabolic networks Small-world, Scale-free and Hierarchical Organization Wagner & Fell (2001) Jeong et al (2000) Ravasz et al (2002) Small Average Path Length, High Local Clustering, Power-law degree distribution Bow-tie architecture Directed graph with a giant component of strongly connected nodes along with associated IN and OUT component Bow-tie architecture of the metabolism is similar to that found for the WWW by Broder et al. Ma and Zeng, Bioinformatics (2003) Areejit Samal
  • 7. Network motifs and Evolutionary design Sub-graphs that are over- represented in real networks compared to randomized networks with same degree sequence. Certain network motifs were shown to perform important information processing tasks. For example, the coherent Feed Forward Loop (FFL) can be used to detect persistent signals. Shen-Orr et al Nature Genetics (2002); Milo et al Science (2002,2004) Areejit Samal
  • 8. Network motifs and Evolutionary design Sub-graphs that are over- represented in real networks compared to randomized networks with same degree sequence. Certain network motifs were shown to perform important information processing tasks. For example, the coherent Feed Forward Loop (FFL) can be used to detect persistent signals. Shen-Orr et al Nature Genetics (2002); Milo et al Science (2002,2004) Areejit Samal
  • 9. “Null hypothesis” based approach for identifying design principles in real-world networks Statistically significant network properties are deduced as follows: 300 1) Measure the ‘chosen property’ (E.g., 250 Investigated motif significance profile, average network 200 clustering coefficient, etc.) in the Frequency p-value investigated real network. 150 100 2) Generate randomized networks with 50 structure similar to the investigated real network using an appropriate null 0 0.60 0.62 0.64 0.66 0.68 0.70 model. Chosen Property 3) Use the distribution of the ‘chosen property’ for the randomized networks to estimate a p-value. Areejit Samal
  • 10. Edge-exchange algorithm: commonly-used null model Randomized networks preserve the degree at each node and the degree sequence. Investigated Network After 1 exchange After 2 exchanges Reference: Shen-Orr et al Nature Genetics (2002); Milo et al Science (2002,2004); Maslov & Sneppen Science (2002) Areejit Samal
  • 11. Carefully posed null models are required for deducing design principles with confidence C. elegans Neuronal Network: Close by neurons have a greater chance of forming a connection than distant neurons A null model incorporating this spatial constraint gives different results compared to generic edge- exchange algorithm. Artzy-Randrup et al Science (2004) Areejit Samal
  • 12. Carefully posed null models are required for deducing design principles with confidence C. elegans Neuronal Network: Close by neurons have a greater chance of forming a connection than distant neurons A null model incorporating this spatial constraint gives different results compared to generic edge- exchange algorithm. Artzy-Randrup et al Science (2004) Areejit Samal
  • 13. Edge-exchange algorithm: case of bipartite metabolic networks asp-L cit ASPT CITL nh4 fum ac oaa ASPT: asp-L → fum + nh4 CITL: cit → oaa + ac Areejit Samal
  • 14. Edge-exchange algorithm: case of bipartite metabolic networks cit asp-L cit Null-model used in asp-L many studies including: Guimera & Amaral Nature (2005); CITL ASPT* CITL* Samal et al BMC ASPT Bioinformatics (2006) ac oaa nh4 fum ac oaa nh4 fum ASPT: asp-L → fum + nh4 ASPT*: asp-L → ac + nh4 CITL: cit → oaa + ac CITL*: cit → oaa + fum Preserves degree of each node in the network But generates fictitious reactions that violate mass, charge and atomic balance satisfied by real chemical reactions!! Note that fum has 4 carbon atoms while ac has 2 carbon atoms in the example shown. Areejit Samal
  • 15. Edge-exchange algorithm: case of bipartite metabolic networks cit asp-L cit Null-model used in asp-L many studies including: Guimera & Amaral Nature (2005); CITL ASPT* CITL* Samal et al BMC ASPT Bioinformatics (2006) ac oaa nh4 fum ac oaa nh4 fum ASPT: asp-L → fum + nh4 ASPT*: asp-L → ac + nh4 CITL: cit → oaa + ac CITL*: cit → oaa + fum Preserves degree of each node in the network But generates fictitious reactions that violate mass, charge and atomic balance satisfied by real chemical reactions!! Note that fum has 4 carbon atoms while ac has 2 carbon atoms in the example shown. Areejit Samal
  • 16. Edge-exchange algorithm: case of bipartite metabolic networks cit asp-L cit Null-model used in asp-L many studies including: Guimera & Amaral Nature (2005); CITL ASPT* CITL* Samal et al BMC ASPT Bioinformatics (2006) ac oaa nh4 fum ac oaa nh4 fum ASPT: asp-L → fum + nh4 ASPT*: asp-L → ac + nh4 CITL: cit → oaa + ac CITL*: cit → oaa + fum Preserves degree of each node in the network But generates fictitious reactions that violate A common generic null model mass, charge and atomic balance satisfied by (edge exchange algorithm) is real chemical reactions!! unsuitable for different types of real-world networks. Note that fum has 4 carbon atoms while ac has 2 carbon atoms in the example shown. Areejit Samal
  • 17. Null models for metabolic networks: Blind-watchmaker network Areejit Samal
  • 18. Null models for metabolic networks: Blind-watchmaker network Areejit Samal
  • 19. Randomizing metabolic networks We propose a new method using Markov Chain Monte Carlo (MCMC) sampling and Flux Balance Analysis (FBA) to generate meaningful randomized ensembles for metabolic networks by successively imposing biochemical and functional constraints. Ensemble R Increasing level of constraints Given no. of valid Constraint I biochemical reactions + Ensemble Rm Constraint II Fix the no. of metabolites + Ensemble uRm Constraint III Exclude Blocked Reactions + Ensemble uRm-V1 Functional constraint of Constraint IV growth on specified environments Areejit Samal
  • 20. Comparison with E. coli: Large-scale structural properties of interest In this work, we will generate randomized metabolic networks using only reactions in KEGG and compare the structural properties of randomized networks with those of E. coli. We study the following large-scale structural properties: • Metabolite Degree distribution • Clustering Coefficient • Average Path Length Scale Free and Small World Ref: Jeong et al (2000) Wagner & Fell (2001) • Pc: Probability that a path exists between two nodes in the directed graph • Largest strongly component (LSC) and the union of LSC, IN and OUT components Bow-tie architecture of the E. coli metabolic network has m (~750) metabolites Internet and metabolism and r (~900) reactions. Ref: Broder et al; Ma and Zeng, Bioinformatics (2003) Areejit Samal
  • 21. glc-D HEX1: atp + glc-D → adp + g6p + h atp PGI: g6p ↔ f6p Different Graph-theoretic Representations of the PFK: atp + f6p → adp + fdp + h HEX1 Currency Other Metabolite Metabolite Irreversible Reversible g6p Reaction Metabolic Network Reaction PGI glc-D f6p g6p PFK f6p h fdp adp fdp (a) Bipartite Representation (b) Unipartite Representation
  • 22. glc-D HEX1: atp + glc-D → adp + g6p + h atp PGI: g6p ↔ f6p Different Graph-theoretic Representations of the PFK: atp + f6p → adp + fdp + h HEX1 Currency Other Metabolite Metabolite Degree Distribution Irreversible Reversible g6p Reaction Metabolic Network Reaction PGI glc-D Clustering coefficient Average Path Length Largest Strong Component f6p g6p PFK f6p h fdp adp fdp (a) Bipartite Representation (b) Unipartite Representation
  • 23. glc-D HEX1: atp + glc-D → adp + g6p + h atp PGI: g6p ↔ f6p Different Graph-theoretic Representations of the PFK: atp + f6p → adp + fdp + h HEX1 Currency Other Metabolite Metabolite Degree Distribution Irreversible Reversible g6p Reaction Metabolic Network Reaction PGI glc-D Clustering coefficient Average Path Length Largest Strong Component f6p g6p The values obtained for different network measures depend on the list PFK f6p of currency metabolites used to generate the unipartite graph of metabolites starting from the bipartite graph. In this work, we have h usedfdpwell defined biochemically sound list of currency metabolites to a adp fdp construct unipartite graph corresponding to each metabolic network. (a) Bipartite Representation (b) Unipartite Representation
  • 24. Constraint I: Given number of valid biochemical reactions  Instead of generating fictitious reactions using edge exchange mechanism, we can restrict to >6000 valid reactions contained within KEGG database.  We use the database of 5870 mass balanced reactions derived from KEGG by Rodrigues and Wagner. For simplicity, I will refer to this database in the talk as ‘KEGG’ Areejit Samal
  • 25. Global Reaction Set Nutrient metabolites Biomass composition The set of external The E. coli biomass metabolites in iJR904 composition of iJR904 were allowed for was used to determine uptake/excretion in the + viability of genotypes. global reaction set. We can implement Flux Balance Analysis (FBA) within this database. KEGG Database + E. coli iJR904 Areejit Samal
  • 26. Genome-scale metabolic models: Flux Balance Analysis (FBA) List of metabolic reactions with stoichiometric coefficients Growth rate for Flux Balance the given medium Biomass composition Analysis (FBA) ‘Biomass Yield’ Fluxes of all reactions Set of nutrients in the environment Advantages FBA does not require enzyme kinetic information which is not known for most reactions. Disadvantages FBA cannot predict internal metabolite concentrations and is restricted to steady states. Basic models do not account for metabolic regulation. Reference: Varma and Palsson, Biotechnology (1994); Price et al Nature Reviews Microbiology (2004) Areejit Samal
  • 27. Bit string representation of metabolic networks within the global reaction set KEGG Database E. coli R1: 3pg + nad → 3php + h + nadh 1 Present - 1 R2: 6pgc + nadp → co2 + nadph + ru5p-D 0 R3: 6pgc → 2ddg6p + h2o 1 Absent - 0 . . . . . . 6000 C900 RN: --------------------------------- . Contains r reactions N = 5870 reactions within KEGG (1’s in the bit string) (or global reaction set) The E. coli metabolic network or any random network of reactions within KEGG can be represented as a bit string of length N with exactly r entries equal to 1. r is the number of reactions in the E. coli metabolic network. Areejit Samal
  • 28. Ensemble R: Random networks within KEGG with same number of reactions as E. coli Random subsets or networks with r reactions To compare with E. coli, we generate an ensemble R of random networks as KEGG with N=5870 follows: reactions Pick random subsets with exactly r (~900) reactions within KEGG database of N (~6000) reactions. r is the no. of reactions in E. coli. 1 0 1 1 0 0 1 0 0 1 1 1 Huge no. of networks possible within . . . . KEGG with exactly r reactions . . . . 6000 ~ C900 . . . . . . . . Fixing the no. of reactions is equivalent to Each random network or bit string fixing genome size. contains exactly r reactions (1’s in the string) like in E. coli. Areejit Samal
  • 29. Metabolite degree distribution A degree of a metabolite is the number of reactions in which the metabolite participates in the network. 100 10-1 10-2 P(k) 10-3 10-4 E. coli 10-5 10-6 1 10 100 1000 k E. coli: = 2.17; Most organisms:  is 2.1-2.3 Reference: Jeong et al (2000); Wagner and Fell (2001) Areejit Samal
  • 30. Metabolite degree distribution A degree of a metabolite is the number of reactions in which the metabolite participates in the network. 100 100 10-1 10-1 10-2 10-2 P(k) P(k) 10-3 10-3 10-4 10-4 E. coli E. coli 10-5 Random 10-5 10-6 10-6 1 10 100 1000 1 10 100 1000 k k E. coli: = 2.17; Random networks in Most organisms:  is 2.1-2.3 ensemble R: = 2.51 Reference: Jeong et al (2000); Wagner and Fell (2001) Areejit Samal
  • 31. Metabolite degree distribution A degree of a metabolite is the number of reactions in which the metabolite participates in the network. 100 100 100 10-1 -1 10-1 10 10-2 10-2 10-2 10-3 P(k) P(k) P(k) 10-3 10-4 10-3 10-5 10-4 10-4 E. coli 10-6 E. coli E. coli Random 10-5 Random 10-5 10-7 KEGG 10-6 10-8 10-6 1 10 100 1000 1 10 100 1000 1 10 100 1000 k k k E. coli: = 2.17; Random networks in Complete KEGG: Most organisms:  is 2.1-2.3 ensemble R: = 2.51 = 2.31 Reference: Jeong et al (2000); Wagner and Fell (2001) Any (large enough) random subset of reactions in KEGG has Power Law degree distribution !! Areejit Samal
  • 32. Reaction Degree Distribution The degree of a reaction is the number of metabolites that participate in it. 3000 0 Currency 1 Currency 2500 2 Currency 3 Currency atp adp 4 Currency 5 Currency 2000 6 Currency R Frequency 7 Currency 1500 A B 1000 In each reaction, we have counted the number of currency and other metabolites which is shown in 500 the figure. Most reactions involve 4 metabolites of which 2 0 0 2 4 6 8 10 12 are currency metabolites and 2 are other Number of substrates in a reaction metabolites. The reaction degree distribution is bell shaped with typical reaction in the network involving 4 metabolites. The nature of reaction degree distribution is very different from the metabolite degree distribution which follows a power law. Areejit Samal
  • 33. Scale-free versus Scale-rich network: Origin of Power laws in metabolic networks Tanaka and Doyle have suggested a classification of metabolites into three categories based on their biochemical roles (a) ‘Carriers’ (very high degree) atp adp (b) ‘Precursors’ (intermediate degree) R (c) ‘Others’ (low degree) A B It has been argued that gene duplication and divergence at the level of protein interaction networks can lead to observed power laws in metabolic networks. However, the presence of ubiquitous (high degree) ‘currency’ metabolites along with low degree ‘other’ metabolites in each reaction can explain the observed fat tail in the metabolite distribution. Reference: Tanaka (2005); Tanaka, Csete, and Doyle (2008) Areejit Samal
  • 34. Network Properties: Given number of valid biochemical reactions 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 35. Network Properties: Given number of valid biochemical reactions 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 36. Constraint II: Fix the number of metabolites Direct sampling of randomized networks with same no. of reactions and metabolites as in E. coli is not possible. Markov Chain Monte Carlo (MCMC) sampling of randomized networks KEGG Database E. coli Step 1 Step 2 Ensemble RM R1: 3pg + nad → 3php + h + nadh 1 1 1 1 0 0 Swap 1 Swap 0 106 Swaps 1 10 Swaps 3 0 R2: 6pgc + nadp → co2 + nadph + ru5p-D R3: 6pgc → 2ddg6p + h2o 1 0 1 1 1 . 0 0 1 0 0 Accept . 1 1 0 0 1 . . . Erase . Saving . . memory of RN: --------------------------------- . . . . Frequency . initial N = 5870 reactions within KEGG network reaction universe Reject Reject store store Reaction swap keeps the number of reactions in network constant Accept/Reject Criterion of reaction swap: Accept if the no. of metabolites in the new network is less than or equal to E. coli. Areejit Samal
  • 37. Network Properties: After fixing number of metabolites 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 38. Network Properties: After fixing number of metabolites 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 39. Constraint III: Exclude Blocked Reactions  Large-scale metabolic networks typically have certain dead-end or blocked reactions which cannot contribute to KEGG growth of the organism. Unblocked KEGG  Blocked reactions have zero flux for every investigated chemical environment under any steady-state condition and do not contribute to biomass growth flux.  We next exclude blocked reactions from our biochemical universe used for sampling randomized networks. For Blocked reactions, see Burgard et al Genome Research (2004) Areejit Samal
  • 40. MCMC Sampling: Exclusion of blocked reactions Ensemble uRM KEGG Database E. coli Step 1 Step 2 R1: 3pg + nad → 3php + h + nadh 1 1 1 1 0 0 Swap 1 Swap 0 106 Swaps 1 10 Swaps 3 0 R2: 6pgc + nadp → co2 + nadph + ru5p-D R3: 6pgc → 2ddg6p + h2o 1 0 1 1 1 . 0 0 1 0 0 Accept Erase Saving . 1 1 0 0 1 . . . memory of . Frequency . . initial RN: --------------------------------- . . . network . . N = 5870 reactions within KEGG reaction universe Reject Reject store store Accept/Reject Criterion of reaction swap: Restrict the reaction swaps to the set of unblocked reactions within KEGG. Accept if the no. of metabolites in the new network is less than or equal to E. coli. Areejit Samal
  • 41. Network Properties: After excluding blocked reactions 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 42. Network Properties: After excluding blocked reactions 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 43. Constraint IV: Growth Phenotype Definition of Phenotype Genotypes Ability to synthesize Viability biomass components 0 1 1 . Viable genotype . . Flux Growth . GA Balance 0 Analysis No Growth 0 (FBA) 1 Unviable . genotype x . . Deleted . GB reaction Bit string and equivalent network representation of genotypes For FBA, see Price et al (2004) for a review Areejit Samal
  • 44. MCMC Sampling: Growth phenotype in one environment Ensemble uRM-V1 KEGG Database E. coli Step 1 Step 2 R1: 3pg + nad → 3php + h + nadh 1 1 1 1 0 0 Swap 1 Swap 0 106 Swaps 1 10 Swaps 3 0 R2: 6pgc + nadp → co2 + nadph + ru5p-D R3: 6pgc → 2ddg6p + h2o 1 0 1 1 1 . 0 0 1 0 0 Accept . 1 1 0 Erase 0 Saving 1 . . . . memory of . Frequency . RN: --------------------------------- . . . initial . . network N = 5870 reactions within KEGG reaction universe Reject Reject store store Accept/Reject Criterion of reaction swap: Restrict the reaction swaps to the set of unblocked reactions within KEGG. Accept if (a) No. of metabolites in the new network is less than or equal to E. coli (b) New network is able to grow on the specified environment Areejit Samal
  • 45. Network Properties: Growth phenotype in one environment 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 46. Network Properties: Growth phenotype in one environment 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 47. Network Properties: Growth phenotype in multiple environments 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 48. Network Properties: Growth phenotype in multiple environments 10 0.10 8 0.08 6 <L> <C> 0.06 4 0.04 2 0.02 0.00 0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Clustering Coefficient Path Length 0.8 1.0 LSC LSC + IN + OUT 0.6 0.8 Fraction of nodes 0.6 Pc 0.4 0.4 0.2 0.2 0.0 0.0 R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli R RM uRM uRM-V1 uRM-V5 uRM-V10 E. coli Probability that a path exists Largest Strong Component between two nodes Areejit Samal
  • 49. Global structural properties of real metabolic networks: Consequence of simple biochemical and functional constraints Increasing level of constraints Conjectured by: A. Wagner (2007) and B. Papp et al. (2008) Samal and Martin, PLoS ONE (2011) 6(7): e22295 Areejit Samal
  • 50. Global structural properties of real metabolic networks: Consequence of simple biochemical and functional constraints Increasing level of constraints Conjectured by: A. Wagner (2007) and B. Papp et al. (2008) Samal and Martin, PLoS ONE (2011) 6(7): e22295 Areejit Samal
  • 51. High level of genetic diversity in our randomized ensembles Any two random networks in our most constrained ensemble uRM-v10 differ in ~ 60% of their reactions. 1 1 1 1 0 0 0 0 1 0 1 0 . . . versus . versus . . . . . . . . . . . . E. coli Random network in Random network in Random network in ensemble uRM-V10 ensemble uRM-V10 ensemble uRM-V10 Hamming distance between the two networks is ~ 60% of the maximum possible between two bit strings. There is a set of ~ 100 reactions which are present in all of our random networks. Areejit Samal
  • 52. Necessity of MCMC sampling Ensemble uRm-V10 Ensemble uRm-V5 Ensemble uRm-V1 Ensemble uRm Ensemble Rm Ensemble R Embedded sets like Russian dolls 100 Fraction of genotypes that are viable 10-5 10-10 Including the viability constraint on the first 10-15 Glucose chemical environment leads to a reduction by at 10-20 Acetate Succinate Analytic prefactor least a factor 1022 2000 2200 2400 2600 2800 Reference: Samal et al, BMC Systems Biology (2010) Number of reactions in genotype Areejit Samal
  • 53. Summary  We have proposed a new method based on MCMC sampling and FBA to generate randomized metabolic networks by successively imposing few macroscopic biochemical and functional constraints into our sampling criteria.  By imposing a few macroscopic constraints into our sampling criteria, we were able to approach the properties of real metabolic networks.  We conclude that biological function is a main driver of structure in metabolic networks. Areejit Samal
  • 54. Limitations and Future Outlook  We can extend the study using the SEED database to many organisms. Automated reconstruction of metabolic models for >140 organisms on which FBA can be performed.  The list of reactions in KEGG is incomplete and its use may reflect evolutionary constraints associated with natural organisms.  From a synthetic biology context, there could be many more reactions that can occur which are not in KEGG.  Alternate approach proposed by Basler et al, Bioinformatics, 2011 which generates new mass-balanced reactions but does not impose functionality constraint. Generate new mass-balanced reactions but no constraint of biomass production. Areejit Samal
  • 55. Genotype networks: A many-to-one genotype to phenotype map RNA Sequence Protein Sequence Gene network Genotype folding Phenotype Secondary structure Three dimensional Gene expression of pairings structure levels Reference: Fontana & Schuster (1998) Lipman & Wilbur (1991) Ciliberti, Martin & Wagner (2007) Genotype network refers to the set of (many) genotypes that have a given phenotype, and their “organization” in genotype space. In the context of RNA, genotype networks are commonly referred to as Neutral networks. Areejit Samal
  • 56. Properties of the genotype-to-phenotype map  Are there many genotypes that have a given phenotype?  Do the genotypes with a given phenotype form a significant fraction of all possible genotypes?  Are all the genotypes with a given phenotype rather similar?  Are biological genotypes atypically robust compared to random genotypes with similar phenotype?  Are biological genotypes atypically innovative?  Can genotypes change gradually in response to selective pressures? A. Wagner, Nat. Rev. Genet. (2008) Areejit Samal
  • 57. Properties of the genotype-to-phenotype map  Are there many genotypes that have a given phenotype?  Do the genotypes with a given phenotype form a significant fraction of all possible genotypes? We have considered these questions in the context of metabolic  Are all the genotypes with a given phenotype rather similar? networks.  Are biological genotypes atypically robust enzymes or reactions Genotype = lists of metabolic compared to random genotypes with similar in a given environment Phenotype = growth or not phenotype?  Are biological genotypes atypically innovative?  Can genotypes change gradually in response to selective pressures? A. Wagner, Nat. Rev. Genet. (2008) Areejit Samal
  • 58. Computational tool for Evolutionary Systems Biology Areejit Samal
  • 59. Acknowledgement Reference Olivier C. Martin Université Paris-Sud Other Work Andreas Wagner João Rodrigues Jürgen Jost University of Zurich Max Planck Institute for Mathematics in the Sciences Genotype networks in metabolic reaction spaces Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner* BMC Systems Biology 2010, 4:30 Environmental versatility promotes modularity in genome-scale metabolic networks Areejit Samal, Andreas Wagner* and Olivier C Martin* BMC Systems Biology 2011, 5:135 Funding CNRS & Max Planck Society Joint Program in Systems Biology (CNRS MPG GDRE 513) for a Postdoctoral Fellowship. Areejit Samal