Probabilistic refinement of cellular pathway models
Cambridge Statistical Laboratory
Networks seminar series
2009 Jan 21


Florian Markowetz
florian.markowetz@cancer.org.uk
What is a signaling pathway?
[Figure: a signaling pathway. Components shown, from top to bottom: environmental stimuli; receptor in cell membrane; protein cascade; transcription factors regulating target genes. Levels shown: protein, mRNA, DNA.]
Pathway reconstruction
Signaling pathways are important
- Deregulation causes many diseases incl. cancer
Signaling pathways are poorly understood
- Only parts lists are known
- Interactions within and between pathways are missing
Biological research
- So far mostly focused on individual genes
New genome-scale datasets
- Opportunity for data integration and novel methods
What data do we have?
Proteins:
- interactions between proteins
- binding to DNA
mRNA:
- expression under different stimuli
- binding to DNA
Sequence:
- binding motifs
- epigenetic marks
Morphology
Bulk of data: microarray
[Figure: the protein, mRNA, and DNA levels, annotated with the data types above]
Pathways as graphs
   • Nodes are (mostly) known
   • Goal: infer edges from data
   • Data are heterogeneous
Edges
   • co-expression between genes
   • interactions between proteins
   • binding motifs at genes
   • binding of proteins to DNA
Nodes
   • Protein domains
   • Functional annotation
Paths
   • Cause-effect data:
   • changing environments
   • experimental perturbations
Pathway reconstruction
“Classical” statistical approaches:
Treat the genes/proteins as random variables and
   explore correlation structure in the data:
   – Correlation graphs
   – Gaussian graphical models (partial correlation)
   – Bayesian networks

Challenges/Problems/Opportunities
1. Correlation may be uninformative
2. Integrate heterogeneous, noisy, and complementary data sources
Review: Markowetz and Spang (2007)
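The difference between a correlation graph and a Gaussian graphical model can be illustrated with a small simulation. The sketch below is not taken from the talk: it assumes a hypothetical three-variable chain x -> z -> y with Gaussian noise, and uses the standard first-order partial correlation formula. The variable names and noise levels are illustrative assumptions.

```python
import random
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical chain x -> z -> y (an assumption made for illustration only).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
z = [a + rng.gauss(0, 0.5) for a in x]
y = [b + rng.gauss(0, 0.5) for b in z]

r_xy = pearson(x, y)                  # marginal correlation: strong
r_xy_given_z = partial_corr(x, y, z)  # partial correlation: close to zero
print(f"corr(x, y) = {r_xy:.2f}, corr(x, y | z) = {r_xy_given_z:.2f}")
```

In this toy setting a correlation graph would draw an edge between x and y, while the partial correlation used by a Gaussian graphical model removes it, because z explains the dependence.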
– Part 1 –

Nested Effects Models
Experimental perturbations
[Figure: perturbations (drugs, small molecules, RNAi, stress, knockout) acting on the protein, mRNA, and DNA levels]
Readout:
Global gene expression measurements
Drosophila immune response
Columns: perturbed genes
Rows: effects on other genes

1. Silencing tak1 reduces expression of all LPS-inducible transcripts
2. Silencing rel (key) or mkk4/hep reduces expression of subsets of induced transcripts

(Boutros et al., Dev Cell 2002)
(!) Two types of entities

  1. Components of the signaling pathway, which are experimentally perturbed
  2. Downstream effect reporters
(!!) Only indirect information

No direct observation of perturbation effects on other pathway components!

Inference from observed perturbation effects on downstream reporters.
The information gap

Direct information: effects are visible at other pathway components
Indirect information: effects are only visible at downstream reporters, such as
   - cell survival or death
   - growth rate
   - downstream genes
[Figure: the same pathway with components A, B, C, D, shown once for each case]
Correlation won’t do
Pathway components (A, B, C, D): “classical” approach
   - Correlation
   - Graphical models: Bayes nets, GGMs
   - Mutual information
Downstream regulated genes: Nested Effects Models
Nested Effects Models
INPUT
   1. Set of candidate pathway genes
   2. High-dimensional phenotypic profile, e.g. microarray
OUTPUT
   Graph representation of information flow explaining the phenotypes
[Figure: gene perturbations A to H, their phenotypic profiles (effects), and the inferred pathway with nodes AB, CD, EF, GH]
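The idea of reading pathway structure from nested effects can be sketched in a few lines. This is a noise-free simplification, not the probabilistic scoring used by Nested Effects Models: it assumes each perturbation comes with a clean set of responding reporters and proposes "a upstream of b" whenever the effects of b are a proper subset of the effects of a. The gene and reporter names are hypothetical.

```python
def nested_edges(effects):
    """Return (upstream, downstream) pairs whose effect sets are strictly nested."""
    return sorted(
        (a, b)
        for a in effects
        for b in effects
        if a != b and effects[b] < effects[a]  # proper subset test
    )


# Hypothetical, noise-free effect sets: reporters responding to each perturbation.
effects = {
    "A": {"E1", "E2", "E3", "E4"},
    "B": {"E3", "E4"},
    "C": {"E4"},
}

edges = nested_edges(effects)
for upstream, downstream in edges:
    print(f"{upstream} -> {downstream}")
```

With real data the subset relations are blurred by false positives and false negatives, which is why a likelihood-based score is needed instead of exact set comparison.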
NEM: model formulation
[Figure: a model over pathway genes X, Y, Z with effect reporters E1, …, E6 attached; the expected effect pattern is shown next to the observed one, with false negatives (FN) and false positives (FP) marked]
Pathway genes: X, Y, Z
• core topology
• to be reconstructed
    = Model M
Effect reporters: E1, …, E6
• states are observed
    = Data D
• positions in pathway unknown
    = Parameters θ
Posterior: P( M | D ) = 1/Z · P( D | M ) · P( M ), where P( D | M ) is the marginal likelihood
Likelihood P( D | M, θ )

Compare predictions with observations:
[Figure: pathway with genes X, Y, Z and effect reporters E1, E2]
Prediction:      E1=0   E2=1
Observation 1:   E1=1   E2=1
Observation 2:   E1=0   E2=1

Error probabilities
        e.g. false NEG rate 20%, false POS rate 5%
 Lik = Pr( E1 = 1 | predicted 0 ) ⋅ Pr( E2 = 1 | predicted 1 ) ⋅ Pr( E1 = 0 | predicted 0 ) ⋅ Pr( E2 = 1 | predicted 1 )
     = 0.05 ⋅ 0.80 ⋅ 0.95 ⋅ 0.80
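The likelihood calculation can be written out as a short sketch. It assumes, as on the slide, a false negative rate of 20% and a false positive rate of 5%: an observed effect where none is predicted has probability 0.05, and an observed effect where one is predicted has probability 1 - 0.20 = 0.80. The function and variable names are illustrative, not part of any published implementation.

```python
FALSE_POS = 0.05  # Pr(observe 1 | predicted 0)
FALSE_NEG = 0.20  # Pr(observe 0 | predicted 1)


def obs_prob(observed, predicted):
    """Probability of one binary observation given the model's prediction."""
    if predicted:
        return 1 - FALSE_NEG if observed else FALSE_NEG
    return FALSE_POS if observed else 1 - FALSE_POS


def likelihood(prediction, replicates):
    """Product of observation probabilities over replicates and reporters."""
    lik = 1.0
    for obs in replicates:
        for reporter, predicted in prediction.items():
            lik *= obs_prob(obs[reporter], predicted)
    return lik


prediction = {"E1": 0, "E2": 1}
replicates = [
    {"E1": 1, "E2": 1},  # observation 1
    {"E1": 0, "E2": 1},  # observation 2
]
lik = likelihood(prediction, replicates)
print(lik)
```

Each factor is looked up purely from the pair (observed, predicted), so the same function applies to any number of reporters and replicates.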
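Marginal likelihood

P(D \mid M) = \int P(D \mid M, \Theta) \, P(\Theta \mid M) \, d\Theta
            = \frac{1}{n^m} \prod_{i=1}^{m} \sum_{j=1}^{n} \prod_{k=1}^{l} P(e_{ik} \mid M, \theta_i = j)

- 1/n^m: uniform prior over positions
- \prod_{i=1}^{m}: product over all effect reporters
- \sum_{j=1}^{n}: average over possible positions in the pathway
- \prod_{k=1}^{l}: product over replicate observations
- P(e_{ik} \mid M, \theta_i = j): distribution of a single effect reporter with known position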
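The marginal likelihood can be computed directly for small examples: for each effect reporter, average over the n pathway genes it could be attached to, then multiply across reporters. The sketch below makes several assumptions that are not stated on the slide: binary data, fixed error rates of 5% (false positive) and 20% (false negative), and an index k that runs over experiments, each of which silences one known gene. All names are hypothetical.

```python
from math import prod

ALPHA = 0.05  # assumed false positive rate
BETA = 0.20   # assumed false negative rate


def p_obs(observed, predicted):
    """Probability of a binary observation given the predicted state."""
    if predicted:
        return 1 - BETA if observed else BETA
    return ALPHA if observed else 1 - ALPHA


def marginal_likelihood(phi, perturbed, data):
    """P(D | M) with a uniform prior over reporter positions.

    phi[s][j]    1 if silencing gene s is predicted to affect reporters
                 attached to gene j (phi[s][s] == 1 by convention)
    perturbed[k] index of the gene silenced in experiment k
    data[i][k]   binary observation of reporter i in experiment k
    """
    n = len(phi)
    total = 1.0
    for row in data:  # product over effect reporters i
        # average over the n possible attachment positions j
        total *= sum(
            prod(p_obs(row[k], phi[s][j]) for k, s in enumerate(perturbed))
            for j in range(n)
        ) / n
    return total


# Two candidate models over genes X (index 0) and Y (index 1).
chain_xy = [[1, 1], [0, 1]]  # X upstream of Y
chain_yx = [[1, 0], [1, 1]]  # Y upstream of X
perturbed = [0, 1]           # experiment 0 silences X, experiment 1 silences Y
data = [[1, 1], [1, 0]]      # two reporters, one observation per experiment

ml_xy = marginal_likelihood(chain_xy, perturbed, data)
ml_yx = marginal_likelihood(chain_yx, perturbed, data)
print(ml_xy, ml_yx)
```

Dividing each value by their sum gives posterior probabilities under a uniform model prior restricted to these two candidates.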
NEM: inference
Model space: all transitively closed directed graphs
- Exhaustive enumeration: score all models to find the one fitting the data best (Markowetz et al., Bioinformatics 2005)
- MCMC, simulated annealing: take small probabilistic steps to explore model space (with A. Tresch; in preparation)
- Divide and conquer: break a big model into smaller, manageable pieces and then re-assemble (Markowetz et al., ISMB 2007)
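Exhaustive enumeration is only feasible for a handful of pathway genes, which a brute-force sketch makes concrete. The code below generates every directed graph on n nodes and keeps the transitively closed ones. It assumes the convention that every node reaches itself (a reflexive diagonal); this is an illustration of the model space, not the published implementation.

```python
from itertools import product


def is_transitive(adj):
    """True if i -> j and j -> k always imply i -> k."""
    n = len(adj)
    return all(
        adj[i][k]
        for i in range(n)
        for j in range(n)
        for k in range(n)
        if adj[i][j] and adj[j][k]
    )


def transitively_closed_graphs(n):
    """Yield adjacency matrices of all transitively closed graphs on n nodes."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in product((0, 1), repeat=len(pairs)):
        # Assumed convention: each node reaches itself.
        adj = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
        for (i, j), bit in zip(pairs, bits):
            adj[i][j] = bit
        if is_transitive(adj):
            yield adj


for n in range(1, 5):
    count = sum(1 for _ in transitively_closed_graphs(n))
    print(n, count)
```

Three genes already give 29 candidate models, and the number of raw graphs to check grows as 2^(n(n-1)), which motivates the heuristic searches listed above for larger pathways.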
NEM: extensions
- Drop transitivity requirement
- Likelihood based on log-ratios of effects
- Feature selection to concentrate on informative effect reporters
(Tresch and Markowetz, 2008)
NEMs on Drosophila data
Summary of part 1

1. Gene perturbation screens with gene-expression readouts
2. Perturbation screens suffer from the information gap between pathways and reporters
3. Nested Effects Models reconstruct pathway features from subset relations between observed effects
– Part 2 –

Data integration and probabilistic refinement of a signaling pathway hypothesis
Pathway refinement
     1. Start from given pathway hypothesis
        Even if our understanding of pathways is poor, that does not mean we have none at all!
     2. Evaluate evidence for hypothesis in data
     3. Identify weakly supported areas and likely extensions
     Not reconstruction from scratch.
     Step 1: assemble pathway hypothesis (KEGG, literature, …) for pheromone response pathway in yeast
Edge data I
Support for hypothesis in protein-protein interaction data
Edge data II
Support for hypothesis in co-expression data
Edge data III
Why is it so hard to reconstruct the nuclear regulatory network from correlations?
Edge data IV
Support for hypothesis in TF-DNA binding data
Paths: cause-effect data
Expression profiling of knock-out mutants (Hughes et al., 2000)
Result: transcriptional response to perturbation only visible on downstream genes (information gap!)
Conclusion from data analysis

• Every data source is informative for a specific
  compartment of the pathway
• No data source is informative in all
  compartments
• We expect these observations also to hold for
  other MAPK and signaling pathways.

Need compartment-specific integrative model
 encompassing edge, node, and path data.
Integrative model
- Pathway graph as hidden/latent variables
- Conditional distributions for each data type
- Different data types contribute to each compartment
[Figure: graphical model with prior, latent pathway graph, and parameters]
Graphical model defines posterior P(G|data)
-> inference by Gibbs sampler
Evaluation

1. Fit model parameters on pheromone
   response pathway (training)
2. Use fitted model on other MAPK pathways
   (generalization to closely related examples)
3. Use fitted model on all other Yeast signaling
   pathways (generalization to everything else)

            … work in progress …
Acknowledgements
Nested Effects Models: Rainer Spang (Univ. Regensburg), Dennis Kostka (UC SF), Achim Tresch (Gene Center Munich), Holger Fröhlich (DKFZ Heidelberg), Tim Beißbarth (Univ. Göttingen), Josh Stuart and Charlie Vaske (UC SC)
Data integration: Olga G. Troyanskaya (Princeton), Edoardo Airoldi (Harvard), David Blei (Princeton)
Thank you!
Weitere ähnliche Inhalte

Was ist angesagt? (10)

P bluescript
P bluescriptP bluescript
P bluescript
 
P 53 Tumour Biology
P 53 Tumour BiologyP 53 Tumour Biology
P 53 Tumour Biology
 
P uc vectors
P uc vectorsP uc vectors
P uc vectors
 
Bacteriophage based vector
Bacteriophage based vectorBacteriophage based vector
Bacteriophage based vector
 
Natalia Cucu Simp 09
Natalia Cucu Simp 09Natalia Cucu Simp 09
Natalia Cucu Simp 09
 
The Molecular Genetics Of Immunoglobulins
The Molecular Genetics Of ImmunoglobulinsThe Molecular Genetics Of Immunoglobulins
The Molecular Genetics Of Immunoglobulins
 
Lecture on pUC18 vector
Lecture on pUC18 vectorLecture on pUC18 vector
Lecture on pUC18 vector
 
Derivatives of pBR322
Derivatives of pBR322Derivatives of pBR322
Derivatives of pBR322
 
pUC18 vector
pUC18 vector pUC18 vector
pUC18 vector
 
P br322
P br322P br322
P br322
 

Ähnlich wie Probabilistic refinement of cellular pathway models

Multi-scale network biology model & the model library
Multi-scale network biology model & the model libraryMulti-scale network biology model & the model library
Multi-scale network biology model & the model librarylaserxiong
 
Lab Gene Expression Data Analysis
Lab Gene Expression Data AnalysisLab Gene Expression Data Analysis
Lab Gene Expression Data AnalysisUSD Bioinformatics
 
NetBioSIG2012 joshstuart
NetBioSIG2012 joshstuartNetBioSIG2012 joshstuart
NetBioSIG2012 joshstuartAlexander Pico
 
Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...Vall d'Hebron Institute of Research (VHIR)
 
Exploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosisExploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosisMaté Ongenaert
 
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16Sage Base
 
Software for SBML Today
Software for SBML TodaySoftware for SBML Today
Software for SBML TodayMike Hucka
 
Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...laserxiong
 
Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21Sage Base
 
Friend NIGM 2012-05-23
Friend NIGM 2012-05-23Friend NIGM 2012-05-23
Friend NIGM 2012-05-23Sage Base
 
Next Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert PosterNext Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert Postersptaylor
 
Next Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert SyndromeNext Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert Syndromesptaylor
 
Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21Sage Base
 
Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18Sage Base
 
Functional genomics, and tools
Functional genomics, and toolsFunctional genomics, and tools
Functional genomics, and toolsKAUSHAL SAHU
 
SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesMike Hucka
 
Genomic selection in Livestock
Genomic  selection in LivestockGenomic  selection in Livestock
Genomic selection in LivestockILRI
 

Ähnlich wie Probabilistic refinement of cellular pathway models (20)

Pradeep.ii
Pradeep.iiPradeep.ii
Pradeep.ii
 
Multi-scale network biology model & the model library
Multi-scale network biology model & the model libraryMulti-scale network biology model & the model library
Multi-scale network biology model & the model library
 
Lab Gene Expression Data Analysis
Lab Gene Expression Data AnalysisLab Gene Expression Data Analysis
Lab Gene Expression Data Analysis
 
Biological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical ModelsBiological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical Models
 
NetBioSIG2012 joshstuart
NetBioSIG2012 joshstuartNetBioSIG2012 joshstuart
NetBioSIG2012 joshstuart
 
Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...
 
Exploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosisExploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosis
 
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
 
Software for SBML Today
Software for SBML TodaySoftware for SBML Today
Software for SBML Today
 
Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...
 
Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21
 
gene_concept_2.pdf
gene_concept_2.pdfgene_concept_2.pdf
gene_concept_2.pdf
 
Friend NIGM 2012-05-23
Friend NIGM 2012-05-23Friend NIGM 2012-05-23
Friend NIGM 2012-05-23
 
Next Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert PosterNext Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert Poster
 
Next Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert SyndromeNext Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert Syndrome
 
Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21
 
Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18
 
Functional genomics, and tools
Functional genomics, and toolsFunctional genomics, and tools
Functional genomics, and tools
 
SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resources
 
Genomic selection in Livestock
Genomic  selection in LivestockGenomic  selection in Livestock
Genomic selection in Livestock
 

Kürzlich hochgeladen

How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsManeerUddin
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 

Kürzlich hochgeladen (20)

How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture hons
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 

Probabilistic refinement of cellular pathway models

  • 1. Probabilistic refinement of cellular pathway models Cambridge Statistical Laboratory Networks seminar series 2009 Jan 21 Florian Markowetz florian.markowetz@cancer.org.uk
  • 2. What is a signaling pathway? Environmental stimuli Protein Receptor in cell membrane Pat hw mRNA Protein cascade ay Transcription factors regulating target genes DNA
  • 3. Pathway reconstruction Signaling pathways are important - Deregulation causes many diseases incl. cancer Signaling pathways are poorly understood - Only parts-lists - missing are interactions within and between pathways Biological research - So far mostly focused on individual genes New genome-scale datasets - Opportunity for data integration and novel methods
  • 4. What data do we have? Proteins: - interactions between proteins Bulk of data: - binding to DNA Microarray mRNA: Protein - Expression under different stimuli - binding to DNA mRNA Sequence: - binding motifs - epigenetic marks DNA Morphology
  • 5. Pathways as graphs • Nodes are (mostly) known • Goal: infer edges from data • Data are heterogeneous • co-expression between Edges genes • interactions between proteins • binding motifs at genes • binding of proteins to Nodes • Protein domains DNA • Functional annotation • Cause-effect data: Paths • changing environments • experimental perturbations
  • 6. Pathway reconstruction “Classical” statistical approaches: Treat the genes/proteins as random variables and explore correlation structure in the data: – Correlation graphs – Gaussian graphical models (partial correlation) – Bayesian networks Challenges/Problems/Opportunities 1. Correlation may be un-informative 2. Integrate heterogeneous and noisy and complementary data sources Review: Markowetz and Spang (2007)
  • 7. – Part 1 – Nested Effects Models
  • 8. Experimental perturbations Drugs Small molecules RNAi Protein Stress Knockout mRNA DNA Readout: Global gene expression measurements
  • 9. Drosophila immune response Columns: perturbed genes Rows: effects on other genes 1. Silencing tak1 reduces expression of all LPS- inducible transcripts 2. Silencing rel (key) or mkk4/hep reduces expression of subsets of induced transcripts (Boutros et al, Dev Cell 2002)
  • 10. (!) Two types of entities Components of signaling pathway which are experimentally perturbed Downstream effect reporters
  • 11. (!!) Only indirect information No direct observation of perturbation effects on other pathway components! Inference from observed perturbation effects on downstream reporters.
  • 12. The information gap Direct information: Indirect information: effects are visible at other effects are only visible at pathway components down-stream reporters Pathway Pathway B B D D A C A C - Cell survival or death - Growth rate - downstream genes
  • 13. Correlation won’t do “Classical” approach Pathway Correlation B D Graphical models: - Bayes Nets A C - GGMs Mutual Information Nested Downstream Effects regulated genes Models
  • 14. Nested Effects Models 1. Set of candidate pathway genes INPUT 2. High-dimensional phenotypic profile, e.g. microarray Graph representation of information flow explaining OUTPUT the phenotypes Phenotypic profiles Inferred pathway Gene perturbations A AB B C D EF CD E F G GH H Effects
  • 15. NEM: model formulation. Pathway genes X, Y, Z: core topology, to be reconstructed = Model M. Effect reporters E1, …, E6: states are observed = Data D; positions in pathway unknown = Parameters θ. [Figure: model M’xyz with expected and observed effect patterns for perturbations of X, Y, Z across E1 to E6; the observed pattern contains false negatives (FN) and a false positive (FP).] Posterior: P(M | D) = 1/Z · P(D | M) · P(M), where P(D | M) is the marginal likelihood.
  • 16. Likelihood P(D | M, θ). Compare predictions with observations. [Figure: pathway genes X, Y, Z with effect reporters E1, E2.] Prediction: E1 = 0, E2 = 1. Observations: (1) E1 = 1, E2 = 1; (2) E1 = 0, E2 = 1. Error probabilities: e.g. false-negative rate 20%, false-positive rate 5%. Lik = P(E1 = 1 | predicted 0) · P(E2 = 1 | predicted 1) · P(E1 = 0 | predicted 0) · P(E2 = 1 | predicted 1) = 0.05 · 0.80 · 0.95 · 0.80
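The worked example on this slide can be replayed in a few lines. This is a minimal sketch assuming independent observations and the stated error rates (false negatives 20%, false positives 5%); the function names `obs_prob` and `likelihood` are illustrative and not taken from the talk. Under these rates a predicted effect is observed with probability 1 - 0.20 = 0.80, and a predicted non-effect is confirmed with probability 1 - 0.05 = 0.95.

```python
from math import prod

def obs_prob(observed, predicted, fp=0.05, fn=0.20):
    """P(observed state | predicted state) under fixed error rates."""
    if predicted == 1:
        return 1 - fn if observed == 1 else fn
    return fp if observed == 1 else 1 - fp

def likelihood(prediction, replicates, fp=0.05, fn=0.20):
    """Product over replicates and reporters, assuming independence."""
    return prod(
        obs_prob(obs[name], pred, fp, fn)
        for obs in replicates
        for name, pred in prediction.items()
    )

# Prediction and the two replicate observations from the slide.
prediction = {"E1": 0, "E2": 1}
replicates = [{"E1": 1, "E2": 1}, {"E1": 0, "E2": 1}]
lik = likelihood(prediction, replicates)  # 0.05 * 0.80 * 0.95 * 0.80
```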
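  • 17. Marginal likelihood: $P(D \mid M) = \int P(D \mid M, \Theta)\, P(\Theta \mid M)\, d\Theta = \frac{1}{n^m} \prod_{i=1}^{m} \sum_{j=1}^{n} \prod_{k=1}^{l} P(e_{ik} \mid M, \theta_i = j)$. Reading the terms: the factor $1/n^m$ comes from the uniform prior over positions; the product over $i$ runs over all effect reporters; the sum over $j$ averages over possible positions in the pathway; the product over $k$ runs over replicate observations; $P(e_{ik} \mid M, \theta_i = j)$ is the distribution of a single effect reporter with known position.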
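The marginal likelihood formula can be evaluated directly for small examples. The sketch below is one possible reading under stated assumptions: data are binary, each experiment perturbs a single gene, error rates are fixed at the values from the likelihood slide, and the model is encoded as a matrix `reach` with `reach[s][j] = 1` when perturbing gene `s` is predicted to affect reporters attached to gene `j` (including `s == j`). The encoding, the toy data, and all names are hypothetical.

```python
from math import prod

def obs_prob(observed, predicted, fp=0.05, fn=0.20):
    """P(observed state | predicted state) under fixed error rates."""
    if predicted == 1:
        return 1 - fn if observed == 1 else fn
    return fp if observed == 1 else 1 - fp

def marginal_likelihood(reach, data, perturbed, fp=0.05, fn=0.20):
    """(1/n^m) * prod_i sum_j prod_k P(e_ik | M, theta_i = j).

    reach:     n x n 0/1 matrix encoding the model M (assumed encoding)
    data:      m rows, one per effect reporter, each with l observations
    perturbed: length-l list, index of the gene perturbed in experiment k
    """
    n = len(reach)
    total = 1.0
    for row in data:                      # product over reporters i
        position_sum = 0.0
        for j in range(n):                # sum over candidate positions j
            position_sum += prod(         # product over observations k
                obs_prob(e, reach[s][j], fp, fn)
                for e, s in zip(row, perturbed)
            )
        total *= position_sum / n         # uniform prior over positions
    return total

# Toy comparison of two candidate models for genes X (0) and Y (1).
x_above_y = [[1, 1], [0, 1]]  # perturbing X also affects Y's reporters
y_above_x = [[1, 0], [1, 1]]  # perturbing Y also affects X's reporters
data = [[1, 0], [1, 1]]       # reporter 1 reacts to X only, reporter 2 to both
perturbed = [0, 1]            # experiment 0 perturbs X, experiment 1 perturbs Y

ml_forward = marginal_likelihood(x_above_y, data, perturbed)
ml_reverse = marginal_likelihood(y_above_x, data, perturbed)
```

In this toy case the model with X upstream of Y scores higher, because the effects seen when Y is perturbed are a subset of those seen when X is perturbed.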
  • 18. NEM: inference. Model space: all transitively closed directed graphs. Exhaustive enumeration: score all models to find the one fitting the data best (Markowetz et al., Bioinformatics, 2005). MCMC, simulated annealing: take small probabilistic steps to explore model space (with A. Tresch; in preparation). Divide and conquer: break a big model into smaller, manageable pieces and then re-assemble (Markowetz et al., ISMB 2007).
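To give a feel for the size of the model space, the following sketch enumerates all transitively closed directed graphs on a handful of nodes and picks the best one under a caller-supplied score. It is a brute-force illustration only, with every node given a self-loop by convention; the function names and the `score` callback are assumptions for this example, not the published implementation.

```python
from itertools import product

def is_transitive(adj):
    """True if a -> b and b -> c always imply a -> c."""
    n = len(adj)
    return all(
        adj[a][c] or not (adj[a][b] and adj[b][c])
        for a in range(n) for b in range(n) for c in range(n)
    )

def transitively_closed_graphs(n):
    """Yield adjacency matrices (with self-loops) of all transitive graphs."""
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    for bits in product((0, 1), repeat=len(pairs)):
        adj = [[int(a == b) for b in range(n)] for a in range(n)]
        for (a, b), bit in zip(pairs, bits):
            adj[a][b] = bit
        if is_transitive(adj):
            yield adj

def best_model(n, score):
    """Exhaustive search: return the graph maximizing score(adj)."""
    return max(transitively_closed_graphs(n), key=score)

counts = {n: sum(1 for _ in transitively_closed_graphs(n)) for n in (2, 3, 4)}
```

The counts grow quickly (4, 29, and 355 models for 2, 3, and 4 genes), which is why exhaustive enumeration only works for small pathways and heuristic search is needed beyond that.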
  • 19. NEM: extensions. Likelihood based on log-ratios of effects. Drop transitivity requirement. Feature selection to concentrate on informative effect reporters. Tresch and Markowetz (2008)
  • 21. Summary of part 1. 1. Gene perturbation screens with gene-expression readouts. 2. Perturbation screens suffer from the information gap between pathways and reporters. 3. Nested Effects Models reconstruct pathway features from subset relations between observed effects.
  • 22. – Part 2 – Data integration and probabilistic refinement of a signaling pathway hypothesis
  • 23. Pathway refinement. 1. Start from a given pathway hypothesis. Even if our understanding of pathways is poor, that does not mean we have none at all! 2. Evaluate evidence for the hypothesis in data. 3. Identify weakly supported areas and likely extensions. Not reconstruction from scratch. Step 1: assemble pathway hypothesis (KEGG, literature, …) for the pheromone response pathway in yeast.
  • 24. Edge data I: Support for hypothesis in protein-protein interaction data
  • 25. Edge data II: Support for hypothesis in co-expression data
  • 26. Edge data III: Why is it so hard to reconstruct the nuclear regulatory network from correlations?
  • 27. Edge data IV: Support for hypothesis in TF-DNA binding data
  • 28. Paths: cause-effect data. Expression profiling of knock-out mutants (Hughes et al., 2000). Result: the transcriptional response to a perturbation is only visible on downstream genes (information gap!).
  • 29. Conclusion from data analysis. Every data source is informative for a specific compartment of the pathway. No data source is informative in all compartments. We expect these observations to hold for other MAPK and signaling pathways as well. We need a compartment-specific integrative model encompassing edge, node, and path data.
  • 30. Integrative model. [Figure labels: prior; parameters; pathway graph as hidden/latent variables; conditional distributions for each data type.] The graphical model defines the posterior P(G | data) -> inference by Gibbs sampler. Different data types contribute to each compartment.
  • 31. Evaluation. 1. Fit model parameters on the pheromone response pathway (training). 2. Use the fitted model on other MAPK pathways (generalization to closely related examples). 3. Use the fitted model on all other yeast signaling pathways (generalization to everything else). … work in progress …
  • 32. Acknowledgements. Nested Effects Models: Rainer Spang (Univ. Regensburg), Dennis Kostka (UCSF), Achim Tresch (Gene Center Munich), Holger Fröhlich (DKFZ Heidelberg), Tim Beißbarth (Univ. Göttingen), Josh Stuart and Charlie Vaske (UCSC). Data integration: Olga G. Troyanskaya (Princeton), Edoardo Airoldi (Harvard), David Blei (Princeton).
  • 33. Probabilistic refinement of cellular pathway models. Thank you! Florian Markowetz, florian.markowetz@cancer.org.uk