Probabilistic refinement of cellular pathway models
Cambridge Statistical Laboratory, Networks seminar series
2009 Jan 21


Florian Markowetz
florian.markowetz@cancer.org.uk
What is a signaling pathway?

[Figure: a signaling pathway. Environmental stimuli reach a receptor in the cell membrane; the signal passes through a protein cascade to transcription factors regulating target genes. Layers shown: Protein, mRNA, DNA.]
Pathway reconstruction
Signaling pathways are important
- Deregulation causes many diseases incl. cancer
Signaling pathways are poorly understood
- Only parts-lists
- missing are interactions within and between pathways
Biological research
- So far mostly focused on individual genes
New genome-scale datasets
- Opportunity for data integration and novel methods
What data do we have?
Proteins:
- interactions between proteins
- binding to DNA
mRNA:
- expression under different stimuli
- binding to DNA
Sequence:
- binding motifs
- epigenetic marks
Morphology
Bulk of data: Microarray
[Figure: layers Protein, mRNA, DNA]
Pathways as graphs
   • Nodes are (mostly) known
   • Goal: infer edges from data
   • Data are heterogeneous
   Edges:
      • co-expression between genes
      • interactions between proteins
      • binding motifs at genes
      • binding of proteins to DNA
   Nodes:
      • Protein domains
      • Functional annotation
   Paths:
      • Cause-effect data:
         • changing environments
         • experimental perturbations
Pathway reconstruction
“Classical” statistical approaches:
Treat the genes/proteins as random variables and
   explore correlation structure in the data:
   – Correlation graphs
   – Gaussian graphical models (partial correlation)
   – Bayesian networks

Challenges/Problems/Opportunities
1. Correlation may be uninformative
2. Integrate heterogeneous, noisy, and complementary data sources
Review: Markowetz and Spang (2007)
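
To make the difference between a correlation graph and a Gaussian graphical model concrete, here is a minimal sketch, assuming NumPy and simulated data (the chain A -> B -> C and the function name `partial_correlations` are illustrative, not from the talk). Partial correlations are read off the inverse covariance matrix: A and C are clearly correlated, but their partial correlation given B is close to zero.

```python
import numpy as np

def partial_correlations(samples):
    """Partial correlation matrix from samples of shape (n_samples, n_genes)."""
    cov = np.cov(samples, rowvar=False)
    precision = np.linalg.inv(cov)          # inverse covariance matrix
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)      # standard GGM identity
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated chain A -> B -> C (an assumption for illustration only).
rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b = a + rng.normal(size=5000)
c = b + rng.normal(size=5000)
samples = np.column_stack([a, b, c])

corr = np.corrcoef(samples, rowvar=False)
pcor = partial_correlations(samples)
print("corr(A, C)      =", round(corr[0, 2], 2))
print("pcor(A, C | B)  =", round(pcor[0, 2], 2))
```

A correlation graph would draw an edge between A and C here; the partial-correlation graph would not.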
– Part 1 –

Nested Effects Models
Experimental perturbations
Perturbation types: drugs, small molecules, RNAi, stress, knockout
[Figure: perturbations acting at the Protein, mRNA, and DNA levels]



Readout:
Global gene expression measurements
Drosophila immune response
Columns: perturbed genes
Rows: effects on other genes

1. Silencing tak1 reduces expression of all LPS-inducible transcripts
2. Silencing rel (key) or mkk4/hep reduces expression of subsets of induced transcripts

(Boutros et al, Dev Cell 2002)
(!) Two types of entities

  Components of signaling
    pathway which are
    experimentally
    perturbed



  Downstream effect
    reporters
(!!) Only indirect information

No direct observation of
 perturbation effects on
 other pathway
 components!


Inference from observed
  perturbation effects on
  downstream reporters.
The information gap

Direct information: effects are visible at other pathway components
Indirect information: effects are only visible at downstream reporters
   - Cell survival or death
   - Growth rate
   - Downstream genes
[Figure: the same pathway (genes A, B, C, D) drawn once for each case]
Correlation won’t do
“Classical” approach (applied to the pathway genes A, B, C, D):
   - Correlation
   - Graphical models: Bayes nets, GGMs
   - Mutual information
Downstream regulated genes: Nested Effects Models
Nested Effects Models
INPUT
   1. Set of candidate pathway genes
   2. High-dimensional phenotypic profile, e.g. microarray
OUTPUT
   Graph representation of information flow explaining the phenotypes
[Figure: gene perturbations A to H (rows) with their phenotypic profiles (effects); inferred pathway with nodes AB, CD, EF, GH]
NEM: model formulation
[Figure: example model over pathway genes X, Y, Z with effect reporters E1, …, E6 attached; expected effect pattern next to the observed pattern, with false negatives (FN) and false positives (FP) marked]

Pathway genes: X, Y, Z
• core topology
• to be reconstructed = Model M
Effect reporters: E1, …, E6
• states are observed = Data D
• positions in pathway unknown = Parameters θ
Posterior: P( M | D ) = 1/Z · P( D | M ) · P( M ), where P( D | M ) is the marginal likelihood
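
As a small illustration of the posterior formula, the sketch below normalises a set of candidate models given their log marginal likelihoods. It assumes NumPy and a uniform prior by default; the function name `model_posterior` and the example numbers are hypothetical. Working in log space avoids underflow when likelihoods are very small.

```python
import numpy as np

def model_posterior(log_marginal_likelihoods, log_priors=None):
    """Posterior P(M | D) over a finite list of candidate models."""
    log_lik = np.asarray(log_marginal_likelihoods, dtype=float)
    if log_priors is None:
        # Uniform prior over models (an assumption for this sketch).
        log_priors = np.zeros_like(log_lik)
    log_post = log_lik + np.asarray(log_priors, dtype=float)
    log_post -= log_post.max()      # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()        # division by Z

# Two hypothetical models with marginal likelihoods 0.1 and 0.3.
print(model_posterior(np.log([0.1, 0.3])))
```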
Likelihood P( D | M, θ )

Compare predictions with observations:
[Figure: pathway genes X, Y, Z with effect reporters E1 and E2 attached]

Prediction:       E1=0   E2=1
Observation 1:    E1=1   E2=1
Observation 2:    E1=0   E2=1

Error probabilities
        e.g. false NEG rate 20%, false POS rate 5%
 Lik = Pr( E1 = 1 | pred. 0 ) ⋅ Pr( E2 = 1 | pred. 1 ) ⋅ Pr( E1 = 0 | pred. 0 ) ⋅ Pr( E2 = 1 | pred. 1 )
     = 0.05 ⋅ 0.80 ⋅ 0.95 ⋅ 0.80
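
The comparison of predictions with observations can be sketched in a few lines of Python. This assumes binary effects and the error rates quoted on the slide (false negative rate 20%, false positive rate 5%); the function names are illustrative. Under these rates a predicted 1 is observed as 1 with probability 0.80, and a predicted 0 is observed as 1 with probability 0.05.

```python
def observation_probability(observed, predicted, fp=0.05, fn=0.20):
    """Probability of one observed reporter state given the predicted state."""
    if predicted == 1:
        return 1.0 - fn if observed == 1 else fn
    return fp if observed == 1 else 1.0 - fp

def likelihood(predictions, replicates, fp=0.05, fn=0.20):
    """Product over replicate observations and effect reporters."""
    lik = 1.0
    for observed in replicates:
        for reporter, predicted in predictions.items():
            lik *= observation_probability(observed[reporter], predicted, fp, fn)
    return lik

# Prediction and the two observations from the slide.
predictions = {"E1": 0, "E2": 1}
replicates = [{"E1": 1, "E2": 1}, {"E1": 0, "E2": 1}]
print(likelihood(predictions, replicates))  # approximately 0.0304
```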
Marginal likelihood

 P(D | M) = \int P(D | M, \Theta) \, P(\Theta | M) \, d\Theta
          = \frac{1}{n^m} \prod_{i=1}^{m} \sum_{j=1}^{n} \prod_{k=1}^{l} P(e_{ik} | M, \theta_i = j)

 - 1/n^m: uniform prior over positions
 - Product over i: all effect reporters
 - Sum over j (with the prior): average over possible positions in the pathway
 - Product over k: replicate observations
 - P(e_{ik} | M, \theta_i = j): distribution of a single effect reporter with known position
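
A minimal NumPy sketch of this marginal likelihood follows. The data layout is an assumption made for illustration: `model[s, j] = 1` means that perturbing gene `s` is predicted to affect reporters attached to gene `j`, `data[i, k]` is the observed binary state of reporter `i` in observation `k`, and `perturbed[k]` names the gene perturbed in observation `k`. Error rates default to the values used in the likelihood example.

```python
import numpy as np

def marginal_likelihood(model, data, perturbed, fp=0.05, fn=0.20):
    """P(D | M): average over reporter positions, product over reporters."""
    model = np.asarray(model)
    data = np.asarray(data)
    perturbed = np.asarray(perturbed)
    # predicted[j, k]: predicted state of a reporter at position j in observation k
    predicted = model[perturbed, :].T                    # shape (n, l)
    # Probability of observing a 1 given the prediction.
    p_one = np.where(predicted == 1, 1.0 - fn, fp)       # shape (n, l)
    obs = data[:, None, :]                               # shape (m, 1, l)
    p = np.where(obs == 1, p_one[None], 1.0 - p_one[None])  # shape (m, n, l)
    per_position = p.prod(axis=2)         # product over observations k
    per_reporter = per_position.mean(axis=1)  # (1/n) * sum over positions j
    return per_reporter.prod()            # product over reporters i

# Toy example: two pathway genes with edge 0 -> 1, one reporter, two knockdowns.
model = [[1, 1],
         [0, 1]]
data = [[1, 1]]
perturbed = [0, 1]
print(marginal_likelihood(model, data, perturbed))  # approximately 0.34
```

For realistic numbers of reporters the product underflows quickly, so a practical version would accumulate log-probabilities instead.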
NEM: inference
Model space: all transitively closed directed graphs
- Exhaustive enumeration: score all models to find the one fitting the data best (Markowetz et al., Bioinformatics 2005)
- MCMC, simulated annealing: take small probabilistic steps to explore model space (with A. Tresch; in preparation)
- Divide and conquer: break a big model into smaller, manageable pieces and then re-assemble (Markowetz et al., ISMB 2007)
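
Exhaustive enumeration is only feasible for a handful of pathway genes, which the following sketch makes tangible. It assumes NumPy, represents a model as a 0/1 adjacency matrix with ones on the diagonal, and takes a user-supplied `score` function (hypothetical here; in practice it would be the marginal likelihood).

```python
from itertools import product
import numpy as np

def is_transitively_closed(adj):
    """True if every two-step path i -> j -> k is matched by an edge i -> k."""
    two_step = (adj @ adj) > 0
    return not np.any(two_step & (adj == 0))

def transitively_closed_graphs(n):
    """Yield all transitively closed directed graphs on n nodes."""
    off_diag = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in product([0, 1], repeat=len(off_diag)):
        adj = np.eye(n, dtype=int)
        for (i, j), bit in zip(off_diag, bits):
            adj[i, j] = bit
        if is_transitively_closed(adj):
            yield adj

def best_model(n, score):
    """Score every candidate model and return the highest-scoring one."""
    return max(transitively_closed_graphs(n), key=score)

print(len(list(transitively_closed_graphs(3))))  # 29 models for three genes
```

The number of candidate graphs grows as 2^(n(n-1)) before filtering, which is why the heuristic searches listed on the slide are needed for larger pathways.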
NEM: extensions




- Drop transitivity requirement
- Likelihood based on log-ratios of effects
- Feature selection to concentrate on informative effect reporters

Tresch and Markowetz (2008)
NEMs on Drosophila data
Summary of part 1

1. Gene perturbation screens with gene-expression readouts
2. Perturbation screens suffer from the information gap between pathways and reporters
3. Nested Effects Models reconstruct pathway features from subset relations between observed effects
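
The subset idea in point 3 can be illustrated without any probability, assuming noise-free effects. In the sketch below an edge A -> B is proposed whenever the reporters affected by perturbing B are a subset of those affected by perturbing A. The gene names follow the Drosophila example; the reporter names and their assignment are made up for illustration, and the actual Nested Effects Model scores such relations probabilistically rather than demanding exact subsets.

```python
def infer_nesting(effects):
    """Propose edges (a, b) whenever effects[b] is a non-empty subset of effects[a]."""
    edges = set()
    for a, effects_a in effects.items():
        for b, effects_b in effects.items():
            if a != b and effects_b and effects_b <= effects_a:
                edges.add((a, b))
    return edges

# Hypothetical reporter sets: tak1 affects all reporters, the others subsets.
effects = {
    "tak1": {"r1", "r2", "r3", "r4"},
    "rel": {"r1", "r2"},
    "mkk4/hep": {"r3", "r4"},
}
print(sorted(infer_nesting(effects)))
# [('tak1', 'mkk4/hep'), ('tak1', 'rel')]
```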
– Part 2 –

Data integration and probabilistic refinement of a signaling pathway hypothesis
Pathway refinement
     1. Start from given pathway hypothesis
        Even if our understanding of pathways is poor, that does not mean we have none at all!
     2. Evaluate evidence for hypothesis in data
     3. Identify weakly supported areas and likely extensions
     Not reconstruction from scratch.
     Step 1: assemble pathway hypothesis (KEGG, literature, …) for pheromone response pathway in yeast
Edge data I
   Support for hypothesis in protein-protein interaction data
Edge data II
   Support for hypothesis in co-expression data
Edge data III
   Why is it so hard to reconstruct the nuclear regulatory network from correlations?
Edge data IV
   Support for hypothesis in TF-DNA binding data
Paths: cause-effect data
   Expression profiling of knock-out mutants (Hughes et al., 2000)

   Result: transcriptional response to perturbation only visible on downstream genes (information gap!)
Conclusion from data analysis

• Every data source is informative for a specific
  compartment of the pathway
• No data source is informative in all
  compartments
• We expect these observations also to hold for
  other MAPK and signaling pathways.

Need compartment-specific integrative model
 encompassing edge, node, and path data.
Integrative model
[Figure: graphical model with the following parts]
- Pathway graph as hidden/latent variables
- Prior
- Parameters
- Conditional distributions for each data type
- Different data types contribute to each compartment

Graphical model defines posterior P(G|data)
-> inference by Gibbs sampler
Evaluation

1. Fit model parameters on pheromone
   response pathway (training)
2. Use fitted model on other MAPK pathways
   (generalization to closely related examples)
3. Use fitted model on all other Yeast signaling
   pathways (generalization to everything else)

            … work in progress …
Acknowledgements
Nested Effects Models
Rainer Spang (Univ. Regensburg) .:. Dennis Kostka (UC SF) .:. Achim Tresch (Gene Center Munich) .:. Holger Fröhlich (DKFZ Heidelberg) .:. Tim Beißbarth (Univ. Göttingen) .:. Josh Stuart, Charlie Vaske (UC SC)
Data integration
Olga G. Troyanskaya (Princeton) .:. Edoardo Airoldi (Harvard) .:. David Blei (Princeton)
Thank you!
Florian Markowetz
florian.markowetz@cancer.org.uk

Weitere ähnliche Inhalte

Was ist angesagt? (10)

P bluescript
P bluescriptP bluescript
P bluescript
 
P 53 Tumour Biology
P 53 Tumour BiologyP 53 Tumour Biology
P 53 Tumour Biology
 
P uc vectors
P uc vectorsP uc vectors
P uc vectors
 
Bacteriophage based vector
Bacteriophage based vectorBacteriophage based vector
Bacteriophage based vector
 
Natalia Cucu Simp 09
Natalia Cucu Simp 09Natalia Cucu Simp 09
Natalia Cucu Simp 09
 
The Molecular Genetics Of Immunoglobulins
The Molecular Genetics Of ImmunoglobulinsThe Molecular Genetics Of Immunoglobulins
The Molecular Genetics Of Immunoglobulins
 
Lecture on pUC18 vector
Lecture on pUC18 vectorLecture on pUC18 vector
Lecture on pUC18 vector
 
Derivatives of pBR322
Derivatives of pBR322Derivatives of pBR322
Derivatives of pBR322
 
pUC18 vector
pUC18 vector pUC18 vector
pUC18 vector
 
P br322
P br322P br322
P br322
 

Ähnlich wie Probabilistic refinement of cellular pathway models

Multi-scale network biology model & the model library
Multi-scale network biology model & the model libraryMulti-scale network biology model & the model library
Multi-scale network biology model & the model librarylaserxiong
 
Lab Gene Expression Data Analysis
Lab Gene Expression Data AnalysisLab Gene Expression Data Analysis
Lab Gene Expression Data AnalysisUSD Bioinformatics
 
NetBioSIG2012 joshstuart
NetBioSIG2012 joshstuartNetBioSIG2012 joshstuart
NetBioSIG2012 joshstuartAlexander Pico
 
Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...Vall d'Hebron Institute of Research (VHIR)
 
Exploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosisExploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosisMaté Ongenaert
 
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16Sage Base
 
Software for SBML Today
Software for SBML TodaySoftware for SBML Today
Software for SBML TodayMike Hucka
 
Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...laserxiong
 
Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21Sage Base
 
Friend NIGM 2012-05-23
Friend NIGM 2012-05-23Friend NIGM 2012-05-23
Friend NIGM 2012-05-23Sage Base
 
Next Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert PosterNext Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert Postersptaylor
 
Next Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert SyndromeNext Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert Syndromesptaylor
 
Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21Sage Base
 
Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18Sage Base
 
Functional genomics, and tools
Functional genomics, and toolsFunctional genomics, and tools
Functional genomics, and toolsKAUSHAL SAHU
 
SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesMike Hucka
 
Genomic selection in Livestock
Genomic  selection in LivestockGenomic  selection in Livestock
Genomic selection in LivestockILRI
 

Ähnlich wie Probabilistic refinement of cellular pathway models (20)

Pradeep.ii
Pradeep.iiPradeep.ii
Pradeep.ii
 
Multi-scale network biology model & the model library
Multi-scale network biology model & the model libraryMulti-scale network biology model & the model library
Multi-scale network biology model & the model library
 
Lab Gene Expression Data Analysis
Lab Gene Expression Data AnalysisLab Gene Expression Data Analysis
Lab Gene Expression Data Analysis
 
Biological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical ModelsBiological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical Models
 
NetBioSIG2012 joshstuart
NetBioSIG2012 joshstuartNetBioSIG2012 joshstuart
NetBioSIG2012 joshstuart
 
Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...Identification of pathological mutations from the single-gene case to exome p...
Identification of pathological mutations from the single-gene case to exome p...
 
Exploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosisExploring the neuroblastoma epigenome: perspectives for improved prognosis
Exploring the neuroblastoma epigenome: perspectives for improved prognosis
 
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
Stephen Friend NIH PPP Coordinating Committee Meeting 2012-02-16
 
Software for SBML Today
Software for SBML TodaySoftware for SBML Today
Software for SBML Today
 
Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...Novel network pharmacology methods for drug mechanism of action identificatio...
Novel network pharmacology methods for drug mechanism of action identificatio...
 
Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21Stephen Friend AMIA Symposium 2012-03-21
Stephen Friend AMIA Symposium 2012-03-21
 
gene_concept_2.pdf
gene_concept_2.pdfgene_concept_2.pdf
gene_concept_2.pdf
 
Friend NIGM 2012-05-23
Friend NIGM 2012-05-23Friend NIGM 2012-05-23
Friend NIGM 2012-05-23
 
Next Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert PosterNext Generation Sequencing for Joubert Poster
Next Generation Sequencing for Joubert Poster
 
Next Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert SyndromeNext Generation Sequencing for Joubert Syndrome
Next Generation Sequencing for Joubert Syndrome
 
Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21Stephen Friend Fanconi Anemia Research Fund 2012-01-21
Stephen Friend Fanconi Anemia Research Fund 2012-01-21
 
Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18Stephen Friend Food & Drug Administration 2011-07-18
Stephen Friend Food & Drug Administration 2011-07-18
 
Functional genomics, and tools
Functional genomics, and toolsFunctional genomics, and tools
Functional genomics, and tools
 
SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resources
 
Genomic selection in Livestock
Genomic  selection in LivestockGenomic  selection in Livestock
Genomic selection in Livestock
 

Kürzlich hochgeladen

Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxAmanpreet Kaur
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 

Kürzlich hochgeladen (20)

Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 

Probabilistic refinement of cellular pathway models

  • 1. Probabilistic refinement of cellular pathway models Cambridge Statistical Laboratory Networks seminar series 2009 Jan 21 Florian Markowetz florian.markowetz@cancer.org.uk
  • 2. What is a signaling pathway? Environmental stimuli Protein Receptor in cell membrane Pat hw mRNA Protein cascade ay Transcription factors regulating target genes DNA
  • 3. Pathway reconstruction Signaling pathways are important - Deregulation causes many diseases incl. cancer Signaling pathways are poorly understood - Only parts-lists - missing are interactions within and between pathways Biological research - So far mostly focused on individual genes New genome-scale datasets - Opportunity for data integration and novel methods
  • 4. What data do we have? Proteins: - interactions between proteins Bulk of data: - binding to DNA Microarray mRNA: Protein - Expression under different stimuli - binding to DNA mRNA Sequence: - binding motifs - epigenetic marks DNA Morphology
  • 5. Pathways as graphs • Nodes are (mostly) known • Goal: infer edges from data • Data are heterogeneous • co-expression between Edges genes • interactions between proteins • binding motifs at genes • binding of proteins to Nodes • Protein domains DNA • Functional annotation • Cause-effect data: Paths • changing environments • experimental perturbations
  • 6. Pathway reconstruction “Classical” statistical approaches: Treat the genes/proteins as random variables and explore correlation structure in the data: – Correlation graphs – Gaussian graphical models (partial correlation) – Bayesian networks Challenges/Problems/Opportunities 1. Correlation may be un-informative 2. Integrate heterogeneous and noisy and complementary data sources Review: Markowetz and Spang (2007)
  • 7. – Part 1 – Nested Effects Models
  • 8. Experimental perturbations Drugs Small molecules RNAi Protein Stress Knockout mRNA DNA Readout: Global gene expression measurements
  • 9. Drosophila immune response Columns: perturbed genes Rows: effects on other genes 1. Silencing tak1 reduces expression of all LPS- inducible transcripts 2. Silencing rel (key) or mkk4/hep reduces expression of subsets of induced transcripts (Boutros et al, Dev Cell 2002)
  • 10. (!) Two types of entities Components of signaling pathway which are experimentally perturbed Downstream effect reporters
  • 11. (!!) Only indirect information No direct observation of perturbation effects on other pathway components! Inference from observed perturbation effects on downstream reporters.
  • 12. The information gap. Direct information: effects are visible at other pathway components. Indirect information: effects are only visible at down-stream reporters (cell survival or death, growth rate, downstream genes). [Figure: the same pathway with genes A, B, C, D, drawn once for each case]
  • 13. Correlation won’t do. “Classical” approach: correlation, graphical models (Bayes nets, GGMs), mutual information. Versus: Nested Effects Models. [Figure: pathway genes A, B, C, D and their downstream regulated genes]
  • 14. Nested Effects Models. INPUT: 1. Set of candidate pathway genes. 2. High-dimensional phenotypic profile, e.g. microarray. OUTPUT: graph representation of information flow explaining the phenotypes. [Figure: phenotypic profiles (gene perturbations A to H against effects) next to the inferred pathway]
  • 15. NEM: model formulation. Pathway genes X, Y, Z: core topology, to be reconstructed = Model M. Effect reporters E1, …, E6: states are observed = Data D; positions in pathway unknown = Parameters θ. Posterior: P(M | D) = 1/Z ⋅ P(D | M) ⋅ P(M), where P(D | M) is the marginal likelihood. [Figure: expected vs. observed effect patterns of E1, …, E6 under perturbation of X, Y, Z, with false negatives (FN) and false positives (FP) marked]
  • 16. Likelihood P(D | M, θ). Compare predictions with observations. Prediction: E1=0, E2=1. Observations: 1. E1=1, E2=1; 2. E1=0, E2=1. Error probabilities: e.g. false-negative rate 20%, false-positive rate 5%. Lik = Pr(E1 = 1) ⋅ Pr(E2 = 1) ⋅ Pr(E1 = 0) ⋅ Pr(E2 = 1) = 0.05 ⋅ 0.80 ⋅ 0.95 ⋅ 0.80 [Figure: pathway genes X, Y, Z with effect reporters E1, E2]
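The likelihood on this slide can be checked with a minimal sketch that reads the stated prediction (E1=0, E2=1) and error rates (false negatives 20%, false positives 5%) literally. The function and variable names are hypothetical.

```python
def effect_probability(predicted, observed, fn_rate, fp_rate):
    """Probability of one observed reporter state given its predicted state."""
    if predicted == 1:
        # Effect predicted: seeing it is a true positive, missing it a false negative.
        return 1 - fn_rate if observed == 1 else fn_rate
    # No effect predicted: seeing one is a false positive.
    return fp_rate if observed == 1 else 1 - fp_rate

def likelihood(prediction, observations, fn_rate, fp_rate):
    """Product over replicate observations and reporters."""
    lik = 1.0
    for observation in observations:
        for reporter, predicted in prediction.items():
            lik *= effect_probability(predicted, observation[reporter], fn_rate, fp_rate)
    return lik

prediction = {"E1": 0, "E2": 1}
observations = [{"E1": 1, "E2": 1}, {"E1": 0, "E2": 1}]

lik = likelihood(prediction, observations, fn_rate=0.20, fp_rate=0.05)
print(lik)
```

Under these assumptions the four factors are 0.05 (E1 observed although not predicted), 0.80 (E2 predicted and observed), 0.95 (E1 neither predicted nor observed), and 0.80 again.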
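  • 17. Marginal likelihood: $P(D \mid M) = \int P(D \mid M, \Theta)\, P(\Theta \mid M)\, d\Theta = \frac{1}{n^m} \prod_{i=1}^{m} \sum_{j=1}^{n} \prod_{k=1}^{l} P(e_{ik} \mid M, \theta_i = j)$. Here $1/n^m$ comes from the uniform prior over positions, the outer product runs over all $m$ effect reporters, the sum averages over the $n$ possible positions in the pathway, the inner product runs over the $l$ replicate observations, and $P(e_{ik} \mid M, \theta_i = j)$ is the distribution of a single effect reporter with known position.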
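The marginal likelihood can be sketched in a few lines for binary effects. This is a toy illustration under stated assumptions, not the authors' implementation: `reach[s][j]` is a hypothetical reachability matrix (1 if perturbing gene s is predicted to affect a reporter attached to gene j), each observation is stored together with the gene that was perturbed, and the error rates are the example values from the previous slide.

```python
def effect_probability(predicted, observed, fn_rate, fp_rate):
    """Probability of one observed reporter state given its predicted state."""
    if predicted:
        return 1 - fn_rate if observed else fn_rate
    return fp_rate if observed else 1 - fp_rate

def marginal_likelihood(reach, data, fn_rate=0.20, fp_rate=0.05):
    """Average each reporter's likelihood over all positions, multiply over reporters."""
    n = len(reach)
    total = 1.0
    for observations in data:                 # product over effect reporters i
        average = 0.0
        for j in range(n):                    # uniform average over positions j
            lik = 1.0
            for s, observed in observations:  # product over observations k
                lik *= effect_probability(reach[s][j], observed, fn_rate, fp_rate)
            average += lik / n
        total *= average
    return total

X, Y = 0, 1
# Toy data: reporter 1 responds only to perturbing X, reporter 2 to both X and Y.
data = [
    [(X, 1), (Y, 0)],
    [(X, 1), (Y, 1)],
]
x_to_y = [[1, 1], [0, 1]]  # X upstream of Y (diagonal: each gene affects its own reporters)
y_to_x = [[1, 0], [1, 1]]  # Y upstream of X

score_xy = marginal_likelihood(x_to_y, data)
score_yx = marginal_likelihood(y_to_x, data)
print(score_xy, score_yx)
```

Because the effects of perturbing Y are nested inside the effects of perturbing X in this toy data, the model with X upstream of Y receives the higher score.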
  • 18. NEM: inference. Model space: all transitively closed directed graphs. Exhaustive enumeration: score all models to find the one fitting the data best (Markowetz et al., Bioinformatics 2005). MCMC, simulated annealing: take small probabilistic steps to explore model space (with A. Tresch; in preparation). Divide and conquer: break a big model into smaller, manageable pieces and then re-assemble (Markowetz et al., ISMB 2007).
  • 19. NEM: extensions. Likelihood based on log-ratios of effects. Drop transitivity requirement. Feature selection to concentrate on informative effect reporters. Tresch and Markowetz (2008).
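To illustrate why exhaustive enumeration only works for small pathways, the following sketch lists all transitively closed directed graphs on n nodes by brute force. It is an assumption-laden illustration (adjacency matrices with a fixed diagonal, hypothetical function names), not the enumeration scheme of the cited papers.

```python
from itertools import product

def is_transitively_closed(adj):
    """True if i -> j and j -> k always implies i -> k."""
    n = len(adj)
    return all(
        adj[i][k]
        for i in range(n)
        for j in range(n)
        for k in range(n)
        if adj[i][j] and adj[j][k]
    )

def enumerate_models(n):
    """Yield every transitively closed graph on n nodes as an adjacency matrix."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in product([0, 1], repeat=len(pairs)):
        # The diagonal is fixed to 1: a gene always affects its own reporters.
        adj = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
        for (i, j), bit in zip(pairs, bits):
            adj[i][j] = bit
        if is_transitively_closed(adj):
            yield adj

for n in range(1, 5):
    print(n, sum(1 for _ in enumerate_models(n)))
```

The count grows quickly with the number of pathway genes (and the brute-force loop itself visits 2^(n(n-1)) candidate graphs), which is the motivation for the MCMC and divide-and-conquer strategies listed on the slide.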
  • 21. Summary of part 1. 1. Gene perturbation screens with gene-expression readouts. 2. Perturbation screens suffer from the information gap between pathways and reporters. 3. Nested Effects Models reconstruct pathway features from subset relations between observed effects.
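The subset idea in point 3 can be sketched for noise-free data: if every reporter affected by perturbing gene B is also affected by perturbing gene A, the effects of B are nested in those of A, which suggests A is upstream of B. The reporter sets below are invented toy values that only loosely echo the Drosophila example; real Nested Effects Models score such relations probabilistically rather than by exact set inclusion.

```python
def nested_edges(effects):
    """Return (a, b) pairs where the effects of b are a subset of the effects of a."""
    edges = set()
    for a, effects_a in effects.items():
        for b, effects_b in effects.items():
            if a != b and effects_b <= effects_a:
                edges.add((a, b))
    return edges

# Hypothetical reporters r1..r4 affected by silencing each pathway gene.
effects = {
    "tak1": {"r1", "r2", "r3", "r4"},
    "rel": {"r1", "r2"},
    "mkk4": {"r3", "r4"},
}

edges = nested_edges(effects)
print(sorted(edges))
```

In this toy example tak1 ends up upstream of both rel and mkk4, while rel and mkk4 are not connected to each other because neither effect set contains the other.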
  • 22. – Part 2 – Data integration and probabilistic refinement of a signaling pathway hypothesis
  • 23. Pathway refinement. 1. Start from a given pathway hypothesis: even if our understanding of pathways is poor, that does not mean we have none at all! 2. Evaluate evidence for the hypothesis in data. 3. Identify weakly supported areas and likely extensions. Not reconstruction from scratch. Step 1: assemble a pathway hypothesis (KEGG, literature, …) for the pheromone response pathway in yeast.
  • 24. Edge data I: support for the hypothesis in protein-protein interaction data.
  • 25. Edge data II: support for the hypothesis in co-expression data.
  • 26. Edge data III: why is it so hard to reconstruct the nuclear regulatory network from correlations?
  • 27. Edge data IV: support for the hypothesis in TF-DNA binding data.
  • 28. Paths: cause-effect data. Expression profiling of knock-out mutants (Hughes et al., 2000). Result: the transcriptional response to perturbation is only visible on down-stream genes (information gap!).
  • 29. Conclusion from data analysis. • Every data source is informative for a specific compartment of the pathway. • No data source is informative in all compartments. • We expect these observations to hold for other MAPK and signaling pathways as well. We need a compartment-specific integrative model encompassing edge, node, and path data.
  • 30. Integrative model. Conditional distributions for each data type. Pathway graph as hidden/latent variables. Prior. Parameters. Different data types contribute to each compartment. The graphical model defines the posterior P(G | data) -> inference by Gibbs sampler.
  • 31. Evaluation. 1. Fit model parameters on the pheromone response pathway (training). 2. Use the fitted model on other MAPK pathways (generalization to closely related examples). 3. Use the fitted model on all other yeast signaling pathways (generalization to everything else). … work in progress …
  • 32. Acknowledgements. Nested Effects Models: Rainer Spang (Univ. Regensburg), Dennis Kostka (UC SF), Achim Tresch (Gene Center Munich), Holger Fröhlich (DKFZ Heidelberg), Tim Beißbarth (Univ. Göttingen), Josh Stuart and Charlie Vaske (UC SC). Data integration: Olga G. Troyanskaya (Princeton), Edoardo Airoldi (Harvard), David Blei (Princeton).
  • 33. Probabilistic refinement of cellular pathway models. Thank you! Florian Markowetz, florian.markowetz@cancer.org.uk