SlideShare ist ein Scribd-Unternehmen logo
1 von 68
Downloaden Sie, um offline zu lesen
Genomes and phenotypes




                 Wolfgang Huber
                      EMBL
Genome Biology Unit (Heidelberg) & EBI (Cambridge)
What makes us
       different?

Genome-wide genotyping of
individuals for O(106) common
variants, by microarray, is a
commodity.
Genome sequencing also
detects rare or private
variants, and structural
variants.

Why is that useful?
Warfarin
  First use: rat and mice killer

Anticoagulant. Prevents embolism and
thrombosis




Dose requirement ~ clinical & demographic
variables;
VKORC1 (action)
CYP2C9 (metabolism)
Herceptin
Monoclonal antibody that interferes with
the ERBB2 receptor.
Tyrosin Kinase Inhibitors
                                    •Erlotinib (Tarceva)
                                    •Imatinib (Glivec)
                                    •Gefitinib (Iressa)
                                    •Dasatinib (Sprycel)
                                    •Sunitinib (Sutent)
                                    •Nilotinib (Tasigna)
                                    •Lapatinib (Tyverb)
                                    •Sorafenib (Nexavr)
                                    •Temsirolimus (Torisel)

NSCLC: resistance to TKI therapy   heterogeneity and mutational
redundancy of the disease

Identify each patient’s specific ‘driver mutations’
E.g. Activation of EGFR by exon 19 deletion or exon 21 mutation
erlotinib and gefitinib
.... etc.
Genome-wide association studies

                                               ← thousands of people →




                ← millions of genomic loci →
      ~
Y                                                        X
Genome wide association studies:
Identifiability - additive model with no
interactions
Finding important variables (loci):
impressive
Prediction performance, effect sizes: poor
Genome wide association studies:
Identifiability - additive model with no
interactions
Finding important variables (loci):
impressive
Prediction performance, effect sizes: poor

•Have we missed important variables?
(rare polymorphisms, structural variants)
Genome wide association studies:
Identifiability - additive model with no
interactions
Finding important variables (loci):
impressive
Prediction performance, effect sizes: poor

•Have we missed important variables?
(rare polymorphisms, structural variants)
•Are we overlooking variables with rare,
strong effect (sufficient but not necessary)?
Genome wide association studies:
Identifiability - additive model with no
interactions
Finding important variables (loci):
impressive
Prediction performance, effect sizes: poor

•Have we missed important variables?
(rare polymorphisms, structural variants)
•Are we overlooking variables with rare,
strong effect (sufficient but not necessary)?
•Interactions (epistasis)
Genome wide association studies:
Identifiability - additive model with no
interactions
Finding important variables (loci):
impressive
Prediction performance, effect sizes: poor

•Have we missed important variables?
(rare polymorphisms, structural variants)
•Are we overlooking variables with rare,
strong effect (sufficient but not necessary)?
•Interactions (epistasis)
  Association based approaches do not
      have enough power - we need
   perturbation experiments on model
                systems
Forward genetics



                             wt




                   white
                           curly




                              bt
goal might be to provide a broad overview of the types         Fortunately, there is a rich
               RNAi: targeted depletion of a specific
                of genes involved in a less well-understood process, as
                was done in a C. elegans genome-wide screen for genes
                                                                           in C. elegans and Drosophila
                                                                           whole-animal RNAi screening
                     gene’s products (mRNA)
                involved in endocytosis10.                                 screens might already have bee
                                                                           it can still be useful to repeat th
                                                                           because a different range of h
                   Drosophila                  Humans                      ticular, lethal genes or those w
                                                                           missed in classical genetic sc
  T7                      Long dsRNA                    siRNA
         Long                                                              screen involves identifying w
         >150 bp                                                           phenotype rather than recover
         dsRNA
                            >100 bp                      21 bp
                                                                              Genome-wide
                                                                           easy to spot using this approac
                                Bathing                     Transfection
                                                                              “libraries”
                                                                               The range of whole-animal
                                                                           vast. These can be simple visu
                                                                           cal defects, changes in the exp
                                                                           synthetic phenotypes, sensitivity
                                                                           small molecules, or any other
                                                                           ducible output. Biological proc
                                                                           impossible to access in cell cul
                                                                           tion or organ formation and b
Dicer,                          Dicer,                                     using whole-animal RNAi sc
                                                            RISC loading
RISC loading                    RISC loading
                                                                           occur at the level of single ce
                                                                           RNAi screening.
                                                                               Cell culture-based screens
                                                                           high-throughput screening a
                                                                             Specificity
                                                                           able for dissecting basic cellul
                                                                           to whole-animal assays, cel
                                                                             Efficiency
                                                                           comparatively reductionist and
                                                                           taken to choose the appropria
                                                                             Reproducibility
                                                                           simplest cell-based assay is a h
                                                                           assay, in which the phenotype
What do human cells do when you knock down
               each gene in turn?
with F. Fuchs, C. Budjan, Michael Boutros (DKFZ)
Genomewide RNAi library (Dharmacon, 22k siRNA-pools)
HeLa cells, incubated 48h, then fixed and stained
Microscopy readout: DNA (DAPI), tubulin (Alexa), actin (TRITC)




                                              CD3EAP



                                                    Molecular Systems Biology, 2010
RNAi perturbation phenotypes are observed by
                    automated microscopy




wt-                      wt-                       wt-




CEP164                    CD3EAP                   BTDBD3


22839 wells, 4 images per well
each with DNA, tubulin, actin (1344 x 1024 pixel at 3 x 12 bit)
Segmentation




Nuclei are easy (e.g. locally adaptive threshold)
But cells touch.
How do you draw reasonable boundaries between cells?
Voronoi segmentation
Voronoi segmentation
Voronoi segmentation
Voronoi segmentation

               But we only used the
                     nuclei. The
                  boundaries are
                artificially straight.

                How can we better
               use the information
                 in the actin and
                tubulin channels?
The same in white:
     Riemann metric on the        2        2        2          2
        topographic surface
                                dr = dx + dy + g dz
                ('manifold')
                                                        g dz


Genetic Interactions
                                        dij = ω µi µj

                                         
                      
              (m, m , w) = argmin
               ˆ ˆ ˆ                            |log ddijk − w −
                                          ijk

                                                                
                               log dijk with w + mi +
                                             ˆ ˆ               mj
                                                               ˆ

                                      EBImage::propagate
                    Y = Y0 + bi xi + bij xi xj + bijk
The same in white:
     Riemann metric on the        2        2        2          2
        topographic surface
                                dr = dx + dy + g dz
                ('manifold')
                                                        g dz


Genetic Interactions
                                        dij = ω µi µj

                                         
                      
              (m, m , w) = argmin
               ˆ ˆ ˆ                            |log ddijk − w −
                                          ijk

                                                                
                               log dijk with w + mi +
                                             ˆ ˆ               mj
                                                               ˆ

                                      EBImage::propagate
                    Y = Y0 + bi xi + bij xi xj + bijk
The same in white:
     Riemann metric on the        2        2        2          2
        topographic surface
                                dr = dx + dy + g dz
                ('manifold')
                                                        g dz


Genetic Interactions
                                        dij = ω µi µj

                                         
                      
              (m, m , w) = argmin
               ˆ ˆ ˆ                            |log ddijk − w −
                                          ijk

                                                                
                               log dijk with w + mi +
                                             ˆ ˆ               mj
                                                               ˆ

                                      EBImage::propagate
                    Y = Y0 + bi xi + bij xi xj + bijk
Converting images into quantitative
              features



                           cell size               289
                           cell intensity     34.33118
                           eccentricity       0.472934
                           nucleus size       2857.356
                           DNA content        485.2710
                           actin content      0.828876
                           tubulin content    0.098647
                           actin F11          0.049594
                           actin F12          0.081746
                           actin F21          0.158817
                           actin F22          0.179339
                           tubulin F11        0.009249
                           tubulin F12        0.219697
                           …              …

                           178 features per cell
EBImage::computeFeatures
Cells are classified into predefined classes
                                                                                                                           Actin Fiber

    n                    n
                                         n                                                                           af
                                                     n                       n
                                         n                   n n                             n                                Big Cell
                     n           n                                           n
        p n                                              n
                                                                                         n                           n
                                                                                                              n
                 n           n
                                         n                                                           n                         Debris
                                                             n                           n                       n
                     n       n                   n
        bc                                                                           n
                                                                             n                               af
                                         n
             n               n                           c
                     n                                                   n
                                         n                                                   n                            Lamellipodia
                                                                                 n                       af
                                                                 c
        bc                       n           n                                               n
                     n                               n               n
n                                                n                           n                               n
                                     n                       n                                                             Metaphase
        bc                                                           n                   n       n
                         n                   n       n
 n                               n                                                           n
                                     n                           n       n
                 n           n                   n                                               m       c           n
                                                                                                                              Normal
178 features per cell
Radial-kernel SVM
Manually annotated training set of ~3000 cells
                                                                                                                          Protrusions
Accuracy: ~ 90 %
The image is now represented by a 13-dim
       vector: “phenotypic profile”


                                n              289
                                ext       34.33118
                                ecc       0.472934
                                Next      2857.356
                                Nint      485.2710
                                AtoTint   0.828876
                                NtoATsz   0.098647
                                AF %      0.049594
                                BC %      0.081746
                                C%        0.158817
                                M%        0.179339
                                LA %      0.009249
                                P%        0.219697
How do you measure
     distance and similarity
in a 13-dimensional phenotypic
          profile space?
Similarity depends on the choice and
      weighting of descriptors
stance metric learningmetric learning
             Distance                                                                                            n
                                                                                                                 ext
                                                                                                                 ecc
                                                                                                                         289
                                                                                                                         34.33118
                                                                                                                         0.472934
                                                                                                                 Next    2857.356
                                                                                                                Nint
                                                                                                                 a2i
                                                                                                                         485.2710
                                                                                                                         0.828876
                                         d(x, y) =                     |fk (xk ) − fk (yk )|                     Next2   0.098647
                                                                                                            x=   AF %    0.049594
                                                           k                                                     BC %    0.081746
                                                                                                                 C%      0.158817
                                                             1                                                   M%      0.179339
                                          fk (x) =                                                               LA %    0.009249
                                                   1 + exp (−ηk (x − αk ))                                       P%      0.219697




netic Interactions that are somehow ‘related’: EMBL STRING
 Training set: pairs of genes
             Get (η, α) by minimizing average distance between training set genes,
             keeping average distance of ω µigenes fixed.
                                    dij = all µj
                                                                                y+                     1
                          −        
                                    !−                         !+ 
                                                                       +

                        (m, m , w) = argmin
                         ˆ ˆ ˆ                                             |log ddijk − w − mi −
                                                                                  n
                                                                                                   mj | 1
             200




                   y−
                    n



                                                           ijk
             150




                          !                                        !
 Frequency




                                             log dijk with w + mi + mj
             100




                                    !
                                                           ˆ ˆ !
                                                                     ˆ
             50




                               Y = Y0 + bi xi + bij xi xj + bijk xi xj xk
             0




                              100          200       300                   400        500

                                                 n
Phenotype landscape: by graph
layout or MDS
Summary
Automated phenotyping of cells upon genetic perturbations by
microscopy and image analysis


Segmentation, feature extraction, classification, distance metric
learning, multi-dimensional scaling, clustering.


“Phenotypic map” is useful to biologists
Method is also being applied to drugs
Collaboration with Michael Boutros, German Cancer Research
Centre (Heidelberg)


Fuchs, Pau et al., Molecular Systems Biology (2010)


All data and software available at
http://www.cellmorph.org
packages EBImage and imageHTS



                                   Gregoire Pau
Focus on the analysis of genomic data
Based on R and CRAN
Six-monthly release cycle, in sync with R
Releases:
• 1.0 in March 2003 (15 packages), …,
• 2.8 in April 2011 (466 software packages)‫‏‬
What’s the added
                                value?
Complex data containers (S4 classes) for new experimental
  technologies (microarrays, sequencing) shared between
  packages - even from different authors.
metadata packages: gene annotation, pathways, genomes
experiment data packages: landmark datasets
stronger emphasis on vignette-style documentation
stricter submission review (much more could be done)
more package interdependence → releases
training courses
mailing list is amenable to software and domain (bio) questions
Push new technologies: S4, vignettes, string handling,
  computations with ranges, out-of-RAM objects
Interactive Reports
Distinguish
•interactive exploration by data analyst
•reports (presentation graphics)
Interactive Reports
Distinguish
•interactive exploration by data analyst
•reports (presentation graphics)


Everybody has a PDF reader.
Interactive Reports
Distinguish
•interactive exploration by data analyst
•reports (presentation graphics)


Everybody has a PDF reader.
Everbody has a web browser.
Interactive Reports
Distinguish
•interactive exploration by data analyst
•reports (presentation graphics)


Everybody has a PDF reader.
Everbody has a web browser.
Web browsers are turning into an operating system.
PDF viewer
HTML




       PDF viewer
arrayQualityMetrics
Reports on Quality of Microarray Datasets

effort to collect all extant, useful quality metrics for microarrays
funding by EU FP7 and by Genentech
used by public databases (EBI::ArrayExpress) to annotate their data
offerings for users




                    Example report
arrayQualityMetrics
Reports on Quality of Microarray Datasets


                •mouseover → tooltip (rendered as an
effort to collect all extant, useful quality metrics for microarrays
funding by EU FP7 andtable next to the plot)
                HTML by Genentech
used by public databases (EBI::ArrayExpress) to annotate their data
                •click → select  highlight
offerings for users
              (propagated to several plots, tables)

              •expand, collapse sections
              •use HTML elements (checkboxes) to
              control plots
                  Example report
Comments and outlook
SVG is part of HTML 5:
•linked plots and brushing
•HTML widgets as controllers (checkboxes, wheels)
SVG/HTML post-processing via the XML package

Callback processing currently in JavaScript.
Use R? On server: googleVis talk by Markus Gesmann, Diego de
Castillo; locally: browser plugin

Duncan Temple Lang’s SVGAnnotation package: works for any R
graphic (incl. base), but depends on undocumented / changeable
behavior of cairo.

Paul Murrell’s gridSVG package: cleaner and more durable
approach, based on grid graphics.
Generalisation?
arrayQualityMetrics is for microarrays

Software sees:
•a set of items (arrays)
•a set of modules that compute the sections of the report (PCA,
boxplots, scatterplots)

This could be generalised to reports on very different types of subject
matter - I will be happy to discuss this.
What makes us different?

    From Genome Wide
    Association Studies, ~400
    variants that contribute to
    common traits and diseases
    are known

    Individual and the cumulative
    effects are disappointingly
    small




2
What makes us different?

    From Genome Wide
    Association Studies, ~400
    variants that contribute to
    common traits and diseases
    are known

    Individual and the cumulative
    effects are disappointingly
                                    Epistasis, interactions
    small



                                                              ...


2
Take a step back...
Genetic interactions
• only pairwise
• for a simple phenotype
• in a simple model system
Simplest “model system”: pairwise gene knock-
  down interactions and a scalar phenotype
d                                  ΔA
                                   A            ΔB
                                                B                0
                                                               Fluc




                                   80%         62.5%          100%
Phenotype




               12.5%         25%         50%           75%     100%
Interactions




               –2                         0                           +1
               Aggravating           Interaction             Alleviating
                (negative)          score ( A,B)             (positive)
A combinatorial RNAi screen




   93 Dm kinases and phosphatases
   Each targeted by two independent dsRNA designs
   Validation of knock-down by qPCR
   96 plates (~37.000 wells)
   4.600 distinct gene pairs

            with Bernd Fischer (EMBL) and
            M. Boutros, Thomas Sandmann,
            Thomas Horn (DKFZ )

            Nature Methods 4/2011
3              01/23/11
Image analysis and feature extraction
                        (version of 2010)

 number of cells
 DAPI intensity for each cell
 DAPI area for each cell




           04/19/11
Modelling Genetic Interactions
For many phenotypes, the perturbation effects combine multiplicatively for non-
interacting genes i, j:
                                        dij = ω µi µj
... i.e. additive on a logarithmic scale

                   log dijk = w + mi + m' j +gij + εijk
                          Nij ∼ NB(λij , α(λij ))
                                 baseline         main effect        measurement error
                                                  of dsRNA j
                                                            interactiont
                        log µ
                    measurement
        (nr cells, growth rate, …) ij
                                        = s + βi0 + x βi
                                            main effect
                                                   j
                                            of dsRNA i
 €
Thus we get a matrix of interaction parameters:
 profile clustering reflects functional modules
                                                  (3
                                                 5=!
                                                  !)
                                                  #;%
                                                  #:3
                                                  ,$+1
                                                  7-#1
                                                  278/98
                                                  5#6
                                                  (+#
                                                  34
                                                  2)-
                                                  0+!1
       0.5                                        ,-./0
                                                  *+
                                                  ()
                                                  %'
                                                  #$
        #%$$(!)*%'




 gij
                                                  !




       -0.5


                                                  @)
                                                  #!
                                                  $-!)
                                                  )-?

       #                                          !-
Classification of genes by function through
               their interaction profiles
           cross-validated performance           functional prediction applied to
                  on training set                           new genes




      circle sizes ~ cross-validated posterior
      probabilities of the classifier



26             01/23/11
Classification of genes by function through
               their interactionMe Your Friends and
                                 profiles
                           Show
           cross-validated performance                  functional prediction applied to
                  on training set                I'll Tell You Who You Are
                                                                   new genes




      circle sizes ~ cross-validated posterior
      probabilities of the classifier



26             01/23/11
Genetic interactions in 3 dimensions


 Different phenotypes produce
 different sets of interactions

 For each set, significant overlap
 with known genetic interactions
 and with human interologs




14          01/23/11
Interaction matrices
number of cells         intensity        area
     nrCells                intensity     area




                  Correlation matrices
     nrCells                intensity     area
Interaction matrices
    number of cells           intensity        area
         nrCells                  intensity     area




                        Correlation matrices
         nrCells                  intensity     area

Ras/MAPK inhibitors



      JNK




             Ras/MAPK
dict the phenotype(s) for an -unseen genetic underlying
       Network learning identify the state
derlying regulatory molecular modules
                    network from the data
                                   area
                                  area     nrCells
                                          number of cells
       observed
     phenotypes (p)                                         phenotype level
     (observed)


     activity (a) of core
        hidden:
     modules (e.g. complexes,
        activation level                                    expression/
     ʻpath-waysʼ)
        network structure
     (hidden)                                               activation level
       prior knowledge
       on network structure
     binary genetic
        observed                                            gene knock down
     perturbation (g)
     (observed)                 0/1 0/1         0/1

11
Ongoing: a much bigger matrix
• Larger matrix, again Dmel2 cells
• ~1500 chromatin-related genes x 100 query genes
• full microscopic readout (4x and 20x), 3 channels:
     DAPI
     ✴

   ✴ phospho-His3 (mitosis marker)

   ✴ aTubulin (for spindle phenotypes)


• 1600 384-well plates, ~ 300.000 measurements




         ctrl dsRNA         Rho1 dsRNA        Dynein light chain dsRNA
31              01/23/11
Outlook: genetic interactions from model system
 experiments as regularisation/priors for the identification
      of genetic interactions in observational studies


                        X                       Y
← genomic loci →




                              Genetic Interactions
                                                        dij = ω µi µj

                                      ˆ ˆ ˆ
                                                                  L
                                     (m, m , w) = argmin sparsitydijk − w − mi − mj |1
                                                             |log d constraints       1
                                                          ijk      on b
                                                 log dijk with w + mi + mj
                                                               ˆ ˆ       ˆ
                      GWAS dataset


                                         Y = Y0 + bi xi + bij xi xj + bijk xi xj xk
                   ← individuals →
Summary
Quantitative, combinatorial RNAi works in metazoan cells.
Technological tour de force; data exploration, QA/QC,
normalisation and transformation....


Individual genetic interactions vs interaction profiles.
Data are high-dimensional and complicated:
• dose effects,
• different / multivariate phenotypes
• relative timing
reveal non-redundant interactions.

All data  code available from

                                  Bernd Fischer,
  Thomas Horn, Thomas Sandmann, Michael Boutros
                         Nature Methods 2011(4)
Simon Anders
                Joseph Barry
               Bernd Fischer
                Ishaan Gupta
                   Felix Klein
                Gregoire Pau
        Aleksandra Pekowska
            Paul-Theodor Pyl
             Alejandro Reyes


                Collaborators
               Lars Steinmetz
       Michael Boutros (DKFZ)
Robert Gentleman (Genentech)
                   Jan Korbel


        Michael Knop (Uni HD)
                 Jan Ellenberg
    Kathryn Lilley (Cambridge)
           Anne-Claude Gavin
           Alvis Brazma (EBI)
           Paul Bertone (EBI)
            Ewan Birney (EBI)
Ras85D and drk: concentration dependence
     strength, presence and direction of an interaction can depend on
     reagent concentration (cf. drug-drug interactions)




16             01/23/11
Sign inversion for different phenotypes




15      01/23/11
Hidden Markov Model on class labels:
            parameters summarise the data
                 Learn HMM on class labels




                            Mad2




                                                 Bub1

control
Interaction scores   Correlations
Screen Plot of Interaction Score (#cells)
                                  within
                                  screen
                               replicates
                             (cor=0.968)




                            independent
                                  daRNA
                                  designs
                              (cor=0.902)




                                between
                                  screen
                               replicates
                             (cor=0.948)




    01/23/11
Screen Plot of Read-out (Number of Cells)

                                   within
                                   screen
                                replicates
                               (cor=0.97)




                             independent
                                    dsRNA
                                   designs
                                (cor=0.90)




                                 between
                                   screen
                                replicates
                               (cor=0.95)




  01/23/11

Weitere ähnliche Inhalte

Was ist angesagt?

Detection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesDetection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesKlaas Vandepoele
 
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platformDissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platformKlaas Vandepoele
 
Glossary For Biotechnology
Glossary For BiotechnologyGlossary For Biotechnology
Glossary For Biotechnologyallyjer
 
Genetic engineering and Recombinant DNA
Genetic engineering and Recombinant DNAGenetic engineering and Recombinant DNA
Genetic engineering and Recombinant DNAHala AbuZied
 
Recombinant dna technology
Recombinant dna technologyRecombinant dna technology
Recombinant dna technologyPatel Pramit
 
rDNA technology Recombinant DNA and its enzymes
rDNA technology Recombinant DNA and its enzymesrDNA technology Recombinant DNA and its enzymes
rDNA technology Recombinant DNA and its enzymesPranav Joshi
 
Recombinant dna technology (rdt)
Recombinant dna technology (rdt)Recombinant dna technology (rdt)
Recombinant dna technology (rdt)Chhavi Agarwal
 
Whole genome sequencing of Bacillus subtilis a gram positive organism
Whole genome sequencing of Bacillus subtilis a gram positive organismWhole genome sequencing of Bacillus subtilis a gram positive organism
Whole genome sequencing of Bacillus subtilis a gram positive organismAshajyothi Mushineni
 
Rdt (recombinant dna technology)
Rdt (recombinant dna technology)Rdt (recombinant dna technology)
Rdt (recombinant dna technology)SARVJEET SHARMA
 
cloning and sub-cloning
cloning and sub-cloningcloning and sub-cloning
cloning and sub-cloningSandhya Talla
 
Techniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodTechniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodrcolatru
 
ویرایش ژنوم Genome editing tools
ویرایش ژنوم Genome editing toolsویرایش ژنوم Genome editing tools
ویرایش ژنوم Genome editing toolsshahnam azizi
 
Progress and prospects in plant genome editing
Progress and prospects in plant genome editingProgress and prospects in plant genome editing
Progress and prospects in plant genome editingAnilkumar C
 

Was ist angesagt? (20)

Detection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesDetection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomes
 
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platformDissecting plant genomes with the PLAZA 2.5 comparative genomics platform
Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform
 
Genetic engineering
Genetic engineeringGenetic engineering
Genetic engineering
 
Glossary For Biotechnology
Glossary For BiotechnologyGlossary For Biotechnology
Glossary For Biotechnology
 
Genetic engineering and Recombinant DNA
Genetic engineering and Recombinant DNAGenetic engineering and Recombinant DNA
Genetic engineering and Recombinant DNA
 
DNA Cloning
DNA CloningDNA Cloning
DNA Cloning
 
Recombinant dna technology
Recombinant dna technologyRecombinant dna technology
Recombinant dna technology
 
gene cloning principles an technique
gene cloning principles an techniquegene cloning principles an technique
gene cloning principles an technique
 
Biotech 06
Biotech 06Biotech 06
Biotech 06
 
rDNA technology Recombinant DNA and its enzymes
rDNA technology Recombinant DNA and its enzymesrDNA technology Recombinant DNA and its enzymes
rDNA technology Recombinant DNA and its enzymes
 
Recombinant dna technology (rdt)
Recombinant dna technology (rdt)Recombinant dna technology (rdt)
Recombinant dna technology (rdt)
 
Whole genome sequencing of Bacillus subtilis a gram positive organism
Whole genome sequencing of Bacillus subtilis a gram positive organismWhole genome sequencing of Bacillus subtilis a gram positive organism
Whole genome sequencing of Bacillus subtilis a gram positive organism
 
Rdt (recombinant dna technology)
Rdt (recombinant dna technology)Rdt (recombinant dna technology)
Rdt (recombinant dna technology)
 
cloning and sub-cloning
cloning and sub-cloningcloning and sub-cloning
cloning and sub-cloning
 
DNA Libraries
DNA LibrariesDNA Libraries
DNA Libraries
 
Techniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodTechniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-good
 
rDNA technology
 rDNA  technology rDNA  technology
rDNA technology
 
ویرایش ژنوم Genome editing tools
ویرایش ژنوم Genome editing toolsویرایش ژنوم Genome editing tools
ویرایش ژنوم Genome editing tools
 
Progress and prospects in plant genome editing
Progress and prospects in plant genome editingProgress and prospects in plant genome editing
Progress and prospects in plant genome editing
 
C dna
C dnaC dna
C dna
 

Ähnlich wie useR2011 - Huber

Pb Stem Cell Engineering
Pb Stem Cell EngineeringPb Stem Cell Engineering
Pb Stem Cell EngineeringJack Crawford
 
Genetic engineering
Genetic engineeringGenetic engineering
Genetic engineeringMrPolko
 
ZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGYZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGYPriyesh Waghmare
 
POST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNA
POST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNAPOST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNA
POST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNAerickmadness
 
2010 AACR 240 OncoPanel Poster
2010 AACR 240 OncoPanel Poster2010 AACR 240 OncoPanel Poster
2010 AACR 240 OncoPanel Posterkarenbernards
 
Molecular Markers: Major Applications in Insects
Molecular Markers: Major Applications in InsectsMolecular Markers: Major Applications in Insects
Molecular Markers: Major Applications in InsectsSaramita De Chakravarti
 
Current Trends in Molecular Biology and BioTechnology (ppt)
Current Trends in Molecular Biology and BioTechnology (ppt)Current Trends in Molecular Biology and BioTechnology (ppt)
Current Trends in Molecular Biology and BioTechnology (ppt)Perez Eric
 
Gene therapy 2010
Gene therapy 2010Gene therapy 2010
Gene therapy 2010tcha163
 
1 universal features of life on earth
1 universal features of life on earth1 universal features of life on earth
1 universal features of life on earthEmmanuel Aguon
 
Gene therapy ex vivo method
Gene therapy  ex vivo methodGene therapy  ex vivo method
Gene therapy ex vivo methodakash mahadev
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsgroovescience
 
Cell Biology Lecture #2
Cell Biology Lecture #2Cell Biology Lecture #2
Cell Biology Lecture #2Suk Namgoong
 
Advanced genome & epigenome editing tools.pptx
 Advanced genome & epigenome editing tools.pptx Advanced genome & epigenome editing tools.pptx
Advanced genome & epigenome editing tools.pptxberciyalgolda1
 
Fluoroscent insitu hybridizatio nppt
Fluoroscent insitu hybridizatio npptFluoroscent insitu hybridizatio nppt
Fluoroscent insitu hybridizatio npptGenevia Vincent
 
L11 dna__polymorphisms__mutations_and_genetic_diseases
L11  dna__polymorphisms__mutations_and_genetic_diseasesL11  dna__polymorphisms__mutations_and_genetic_diseases
L11 dna__polymorphisms__mutations_and_genetic_diseasesMUBOSScz
 

Ähnlich wie useR2011 - Huber (20)

Poster MQCB
Poster MQCBPoster MQCB
Poster MQCB
 
Pb Stem Cell Engineering
Pb Stem Cell EngineeringPb Stem Cell Engineering
Pb Stem Cell Engineering
 
Genetic engineering
Genetic engineeringGenetic engineering
Genetic engineering
 
ZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGYZINC FINGER NUCLEASE TECHNOLOGY
ZINC FINGER NUCLEASE TECHNOLOGY
 
POST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNA
POST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNAPOST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNA
POST-TRANSCRIPTIONAL GENE SILENCING BY DOUBLESTRANDED RNA
 
Genetic engineering
Genetic engineeringGenetic engineering
Genetic engineering
 
2010 AACR 240 OncoPanel Poster
2010 AACR 240 OncoPanel Poster2010 AACR 240 OncoPanel Poster
2010 AACR 240 OncoPanel Poster
 
Molecular Markers: Major Applications in Insects
Molecular Markers: Major Applications in InsectsMolecular Markers: Major Applications in Insects
Molecular Markers: Major Applications in Insects
 
Current Trends in Molecular Biology and BioTechnology (ppt)
Current Trends in Molecular Biology and BioTechnology (ppt)Current Trends in Molecular Biology and BioTechnology (ppt)
Current Trends in Molecular Biology and BioTechnology (ppt)
 
Gene therapy 2010
Gene therapy 2010Gene therapy 2010
Gene therapy 2010
 
Knock out mice
Knock out miceKnock out mice
Knock out mice
 
1 universal features of life on earth
1 universal features of life on earth1 universal features of life on earth
1 universal features of life on earth
 
Gene structure and expression
Gene structure and expressionGene structure and expression
Gene structure and expression
 
Gene therapy ex vivo method
Gene therapy  ex vivo methodGene therapy  ex vivo method
Gene therapy ex vivo method
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traits
 
Antisense RNA Technology Forr Crop Improvement
Antisense RNA Technology Forr Crop ImprovementAntisense RNA Technology Forr Crop Improvement
Antisense RNA Technology Forr Crop Improvement
 
Cell Biology Lecture #2
Cell Biology Lecture #2Cell Biology Lecture #2
Cell Biology Lecture #2
 
Advanced genome & epigenome editing tools.pptx
 Advanced genome & epigenome editing tools.pptx Advanced genome & epigenome editing tools.pptx
Advanced genome & epigenome editing tools.pptx
 
Fluoroscent insitu hybridizatio nppt
Fluoroscent insitu hybridizatio npptFluoroscent insitu hybridizatio nppt
Fluoroscent insitu hybridizatio nppt
 
L11 dna__polymorphisms__mutations_and_genetic_diseases
L11  dna__polymorphisms__mutations_and_genetic_diseasesL11  dna__polymorphisms__mutations_and_genetic_diseases
L11 dna__polymorphisms__mutations_and_genetic_diseases
 

Mehr von rusersla

LA R meetup - Nov 2013 - Eric Klusman
LA R meetup - Nov 2013 - Eric KlusmanLA R meetup - Nov 2013 - Eric Klusman
LA R meetup - Nov 2013 - Eric Klusmanrusersla
 
useR2011 - Whitcher
useR2011 - WhitcheruseR2011 - Whitcher
useR2011 - Whitcherrusersla
 
useR2011 - Rougier
useR2011 - RougieruseR2011 - Rougier
useR2011 - Rougierrusersla
 
useR2011 - Gromping
useR2011 - Gromping useR2011 - Gromping
useR2011 - Gromping rusersla
 
useR2011 - Edlefsen
useR2011 - EdlefsenuseR2011 - Edlefsen
useR2011 - Edlefsenrusersla
 
Los Angeles R users group - July 12 2011 - Part 1
Los Angeles R users group - July 12 2011 - Part 1Los Angeles R users group - July 12 2011 - Part 1
Los Angeles R users group - July 12 2011 - Part 1rusersla
 
Los Angeles R users group - July 12 2011 - Part 2
Los Angeles R users group - July 12 2011 - Part 2Los Angeles R users group - July 12 2011 - Part 2
Los Angeles R users group - July 12 2011 - Part 2rusersla
 
Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2rusersla
 
Los Angeles R users group - Dec 14 2010 - Part 1
Los Angeles R users group - Dec 14 2010 - Part 1Los Angeles R users group - Dec 14 2010 - Part 1
Los Angeles R users group - Dec 14 2010 - Part 1rusersla
 
Los Angeles R users group - Dec 14 2010 - Part 3
Los Angeles R users group - Dec 14 2010 - Part 3Los Angeles R users group - Dec 14 2010 - Part 3
Los Angeles R users group - Dec 14 2010 - Part 3rusersla
 
Los Angeles R users group - Dec 14 2010 - Part 2
Los Angeles R users group - Dec 14 2010 - Part 2Los Angeles R users group - Dec 14 2010 - Part 2
Los Angeles R users group - Dec 14 2010 - Part 2rusersla
 

Mehr von rusersla (11)

LA R meetup - Nov 2013 - Eric Klusman
LA R meetup - Nov 2013 - Eric KlusmanLA R meetup - Nov 2013 - Eric Klusman
LA R meetup - Nov 2013 - Eric Klusman
 
useR2011 - Whitcher
useR2011 - WhitcheruseR2011 - Whitcher
useR2011 - Whitcher
 
useR2011 - Rougier
useR2011 - RougieruseR2011 - Rougier
useR2011 - Rougier
 
useR2011 - Gromping
useR2011 - Gromping useR2011 - Gromping
useR2011 - Gromping
 
useR2011 - Edlefsen
useR2011 - EdlefsenuseR2011 - Edlefsen
useR2011 - Edlefsen
 
Los Angeles R users group - July 12 2011 - Part 1
Los Angeles R users group - July 12 2011 - Part 1Los Angeles R users group - July 12 2011 - Part 1
Los Angeles R users group - July 12 2011 - Part 1
 
Los Angeles R users group - July 12 2011 - Part 2
Los Angeles R users group - July 12 2011 - Part 2Los Angeles R users group - July 12 2011 - Part 2
Los Angeles R users group - July 12 2011 - Part 2
 
Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2
 
Los Angeles R users group - Dec 14 2010 - Part 1
Los Angeles R users group - Dec 14 2010 - Part 1Los Angeles R users group - Dec 14 2010 - Part 1
Los Angeles R users group - Dec 14 2010 - Part 1
 
Los Angeles R users group - Dec 14 2010 - Part 3
Los Angeles R users group - Dec 14 2010 - Part 3Los Angeles R users group - Dec 14 2010 - Part 3
Los Angeles R users group - Dec 14 2010 - Part 3
 
Los Angeles R users group - Dec 14 2010 - Part 2
Los Angeles R users group - Dec 14 2010 - Part 2Los Angeles R users group - Dec 14 2010 - Part 2
Los Angeles R users group - Dec 14 2010 - Part 2
 

Kürzlich hochgeladen

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Kürzlich hochgeladen (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

useR2011 - Huber

  • 1. Genomes and phenotypes Wolfgang Huber EMBL Genome Biology Unit (Heidelberg) & EBI (Cambridge)
  • 2. What makes us different? Genome-wide genotyping of individuals for O(106) common variants, by microarray, is a commodity. Genome sequencing also detects rare or private variants, and structural variants. Why is that useful?
  • 3. Warfarin First use: rat and mice killer Anticoagulant. Prevents embolism and thrombosis Dose requirement ~ clinical & demographic variables; VKORC1 (action) CYP2C9 (metabolism)
  • 4. Herceptin Monoclonal antibody that interferes with the ERBB2 receptor.
  • 5. Tyrosin Kinase Inhibitors •Erlotinib (Tarceva) •Imatinib (Glivec) •Gefitinib (Iressa) •Dasatinib (Sprycel) •Sunitinib (Sutent) •Nilotinib (Tasigna) •Lapatinib (Tyverb) •Sorafenib (Nexavr) •Temsirolimus (Torisel) NSCLC: resistance to TKI therapy heterogeneity and mutational redundancy of the disease Identify each patient’s specific ‘driver mutations’ E.g. Activation of EGFR by exon 19 deletion or exon 21 mutation erlotinib and gefitinib .... etc.
  • 6. Genome-wide association studies ← thousands of people → ← millions of genomic loci → ~ Y X
  • 7. Genome wide association studies: Identifiability - additive model with no interactions Finding important variables (loci): impressive Prediction performance, effect sizes: poor
  • 8. Genome wide association studies: Identifiability - additive model with no interactions Finding important variables (loci): impressive Prediction performance, effect sizes: poor •Have we missed important variables? (rare polymorphisms, structural variants)
  • 9. Genome wide association studies: Identifiability - additive model with no interactions Finding important variables (loci): impressive Prediction performance, effect sizes: poor •Have we missed important variables? (rare polymorphisms, structural variants) •Are we overlooking variables with rare, strong effect (sufficient but not necessary)?
  • 10. Genome wide association studies: Identifiability - additive model with no interactions Finding important variables (loci): impressive Prediction performance, effect sizes: poor •Have we missed important variables? (rare polymorphisms, structural variants) •Are we overlooking variables with rare, strong effect (sufficient but not necessary)? •Interactions (epistasis)
  • 11. Genome wide association studies: Identifiability - additive model with no interactions Finding important variables (loci): impressive Prediction performance, effect sizes: poor •Have we missed important variables? (rare polymorphisms, structural variants) •Are we overlooking variables with rare, strong effect (sufficient but not necessary)? •Interactions (epistasis) Association based approaches do not have enough power - we need perturbation experiments on model systems
  • 12. Forward genetics wt white curly bt
  • 13. goal might be to provide a broad overview of the types Fortunately, there is a rich RNAi: targeted depletion of a specific of genes involved in a less well-understood process, as was done in a C. elegans genome-wide screen for genes in C. elegans and Drosophila whole-animal RNAi screening gene’s products (mRNA) involved in endocytosis10. screens might already have bee it can still be useful to repeat th because a different range of h Drosophila Humans ticular, lethal genes or those w missed in classical genetic sc T7 Long dsRNA siRNA Long screen involves identifying w >150 bp phenotype rather than recover dsRNA >100 bp 21 bp Genome-wide easy to spot using this approac Bathing Transfection “libraries” The range of whole-animal vast. These can be simple visu cal defects, changes in the exp synthetic phenotypes, sensitivity small molecules, or any other ducible output. Biological proc impossible to access in cell cul tion or organ formation and b Dicer, Dicer, using whole-animal RNAi sc RISC loading RISC loading RISC loading occur at the level of single ce RNAi screening. Cell culture-based screens high-throughput screening a Specificity able for dissecting basic cellul to whole-animal assays, cel Efficiency comparatively reductionist and taken to choose the appropria Reproducibility simplest cell-based assay is a h assay, in which the phenotype
  • 14. What do human cells do when you knock down each gene in turn? with F. Fuchs, C. Budjan, Michael Boutros (DKFZ) Genomewide RNAi library (Dharmacon, 22k siRNA-pools) HeLa cells, incubated 48h, then fixed and stained Microscopy readout: DNA (DAPI), tubulin (Alexa), actin (TRITC) CD3EAP Molecular Systems Biology, 2010
  • 15. RNAi perturbation phenotypes are observed by automated microscopy wt- wt- wt- CEP164 CD3EAP BTDBD3 22839 wells, 4 images per well each with DNA, tubulin, actin (1344 x 1024 pixel at 3 x 12 bit)
  • 16. Segmentation Nuclei are easy (e.g. locally adaptive threshold) But cells touch. How do you draw reasonable boundaries between cells?
  • 20. Voronoi segmentation But we only used the nuclei. The boundaries are artificially straight. How can we better use the information in the actin and tubulin channels?
  • 21. The same in white: Riemann metric on the 2 2 2 2 topographic surface dr = dx + dy + g dz ('manifold') g dz Genetic Interactions dij = ω µi µj (m, m , w) = argmin ˆ ˆ ˆ |log ddijk − w − ijk log dijk with w + mi + ˆ ˆ mj ˆ EBImage::propagate Y = Y0 + bi xi + bij xi xj + bijk
  • 22. The same in white: Riemann metric on the 2 2 2 2 topographic surface dr = dx + dy + g dz ('manifold') g dz Genetic Interactions dij = ω µi µj (m, m , w) = argmin ˆ ˆ ˆ |log ddijk − w − ijk log dijk with w + mi + ˆ ˆ mj ˆ EBImage::propagate Y = Y0 + bi xi + bij xi xj + bijk
  • 23. The same in white: Riemann metric on the 2 2 2 2 topographic surface dr = dx + dy + g dz ('manifold') g dz Genetic Interactions dij = ω µi µj (m, m , w) = argmin ˆ ˆ ˆ |log ddijk − w − ijk log dijk with w + mi + ˆ ˆ mj ˆ EBImage::propagate Y = Y0 + bi xi + bij xi xj + bijk
  • 24. Converting images into quantitative features cell size 289 cell intensity 34.33118 eccentricity 0.472934 nucleus size 2857.356 DNA content 485.2710 actin content 0.828876 tubulin content 0.098647 actin F11 0.049594 actin F12 0.081746 actin F21 0.158817 actin F22 0.179339 tubulin F11 0.009249 tubulin F12 0.219697 … … 178 features per cell EBImage::computeFeatures
  • 25. Cells are classified into predefined classes Actin Fiber n n n af n n n n n n Big Cell n n n p n n n n n n n n n Debris n n n n n n bc n n af n n n c n n n n Lamellipodia n af c bc n n n n n n n n n n n n Metaphase bc n n n n n n n n n n n n n n n m c n Normal 178 features per cell Radial-kernel SVM Manually annotated training set of ~3000 cells Protrusions Accuracy: ~ 90 %
  • 26. The image is now represented by a 13-dim vector: “phenotypic profile” n 289 ext 34.33118 ecc 0.472934 Next 2857.356 Nint 485.2710 AtoTint 0.828876 NtoATsz 0.098647 AF % 0.049594 BC % 0.081746 C% 0.158817 M% 0.179339 LA % 0.009249 P% 0.219697
  • 27. How do you measure distance and similarity in a 13-dimensional phenotypic profile space?
  • 28. Similarity depends on the choice and weighting of descriptors
  • 29. stance metric learningmetric learning Distance n ext ecc 289 34.33118 0.472934 Next 2857.356 Nint a2i 485.2710 0.828876 d(x, y) = |fk (xk ) − fk (yk )| Next2 0.098647 x= AF % 0.049594 k BC % 0.081746 C% 0.158817 1 M% 0.179339 fk (x) = LA % 0.009249 1 + exp (−ηk (x − αk )) P% 0.219697 netic Interactions that are somehow ‘related’: EMBL STRING Training set: pairs of genes Get (η, α) by minimizing average distance between training set genes, keeping average distance of ω µigenes fixed. dij = all µj y+ 1 − !− !+ + (m, m , w) = argmin ˆ ˆ ˆ |log ddijk − w − mi − n mj | 1 200 y− n ijk 150 ! ! Frequency log dijk with w + mi + mj 100 ! ˆ ˆ ! ˆ 50 Y = Y0 + bi xi + bij xi xj + bijk xi xj xk 0 100 200 300 400 500 n
  • 30. Phenotype landscape: by graph layout or MDS
  • 31. Summary Automated phenotyping of cells upon genetic perturbations by microscopy and image analysis Segmentation, feature extraction, classification, distance metric learning, multi-dimensional scaling, clustering. “Phenotypic map” is useful to biologists Method is also being applied to drugs
  • 32. Collaboration with Michael Boutros, German Cancer Research Centre (Heidelberg) Fuchs, Pau et al., Molecular Systems Biology (2010) All data and software available at http://www.cellmorph.org packages EBImage and imageHTS Gregoire Pau
  • 33. Focus on the analysis of genomic data Based on R and CRAN Six-monthly release cycle, in sync with R Releases: • 1.0 in March 2003 (15 packages), …, • 2.8 in April 2011 (466 software packages)‫‏‬
  • 34. What’s the added value? Complex data containers (S4 classes) for new experimental technologies (microarrays, sequencing) shared between packages - even from different authors. metadata packages: gene annotation, pathways, genomes experiment data packages: landmark datasets stronger emphasis on vignette-style documentation stricter submission review (much more could be done) more package interdependence → releases training courses mailing list is amenable to software and domain (bio) questions Push new technologies: S4, vignettes, string handling, computations with ranges, out-of-RAM objects
  • 35. Interactive Reports Distinguish •interactive exploration by data analyst •reports (presentation graphics)
  • 36. Interactive Reports Distinguish •interactive exploration by data analyst •reports (presentation graphics) Everybody has a PDF reader.
  • 37. Interactive Reports Distinguish •interactive exploration by data analyst •reports (presentation graphics) Everybody has a PDF reader. Everbody has a web browser.
  • 38. Interactive Reports Distinguish •interactive exploration by data analyst •reports (presentation graphics) Everybody has a PDF reader. Everbody has a web browser. Web browsers are turning into an operating system.
  • 40. HTML PDF viewer
  • 41. arrayQualityMetrics Reports on Quality of Microarray Datasets effort to collect all extant, useful quality metrics for microarrays funding by EU FP7 and by Genentech used by public databases (EBI::ArrayExpress) to annotate their data offerings for users Example report
  • 42. arrayQualityMetrics Reports on Quality of Microarray Datasets •mouseover → tooltip (rendered as an effort to collect all extant, useful quality metrics for microarrays funding by EU FP7 andtable next to the plot) HTML by Genentech used by public databases (EBI::ArrayExpress) to annotate their data •click → select highlight offerings for users (propagated to several plots, tables) •expand, collapse sections •use HTML elements (checkboxes) to control plots Example report
  • 43. Comments and outlook SVG is part of HTML 5: •linked plots and brushing •HTML widgets as controllers (checkboxes, wheels) SVG/HTML post-processing via the XML package Callback processing currently in JavaScript. Use R? On server: googleVis talk by Markus Gesmann, Diego de Castillo; locally: browser plugin Duncan Temple Lang’s SVGAnnotation package: works for any R graphic (incl. base), but depends on undocumented / changeable behavior of cairo. Paul Murrell’s gridSVG package: cleaner and more durable approach, based on grid graphics.
  • 44. Generalisation? arrayQualityMetrics is for microarrays Software sees: •a set of items (arrays) •a set of modules that compute the sections of the report (PCA, boxplots, scatterplots) This could be generalised to reports on very different types of subject matter - I will be happy to discuss this.
  • 45. What makes us different? From Genome Wide Association Studies, ~400 variants that contribute to common traits and diseases are known Individual and the cumulative effects are disappointingly small 2
  • 46. What makes us different? From Genome Wide Association Studies, ~400 variants that contribute to common traits and diseases are known Individual and the cumulative effects are disappointingly Epistasis, interactions small ... 2
  • 47. Take a step back... Genetic interactions • only pairwise • for a simple phenotype • in a simple model system
  • 48. Simplest “model system”: pairwise gene knock- down interactions and a scalar phenotype d ΔA A ΔB B 0 Fluc 80% 62.5% 100% Phenotype 12.5% 25% 50% 75% 100% Interactions –2 0 +1 Aggravating Interaction Alleviating (negative) score ( A,B) (positive)
  • 49. A combinatorial RNAi screen  93 Dm kinases and phosphatases  Each targeted by two independent dsRNA designs  Validation of knock-down by qPCR  96 plates (~37.000 wells)  4.600 distinct gene pairs with Bernd Fischer (EMBL) and M. Boutros, Thomas Sandmann, Thomas Horn (DKFZ ) Nature Methods 4/2011 3 01/23/11
  • 50. Image analysis and feature extraction (version of 2010)  number of cells  DAPI intensity for each cell  DAPI area for each cell 04/19/11
  • 51. Modelling Genetic Interactions For many phenotypes, the perturbation effects combine multiplicatively for non- interacting genes i, j: dij = ω µi µj ... i.e. additive on a logarithmic scale log dijk = w + mi + m' j +gij + εijk Nij ∼ NB(λij , α(λij )) baseline main effect measurement error of dsRNA j interactiont log µ measurement (nr cells, growth rate, …) ij = s + βi0 + x βi main effect j of dsRNA i €
  • 52. Thus we get a matrix of interaction parameters: profile clustering reflects functional modules (3 5=! !) #;% #:3 ,$+1 7-#1 278/98 5#6 (+# 34 2)- 0+!1 0.5 ,-./0 *+ () %' #$ #%$$(!)*%' gij ! -0.5 @) #! $-!) )-? # !-
  • 53. Classification of genes by function through their interaction profiles cross-validated performance functional prediction applied to on training set new genes circle sizes ~ cross-validated posterior probabilities of the classifier 26 01/23/11
  • 54. Classification of genes by function through their interactionMe Your Friends and profiles Show cross-validated performance functional prediction applied to on training set I'll Tell You Who You Are new genes circle sizes ~ cross-validated posterior probabilities of the classifier 26 01/23/11
  • 55. Genetic interactions in 3 dimensions Different phenotypes produce different sets of interactions For each set, significant overlap with known genetic interactions and with human interologs 14 01/23/11
  • 56. Interaction matrices number of cells intensity area nrCells intensity area Correlation matrices nrCells intensity area
  • 57. Interaction matrices number of cells intensity area nrCells intensity area Correlation matrices nrCells intensity area Ras/MAPK inhibitors JNK Ras/MAPK
  • 58. dict the phenotype(s) for an -unseen genetic underlying Network learning identify the state derlying regulatory molecular modules network from the data area area nrCells number of cells observed phenotypes (p) phenotype level (observed) activity (a) of core hidden: modules (e.g. complexes, activation level expression/ ʻpath-waysʼ) network structure (hidden) activation level prior knowledge on network structure binary genetic observed gene knock down perturbation (g) (observed) 0/1 0/1 0/1 11
  • 59. Ongoing: a much bigger matrix • Larger matrix, again Dmel2 cells • ~1500 chromatin-related genes x 100 query genes • full microscopic readout (4x and 20x), 3 channels: DAPI ✴ ✴ phospho-His3 (mitosis marker) ✴ aTubulin (for spindle phenotypes) • 1600 384-well plates, ~ 300.000 measurements ctrl dsRNA Rho1 dsRNA Dynein light chain dsRNA 31 01/23/11
  • 60. Outlook: genetic interactions from model system experiments as regularisation/priors for the identification of genetic interactions in observational studies X Y ← genomic loci → Genetic Interactions dij = ω µi µj ˆ ˆ ˆ L (m, m , w) = argmin sparsitydijk − w − mi − mj |1 |log d constraints 1 ijk on b log dijk with w + mi + mj ˆ ˆ ˆ GWAS dataset Y = Y0 + bi xi + bij xi xj + bijk xi xj xk ← individuals →
  • 61. Summary Quantitative, combinatorial RNAi works in metazoan cells. Technological tour de force; data exploration, QA/QC, normalisation and transformation.... Individual genetic interactions vs interaction profiles. Data are high-dimensional and complicated: • dose effects, • different / multivariate phenotypes • relative timing reveal non-redundant interactions. All data code available from Bernd Fischer, Thomas Horn, Thomas Sandmann, Michael Boutros Nature Methods 2011(4)
  • 62. Simon Anders Joseph Barry Bernd Fischer Ishaan Gupta Felix Klein Gregoire Pau Aleksandra Pekowska Paul-Theodor Pyl Alejandro Reyes Collaborators Lars Steinmetz Michael Boutros (DKFZ) Robert Gentleman (Genentech) Jan Korbel Michael Knop (Uni HD) Jan Ellenberg Kathryn Lilley (Cambridge) Anne-Claude Gavin Alvis Brazma (EBI) Paul Bertone (EBI) Ewan Birney (EBI)
  • 63. Ras85D and drk: concentration dependence strength, presence and direction of an interaction can depend on reagent concentration (cf. drug-drug interactions) 16 01/23/11
  • 64. Sign inversion for different phenotypes 15 01/23/11
  • 65. Hidden Markov Model on class labels: parameters summarise the data Learn HMM on class labels Mad2 Bub1 control
  • 66. Interaction scores Correlations
  • 67. Screen Plot of Interaction Score (#cells) within screen replicates (cor=0.968) independent daRNA designs (cor=0.902) between screen replicates (cor=0.948) 01/23/11
  • 68. Screen Plot of Read-out (Number of Cells) within screen replicates (cor=0.97) independent dsRNA designs (cor=0.90) between screen replicates (cor=0.95) 01/23/11