2. Lab for Bioinformatics and
computational genomics
10 “genome hackers”
mostly engineers (statistics)
42 scientists
technicians, geneticists, clinicians
>100 people
hardware engineers,
mathematicians, molecular biologists
3. Overview
Personalized Medicine,
Biomarkers …
… Molecular Profiling
First Generation Molecular Profiling
Next Generation Molecular Profiling
Next Generation Epigenetic Profiling
Concluding Remarks
8. Personalized Medicine
• The use of diagnostic tests (aka biomarkers) to identify in advance
which patients are likely to respond well to a therapy
• The benefits of this approach are to
– avoid adverse drug reactions
– improve efficacy
– adjust the dose to suit the patient
– differentiate a product in a competitive market
– meet future legal or regulatory requirements
• Potential uses of biomarkers
– Risk assessment
– Initial/early detection
– Prognosis
– Prediction/therapy selection
– Response assessment
– Monitoring for recurrence
9. Biomarker
First used in 1971 … An objective and
« predictive » measure … at the molecular
level … of normal and pathogenic processes
and responses to therapeutic interventions
Characteristic that is objectively measured and
evaluated as an indicator of normal biologic
or pathogenic processes or pharmacologic
response to a drug
A biomarker is valid if:
– It can be measured in a test system with well
established performance characteristics
– Evidence for its clinical significance has been
established
10. Rationale 1:
Why now? The regulatory path is becoming clearer
There is more at stake than
efficient drug
development. FDA
« critical path initiative »
Pharmacogenomics
guideline
Biomarkers are the
foundation of « evidence
based medicine » - who
should be treated, how
and with what.
Without biomarkers,
advances in targeted
therapy will be limited and
treatment will remain largely
empirical. It is imperative
that biomarker
development be
accelerated along with
therapeutics.
11. Why now ?
First-generation and maturing second-generation molecular
profiling methodologies make it possible to stratify clinical
trial participants on a pharmacogenomic basis: include those
most likely to benefit from the drug candidate and exclude
those who likely will not.
Clinical trials should attain more specific results
with smaller numbers of patients. Smaller
numbers mean lower costs (by a factor of 2-10).
An additional benefit for trial participants and
institutional review boards (IRBs) is that
stratification, given the correct biomarker, may
reduce or eliminate adverse events.
12. Molecular Profiling
The study of specific patterns (fingerprints) of proteins,
DNA, and/or mRNA and how these patterns correlate
with an individual's physical characteristics or
symptoms of disease.
13. Generic Health advice
• Exercise (Hypertrophic Cardiomyopathy)
• Drink your milk (MCM6 Lactose intolerance)
• Eat your green beans (glucose-6-phosphate
dehydrogenase Deficiency)
• & your grains (HLA-DQ2 – Celiac disease)
• & your iron (HFE - Hemochromatosis)
• Get more rest (HLA-DR2 - Narcolepsy)
14. Generic Health advice (UNLESS)
• Exercise (Hypertrophic Cardiomyopathy)
• Drink your milk (MCM6 Lactose intolerance)
• Eat your green beans (glucose-6-phosphate
dehydrogenase Deficiency)
• & your grains (HLA-DQ2 – Celiac disease)
• & your iron (HFE - Hemochromatosis)
• Get more rest (HLA-DR2 - Narcolepsy)
26. First Generation Molecular Profiling
• Flow cytometry correlates surface markers,
cell size and other parameters
• Circulating tumor cell (CTC) assays
quantitate the number of tumor cells in the
peripheral blood.
• Exosomes are 30-90 nm vesicles secreted by
a wide range of mammalian cell types.
• Immunohistochemistry (IHC) measures
protein expression, usually on the cell
surface.
30. First Generation Molecular Profiling
• Gene sequencing for mutation detection
• Microarray for mRNA message detection
• RT-PCR for gene expression
• FISH analysis for gene copy number
• Comparative Genomic Hybridization (CGH) for
gene copy number
31. Basics of the “old” technology
• Clone the DNA.
• Generate a ladder of labeled (colored)
molecules that are different by 1 nucleotide.
• Separate mixture on some matrix.
• Detect fluorochrome by laser.
• Interpret peaks as string of DNA.
• Strings are 500 to 1,000 letters long
• 1 machine generates 57,000 nucleotides/run
• Assemble all strings into a genome.
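To put the per-run figure in perspective, the arithmetic below estimates how many runs of one machine would be needed to read a genome once over. This is only a back-of-the-envelope sketch: the 57,000 nucleotides/run figure is from the slide, while the genome size of 3.2 billion letters and the `runs_needed` helper are assumptions added for illustration.

```python
import math

NUCLEOTIDES_PER_RUN = 57_000   # from the slide: one machine, one run
GENOME_SIZE = 3_200_000_000    # assumption: approximate human genome size in letters


def runs_needed(genome_size: int, coverage: float = 1.0,
                per_run: int = NUCLEOTIDES_PER_RUN) -> int:
    """Machine runs needed to read `genome_size` letters `coverage` times over."""
    return math.ceil(genome_size * coverage / per_run)


single_pass_runs = runs_needed(GENOME_SIZE)
print(single_pass_runs)
```

Real projects read each position several times, so `coverage` would be set well above 1 and the run count scales accordingly.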
33. Genetic Variation
Among People
Single nucleotide polymorphisms
(SNPs)
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
0.1% difference among
people
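The two example sequences on this slide differ at a single letter. A minimal sketch of spotting such a difference, assuming the two sequences are already aligned and of equal length (the function name `snp_positions` is made up for illustration):

```python
def snp_positions(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, letter in seq_a, letter in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a != b]


person_1 = "GATTTAGATCGCGATAGAG"
person_2 = "GATTTAGATCTCGATAGAG"
snps = snp_positions(person_1, person_2)
print(snps)  # [(11, 'G', 'T')]
```

On this 19-letter toy example one mismatch is far more than 0.1%; the 0.1% figure refers to the genome as a whole.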
40. Basics of the “new” technology
• Get DNA.
• Attach it to something.
• Extend and amplify signal with some color
scheme.
• Detect fluorochrome by microscopy.
• Interpret series of spots as short strings of
DNA.
• Strings are 30-300 letters long
• Multiple images are interpreted as 0.4 to 1.2
Gb/run (1,200,000,000 letters/day).
• Map or align strings to one or many genomes.
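The last step, mapping short strings back to a genome, can be sketched as a seed lookup followed by an exact comparison. This is a toy illustration, not how production aligners work: the reference string is made up, mismatches and the reverse strand are ignored, and `build_index` and `map_read` are hypothetical helper names.

```python
from collections import defaultdict


def build_index(reference: str, k: int) -> dict[str, list[int]]:
    """Record every start position of every k-letter substring of the reference."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read: str, reference: str, index: dict[str, list[int]], k: int) -> list[int]:
    """Return 0-based positions where the read matches the reference exactly."""
    if len(read) < k:
        return []
    hits = []
    # Use the first k letters of the read as a seed, then verify the full read.
    for start in index.get(read[:k], []):
        if reference[start:start + len(read)] == read:
            hits.append(start)
    return hits


reference = "ACGTTGCATGACCGTAAGCTTAGGCATCGA"  # made-up toy genome
index = build_index(reference, k=4)
print(map_read("GACCGTAA", reference, index, k=4))  # one location
print(map_read("GCAT", reference, index, k=4))      # two locations: too short to be unique
```

The second call shows why very short strings are ambiguous: the same four letters occur twice in the toy reference.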
47. Read Length is Not As Important For Resequencing
[Figure: percentage of paired k-mers with a uniquely assignable location (0% to 100%) plotted against k-mer read length (8 to 20 bp), for E. coli and human. Credit: Jay Shendure]
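The quantity on the vertical axis can be illustrated on a toy scale. The sketch below computes, for a made-up sequence, the fraction of k-mers that occur exactly once. It uses single (not paired) k-mers and a tiny string rather than the E. coli or human genome, so it only illustrates the trend, not the figure's values.

```python
from collections import Counter


def unique_kmer_fraction(sequence: str, k: int) -> float:
    """Fraction of k-mer positions whose k-mer occurs exactly once in the sequence."""
    if k < 1 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / len(kmers)


toy_sequence = "ATATATGC"  # made-up sequence with a short repeat
for k in (2, 4, 6):
    print(k, unique_kmer_fraction(toy_sequence, k))
```

Even in this tiny example the uniquely placeable fraction rises quickly with k and then saturates, which is the point of the slide title.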
57. Second Generation DNA profiling
• Enrichment Sequencing
• ChIP-Seq (Chromatin
Immunoprecipitation sequencing)
• A substitute for ChIP-chip
• E.g. to find the binding sequence of
proteins (TFBS)
58. Paired End Reads are Important!
[Figure labels: Read 1, Read 2, Known Distance, Unique DNA, Repetitive DNA]
Single read maps to multiple positions
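A minimal sketch of the idea, under simplifying assumptions: both reads match exactly and lie on the same strand, and the distance between their start positions is known to within a small tolerance. The reference, the reads, and the helpers `find_all` and `place_pair` are all made up for illustration.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every 0-based position where the read occurs in the reference."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def place_pair(reference: str, read_1: str, read_2: str,
               expected_distance: int, tolerance: int = 2) -> list[tuple[int, int]]:
    """Keep only the (read_1, read_2) placements consistent with the known distance."""
    placements = []
    for p1 in find_all(reference, read_1):
        for p2 in find_all(reference, read_2):
            if abs((p2 - p1) - expected_distance) <= tolerance:
                placements.append((p1, p2))
    return placements


repeat = "GATTACA"  # toy repetitive element, present twice
reference = "CCGTTAGC" + repeat + "TGCCATGG" + repeat + "AACGTCTC"
read_1 = "CCGTTAGC"  # falls in unique DNA
read_2 = repeat      # falls in repetitive DNA

print(find_all(reference, read_2))                                # ambiguous on its own
print(place_pair(reference, read_1, read_2, expected_distance=8))  # resolved by its mate
```

On its own, `read_2` has two candidate locations; requiring it to sit the expected distance from `read_1` leaves one.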
60. Second Generation DNA profiling
• Exome Sequencing (also known as
targeted exome capture) is an
efficient strategy to selectively
sequence the coding regions of the
genome to identify novel genes
associated with rare and common
disorders.
• 160K exons
65. Second Generation RNA profiling
Besides the 6000 protein-coding genes …
140 ribosomal RNA genes
275 transfer RNA genes
40 small nuclear RNA genes
>100 small nucleolar RNA genes
Function of RNA genes
pRNA in the φ29 rotary packaging motor (Simpson
et al. Nature 408:745-750, 2000)
Cartilage-hair hypoplasia mapped to an RNA
(Ridanpää et al. Cell 104:195-203, 2001)
The human Prader-Willi critical region (Cavaillé
et al. PNAS 97:14035-7, 2000)
66. Second Generation RNA profiling
RNA genes can be hard to detect
UGAGGUAGUAGGUUGUAUAGU
C. elegans let-7; 21 nt
(Pasquinelli et al. Nature 408:86-89, 2000)
Often small
Sometimes multicopy and redundant
Often not polyadenylated
(not represented in ESTs)
Immune to frameshift and nonsense
mutations
No open reading frame, no codon bias
Often evolving rapidly in primary sequence
67. Second Generation RNA profiling
Although details of the methods vary, the concept
behind RNA-seq is simple:
• isolate all mRNA
• convert to cDNA using reverse transcriptase
• sequence the cDNA
• map sequences to the genome
The more times a given sequence is detected, the
more abundantly transcribed it is. If enough
sequences are generated, a comprehensive and
quantitative view of the entire transcriptome of an
organism or tissue can be obtained.
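The counting step can be sketched as follows, assuming the reads have already been mapped and each is represented by a single genome position. The gene names, coordinates, and read positions are invented for illustration, and `count_reads_per_gene` is a hypothetical helper, not part of any RNA-seq package.

```python
def count_reads_per_gene(read_positions: list[int],
                         genes: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Count mapped reads falling inside each gene's [start, end) interval."""
    counts = {name: 0 for name in genes}
    for pos in read_positions:
        for name, (start, end) in genes.items():
            if start <= pos < end:
                counts[name] += 1
                break  # toy annotation: genes do not overlap
    return counts


# Hypothetical annotation: gene name -> (start, end), end exclusive.
genes = {"geneA": (0, 1000), "geneB": (2000, 2500)}
# Hypothetical mapped read positions; 5000 falls outside every gene.
read_positions = [10, 250, 990, 2100, 2200, 5000]

counts = count_reads_per_gene(read_positions, genes)
print(counts)  # {'geneA': 3, 'geneB': 2}
```

Raw counts like these are the starting point; comparing genes or samples fairly usually also requires normalizing for gene length and sequencing depth, which this sketch leaves out.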
68. Second Generation RNA profiling
• Compared with microarrays
– Microarray
• Closed technology: prior knowledge required
• Affected by pseudogenes (homologs of real genes)
• Low sensitivity
– RNA-Seq
• Open technology: no prior knowledge required
• Not affected by pseudogenes because the exact
sequence is measured
• Yields additional information (SNPs, alternative
splicing)
71. Mapping Structural Variation in Humans
CNVs: >1 kb segments
- Thought to be common:
12% of the genome
(Redon et al. 2006)
- Likely involved in phenotype
variation and disease
- Until recently most methods for
detection were low resolution
(>50 kb)
79. Second Generation Protein profiling
• MS/MS-based proteomics:
exclusively in discovery mode
• Automate diagnostics assay
generation (next generation
proteomics)
• Aptamers as alternative to antibodies
• ImmunoPCR
83. Defining Epigenetics
• Reversible changes in gene expression/function
• Without changes in DNA sequence
• Can be inherited from precursor cells
• Allows integration of intrinsic and environmental signals (including diet)
[Diagram labels: Genome, DNA, Chromatin, Epigenome, Gene Expression, Phenotype]
CONFIDENTIAL
Methylation I Epigenetics | Oncology | Biomarker
I NEXT-GEN | PharmacoDX | CRC
85. Epigenetic Regulation:
Post Translational Modifications to Histones and Base Changes in DNA
Epigenetic modifications of histones and DNA include:
– Histone acetylation and methylation, and DNA methylation
[Figure labels: Histone Methylation (Me), Histone Acetylation (Ac), DNA Methylation]
87. MGMT Biology
O6-Methylguanine Methyltransferase
Essential DNA repair enzyme
Removes alkyl groups from damaged guanine bases
Healthy individual:
- MGMT is an essential DNA repair enzyme
- Loss of MGMT activity makes individuals susceptible
to DNA damage and prone to tumor development
Glioblastoma patient on alkylator chemotherapy:
- Patients with MGMT promoter methylation have longer
progression-free survival (PFS) and overall survival (OS)
when alkylating agents are used as chemotherapy
88. MGMT Promoter Methylation Predicts Benefit from DNA-Alkylating Chemotherapy
Post-hoc subgroup analysis of a temozolomide clinical trial in primary glioblastoma
patients shows benefit for patients with MGMT promoter methylation
[Figure: bar chart of median overall survival (months, axis 0 to 25) for non-methylated vs. methylated MGMT gene, with radiotherapy and radiotherapy plus temozolomide arms; values shown: 12.7 months and 21.7 months. Adapted from Hegi et al. NEJM 2005 352(10):1036-8. Study with 207 patients.]
89. Genome-wide methylation
by methylation sensitive restriction enzymes
91. MBD_Seq
[Figure labels: Condensed Chromatin, DNA Sheared, Immobilized Methyl Binding Domain]
92. MBD_Seq
Immobilized methyl binding domain
MgCl2
Next Gen Sequencing
GA Illumina: 100 million reads
94. Bioinformatics, a life science discipline … management of expectations
[Diagram labels: Math; Computer Science; Theoretical Biology; NP; AI, Image Analysis; Datamining; structure prediction (HTX); Bioinformatics; Discovery Informatics – Computational Genomics; Interface Design; Expert Annotation; Sequence Analysis; (Molecular); Informatics; Biology; Computational Biology]
95. Translational Medicine: An inconvenient truth
• 1% of genome codes for proteins, however
more than 90% is transcribed
• Less than 10% of experimentally measured
protein can be “explained” from the
genome
• 1 genome ? Structural variation
• > 200 Epigenomes ??
• Space/time continuum …