1. Healthy Environments and Consumer Safety Branch
General Principle of Toxicogenomics
Carole Yauk
Environmental Health Sciences and Research Bureau
Health Canada
2. Outline
1. General genomics
2. What is toxicogenomics?
3. Overview of microarray technologies
4. Data handling and data analysis
5. Experimental design
6. An example from our lab
7. Conclusions and needs
3. Healthy Environments and Consumer Safety Branch
Genome = all of an individual organism's genes
Genomics = the study of all of the genes of a cell or tissue at the DNA, RNA and protein level
4. Genome and Epigenome
DNA sequence
DNA methylation
Histones and histone modification
Credit: Moving AHEAD with an international human epigenome project. Nature 454, 711-715 (2008)
6. MicroRNAs: the newest piece of the puzzle
Helping the people of Canada maintain and improve their health
Controls mRNA translation by mRNA degradation or translational repression
Source: microRNAs join the p53 network — another piece in the tumour-suppression puzzle
Lin He, Xingyue He, Scott W. Lowe & Gregory J. Hannon
Nature Reviews Cancer 7, 819-822 (November 2007)
7. A single microRNA controls many mRNA products
[Figure: one miRNA targeting multiple mRNAs]
8. Toxicogenomics
[Figure: Treatment, Response, Disease; measured at the level of the Genome (DNA), Transcriptome (RNA) and Proteome (Protein)]
9. FOCUS: mRNA (gene expression)
Gene expression PRECEDES protein changes and toxicity
Changes in gene expression are measurable at low doses
10. Chemicals perturb gene expression
Example: aryl hydrocarbon receptor agonists
Source: Miller and Ramos, Drug Metabolism Reviews, 2001
11. Genes are part of pathways that carry out cellular functions
Source: www.rndsystems.com/mini_review_detail_objectname_MR03_DNADamageResponse.aspx
12. Identify perturbed genes and their pathways/functions
Elevated and Prolonged Lead Exposure in Fisher 344 Rats Leads to Marked
Hepatic Differentially Expressed Genes. Gato and Means, 2010.
13. Associate genes with biological pathways and processes
Perturbations in specific pathways lead to disease
Source: Kyoto Encyclopedia of Genes and Genomes (KEGG)
14. Applications
• Deciphering mechanism of action (pathway analysis) of toxicant
• Response at low doses
• Revealing potentially novel health effects
• Identification of perturbed pathways – targeted follow-up
• Biomarker discovery
• Investigating assumptions in toxicology
• Predictive toxicogenomics
15. Gene Expression
Analytical methods to study mRNA transcription
• Gene-by-gene analysis: Northern blotting, RT-PCR, qRT-PCR
• PUBLICATION OF GENOMES
• DNA microarrays
• Real-time PCR arrays
16. Microarray Technology
[Figure: workflow]
Cells of interest
mRNA (“target”) isolation and labelling
Hybridize to the microscope slide & wash
Laser scanning
18. Two-colour reference design
[Figure: workflow]
Inputs: biological sample RNA; Universal Mouse Reference RNA; external control RNA
Labelling: Cy3-labelled cRNA and Cy5-labelled cRNA
Array hybridization
Fluorescence detection and image analysis
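In a two-colour reference design, each probe is usually summarized as a log ratio of the sample channel to the reference channel. The sketch below is illustrative only: the slide does not say which dye labels which RNA, so the function simply takes a "sample" list and a "reference" list, and the intensities are made-up numbers.

```python
from math import log2

def log_ratios(sample, reference):
    """Per-probe log2(sample / reference) for one two-colour array.

    Assumption: both lists hold positive, background-corrected intensities
    in the same probe order. Which dye (Cy3 or Cy5) carries the sample is
    not stated in the slides, so it is left to the caller.
    """
    if len(sample) != len(reference):
        raise ValueError("channels must have the same number of probes")
    return [log2(s / r) for s, r in zip(sample, reference)]

# Hypothetical intensities: probe 1 is 2x higher in the sample,
# probe 2 is unchanged, probe 3 is 2x lower.
m_values = log_ratios([200, 100, 50], [100, 100, 100])
```

Because every sample is hybridized against the same reference RNA, these log ratios can be compared across arrays.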
19. Expression profiling using Affymetrix GeneChips
Source: Affymetrix.com
20. DNA microarrays: poor reproducibility in the early days led to a bad rap
Publication | Platforms | Probe ID | Validation | Authors’ Conclusion
Kane et al., 2000 | Operon 50mer, cDNA | Sequence similarity | None | Agreement
Hughes et al., 2001 | Agilent oligo, cDNA | Sequence similarity | None | Agreement
Yuen et al., 2002 | Affymetrix, custom cDNA | Sequence similarity | QRT-PCR | Agreement
Kuo et al., 2002 | cDNA versus Affymetrix | Sequence similarity | None |
Kothapalli et al., 2002 | Incyte cDNA, Affymetrix | Sequence similarity | Northern |
Li et al., 2002 | Affymetrix, Incyte cDNA | Unigene or Genbank | QRT-PCR |
Barczak et al., 2003 | Affymetrix, Operon 70mer | Unigene ID | None | Agreement
Carter et al., 2003 | Agilent 60mer, cDNA | Sequence matched | QRT-PCR | Agreement
Wang et al., 2003 | Custom oligo and cDNA | Sequence similarity | RT-PCR | Agreement
Rogojina et al., 2003 | Affymetrix, Clontech cDNA | Genbank ID | QRT-PCR and Q-immunoblot |
Tan et al., 2003 | Agilent cDNA, Affy, Amersham 30mer | Genbank ID | None |
Meecham et al., 2004 | Agilent cDNA, Affymetrix | Sequence matched | None | Agreement
Mah et al., 2004 (*two different labs) | cDNA array, Affymetrix | Unigene (sequence verified) | QRT-PCR |
Järvinen et al., 2004 | Affymetrix, Agilent cDNA, custom cDNA | Unigene ID | None |
21. EARLY PROBLEMS
Non-specific (or incorrect) probes.
Incorrect annotation.
Poor printing technology.
Sub-optimal protocols.
22. Problems with statistical analysis and experimental design
Lenient filtering methods for poor or low-intensity spots.
Incorrect probe matching across platforms.
Improper data handling (e.g., normalization).
Incorrect statistical analysis.
Insufficient biological replication.
23. Improved reproducibility after 2004
Publication | Platforms | Probe ID | Validation | Authors’ Conclusion
Yauk et al., 2004 | Codelink, Agilent cDNA, Agilent Oligo, NIA cDNA, Mergen, Affymetrix | Unigene ID | None | Dependent on platform – good platforms correlate
Shippy et al., 2004 | Affymetrix, Amersham | Unigene ID | Real Time RT-PCR | Agreement after noise adjusted
Irizarry et al., 2005 (*Lab-lab comparison, 10 labs) | Affymetrix (5 labs), cDNA (3 labs), 2 colour Oligo (2 labs) | Unigene, LocusLink, RefSeq | Real Time RT-PCR | Agreement among best performing labs
Larkin et al., 2005 | Affymetrix, TIGR cDNA | Sequence mapped (TIGR) | Real Time RT-PCR | Agreement
TRC Group, 2005 (*Lab-lab comparison, 7 labs, 12 platforms) | 5 custom cDNA, Amersham, Compugen, Agilent, Affy, Operon, 2 custom Oligo | Transcripts matched using NIA mouse index | None | Moderate agreement (standardized protocols and data analysis required)
Pylatuik et al., 2005 | Genomic Amplicon Arrays, Operon Oligo, Affymetrix | Locus ID | Northern blot | Moderate agreement (signal intensity-dependent)
Shi et al., 2005 | Tan et al., 2003 dataset | Genbank Acc. No. | N/A | Alternate analysis had 10X +concordance.
Barnes et al., 2005 | Affymetrix, Illumina BeadArrays | Sequence matched using BLAST | None | Agreement
Carter et al., 2005 | Affymetrix, Stanford cDNA | Sequence matching | None | Agreement (overlapping probes)
Schlingemann et al., 2005 | Affymetrix, In-house long Oligo | Unigene ID | Real Time RT-PCR | Agreement
Warnat et al., 2005 | 6 different cDNA and oligo array studies previously published | Unigene ID | N/A | Agreement (more platforms better for predictive analysis)
Ali-Seyed et al., 2006 | Affymetrix, Applied Biosystems | Promoter Analysis | Real Time RT-PCR | AB more sensitive/ correlated RT-PCR.
Severgnini et al., 2006 | Affymetrix, Codelink | LocusLink ID | Real Time RT-PCR | Disagreement
De Reyniès et al., 2006 | Affymetrix, GE Healthcare (Amersham), Agilent | Sequence mapped | Real Time RT-PCR | Moderate agreement (1 colour better than 2 colour)
Wang et al., 2006 | Applied Biosystems, Agilent | Sequence matched (BLAST) | Real Time RT-PCR | Agreement (1375 genes confirmed with RT-PCR)
Kuo et al., 2006 (*Lab-lab comparison added) | Affymetrix, Amersham, Mergen, ABI, Custom cDNA, MGH, MWG, Agilent, Compugen, Operon | Probes sequence matched within 1 exon (Unigene, LocusLink, RefSeq, RefSeq exon) | Real Time RT-PCR | Agreement (commercial better than in-house, 1-colour better than 2)
Colour key: green = correlation between platforms; yellow = moderate correlation between platforms; red = poor correlation between platforms
See Yauk et al., Nucleic Acids Research, 2004; Yauk and Berndt, Environ Mol Mutagen 2007
24. Obtaining useful information from a microarray experiment
1. Quality control.
2. Remove probes in background.
3. Adjust (normalize) the measurements to facilitate comparisons.
4. Select genes that are differentially expressed between samples.
5. Identify the biological processes and molecular functions that are altered.
6. Place data in the context of a health outcome.
25. 1. Quality Measures: Garbage in, Garbage out
A. Sample and RNA Quality
B. Array (slide) quality
• Percentage of spots with no signal
• Number of saturated spots
• Intensity Distribution
• Summary Measures of the negative control spots
• Median Signal to Noise Ratio
• Median Brightness
• External Controls
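Several of the array-level measures listed above can be computed directly from per-spot foreground and background intensities. The following is a minimal sketch, not the lab's actual QC pipeline: the function name, the 16-bit saturation value of 65535, and the definition of signal-to-noise as foreground divided by local background are all assumptions made for illustration.

```python
from statistics import median

def array_qc(signal, background, saturation=65535):
    """Summarize simple per-array quality measures from spot intensities.

    Assumptions (not from the slides): a spot has "no signal" when its
    foreground does not exceed its local background; saturation is the
    16-bit scanner maximum; signal-to-noise = foreground / background.
    """
    if not signal or len(signal) != len(background):
        raise ValueError("signal and background must be non-empty and equal length")
    n = len(signal)
    no_signal = sum(1 for s, b in zip(signal, background) if s <= b)
    saturated = sum(1 for s in signal if s >= saturation)
    snr = [s / b for s, b in zip(signal, background) if b > 0]
    return {
        "pct_no_signal": 100.0 * no_signal / n,
        "n_saturated": saturated,
        "median_brightness": median(signal),
        "median_snr": median(snr) if snr else float("nan"),
    }

# Hypothetical five-spot array, for illustration only.
qc = array_qc([120, 65535, 800, 90, 4000], [100, 110, 100, 95, 100])
```

Arrays whose summary values fall far outside the range of the other arrays in the experiment are candidates for exclusion or re-hybridization.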
27. 2. Background noise: false positives
• Estimate the background?
Local
Negative Control Spots
Should we background subtract?
• Limits of Detection, Presence/Absence Calls
Flagging spots in the background
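One common way to make presence/absence calls is to set a limit of detection from the negative control spots and flag anything below it. The sketch below assumes a threshold of mean + 3 standard deviations of the negative controls; the slides do not specify a rule, so the multiplier `k` and the function names are illustrative.

```python
from statistics import mean, stdev

def detection_limit(negative_controls, k=3.0):
    """Limit of detection as mean + k * SD of negative-control intensities.

    k = 3 is an assumed, commonly used multiplier, not a value from the slides.
    """
    return mean(negative_controls) + k * stdev(negative_controls)

def presence_calls(intensities, limit):
    """True = present (above the limit); False = absent (in the background)."""
    return [x > limit for x in intensities]

# Hypothetical intensities, for illustration only.
negative_controls = [100, 110, 90, 105, 95]
limit = detection_limit(negative_controls)
calls = presence_calls([98, 150, 400, 120], limit)
```

Spots called absent in every sample can be removed before statistical testing, which reduces both false positives and the number of comparisons.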
28. 3. Normalization: removing bias and cross-slide comparisons
[Figure: raw relative intensities and normalized relative intensities, plotted by array number]
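The effect of normalization can be illustrated with the simplest scheme, median centring on the log2 scale. This is a sketch under that assumption; the slides do not say which normalization method was used, and methods such as LOWESS or quantile normalization are also common.

```python
from math import log2
from statistics import median

def median_center(arrays):
    """Log2-transform each array and subtract that array's own median.

    After this step every array has a median of zero, so an overall
    brightness difference between slides no longer looks like a
    difference in expression.
    """
    normalized = []
    for intensities in arrays:
        logged = [log2(x) for x in intensities]
        m = median(logged)
        normalized.append([v - m for v in logged])
    return normalized

# Hypothetical data: the second array is uniformly 4x brighter than the first.
raw = [[100, 200, 400], [400, 800, 1600]]
norm = median_center(raw)
```

After centring, both arrays give approximately -1, 0 and 1 for the three probes, so the fourfold brightness bias has been removed while the relative differences between probes are preserved.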
29. 4. Identify genes that are affected by the treatment
• Fold change is not a statistical test
• 50,000 comparisons on one chip – adjust for multiple comparisons
• Levels of filtering to identify changing genes
1. Fold Change
2. T-tests/ANOVA
3. Permutation test
a) MAANOVA
b) Significance Analysis of Microarrays (SAM)
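To see why adjustment matters: with 50,000 comparisons tested at p < 0.05, about 2,500 probes would be called significant by chance alone even if nothing changed. The sketch below shows one standard adjustment, the Benjamini-Hochberg false discovery rate procedure (the same adjustment reported later in the Gene Ontology results). It is an illustration, not the MAANOVA or SAM implementation, and the example p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Expected number of chance "hits" on one chip without any adjustment.
expected_false_positives = 50_000 * 0.05

# Hypothetical raw p-values for four probes.
adj = benjamini_hochberg([0.001, 0.04, 0.03, 0.2])
```

In this example only the first probe remains below 0.05 after adjustment, although three of the four raw p-values were below 0.05.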
30. 5. Identify biological processes/molecular functions/pathways that are altered and link to a potential health outcome
BIOINFORMATICS
Gene Ontology: a controlled vocabulary of terms for describing gene product characteristics and gene product annotation data
Includes: cellular component
biological process
molecular function
Pathway: collection of manually drawn pathway maps representing knowledge on the molecular interaction and reaction networks
Looking for over-representation of changing genes within these groups.
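Over-representation is typically assessed with a hypergeometric (one-sided Fisher) test: given how many genes on the array belong to a category, how surprising is the number of changing genes that fall in it? The sketch below uses only the standard library; the function name and the example counts are hypothetical, and real tools add multiple-testing correction across all categories.

```python
from math import comb

def enrichment_pvalue(n_universe, n_in_category, n_changed, n_overlap):
    """P(X >= n_overlap) when n_changed genes are drawn at random from
    n_universe genes, of which n_in_category belong to the category."""
    upper = min(n_in_category, n_changed)
    total = comb(n_universe, n_changed)
    tail = sum(
        comb(n_in_category, k) * comb(n_universe - n_in_category, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 20 genes on the array, 5 in the category,
# 4 genes changed, 3 of which are in the category.
p = enrichment_pvalue(20, 5, 4, 3)
```

A small p-value indicates that the category contains more changing genes than expected by chance.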
32. Designing an experiment to study mechanism of action
1. Adequate sample size!
2. Appropriate selection of time points (e.g., early, downstream, transformation, disease effects)
3. Appropriate selection of treatment conditions (non-toxic)
4. Appropriate tissue/cells sampled
5. Sample collection – randomization (time effects)
6. HIGH QUALITY RNA!!!
7. Randomization and microarray experimental design
8. Implementation of QA/QC
9. Appropriate normalization and filtering
10. VALIDATION WITH ALTERNATIVE TECHNOLOGIES
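Randomization in points 5 and 7 means that the order in which samples are collected, extracted and hybridized should not line up with treatment group; otherwise time or batch effects are confounded with dose. A minimal sketch, assuming a simple complete randomization with a recorded seed (the sample names and seed are hypothetical):

```python
import random

def randomized_run_order(sample_ids, seed):
    """Return the samples in a random processing order.

    A fixed seed makes the order reproducible so it can be recorded
    in the study documentation.
    """
    rng = random.Random(seed)
    order = list(sample_ids)
    rng.shuffle(order)
    return order

# Hypothetical sample labels: 3 dose groups x 5 replicates.
samples = [f"{group}_rep{r}" for group in ("control", "low", "high") for r in range(1, 6)]
order = randomized_run_order(samples, seed=2024)
```

More structured schemes, such as blocking so that each processing batch contains every dose group, follow the same idea.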
34. Toxicological Profiles of Cigarette Smoke Condensate
Use of high-density DNA microarrays to investigate pathways induced by CSC exposure
Correlation with other toxicity endpoints
5 cigarette brands:
1. Export A full flavour
2. Gauloises Blonde
3. Player’s Light King Size
Carole Yauk and Paul White collaboration
35. Cigarette smoke condensate collection and characterization
Brand | # of cigarettes smoked | Total TPM yield (mg) | TPM/cig (mg)
1 – Export A | 60 | 1625.5 | 27.09
2 – Gauloises Blondes | 108 | 1826.0 | 16.91
3 – Player’s Light King Size | 117 | 1659.0 | 14.18
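The TPM/cig column is the total TPM yield divided by the number of cigarettes smoked. A short check of the tabulated values (the variable names are illustrative):

```python
# (number of cigarettes smoked, total TPM yield in mg), taken from the table
collections = {
    "Export A": (60, 1625.5),
    "Gauloises Blondes": (108, 1826.0),
    "Player's Light King Size": (117, 1659.0),
}

# TPM per cigarette, rounded to two decimals as in the table
tpm_per_cig = {
    brand: round(total_mg / n_cigs, 2)
    for brand, (n_cigs, total_mg) in collections.items()
}
```

The results (27.09, 16.91 and 14.18 mg per cigarette) match the table.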
37. Toxicity/genotoxicity
Phenotypic anchoring and dose selection
Toxicity – Cloning Efficiency in Muta™Mouse Lung Epithelial Cells
Mutagenicity – Mutations in Salmonella typhimurium
Mutagenicity – Mutations in Muta™Mouse Lung Epithelial Cells
Clastogenicity – Micronuclei in Muta™Mouse Lung Epithelial Cells
Essential to select meaningful concentrations for microarray experiments
[Figure: the Muta™Mouse]
38. Toxicity Profiling Via Cloning Efficiency (LD50 Values Determined Using Probit Link Function)
[Figure: bar chart of LD50 (µg/ml media) for Brand 1 (Export A), Brand 2 (Player’s Special), Brand 3 (Gauloises), Brand 4 (Player’s Plain) and Brand 5 (Player’s Light)]
Yauk et al., manuscript in preparation
39. Mutagenicity in the Ames Assay
[Figure: bar chart of mutagenic potency (rev/µg TPM) in Salmonella strains TA98, YG1041 and YG5161 for Export A, Gauloises and Player's King]
Yauk et al., manuscript in preparation
40. Cigarette smoke condensate does not induce DNA sequence mutations in the FE1 cell line
Pilot | Brand | Pre-incubation | S9 | Dose CSC (µg/ml) | Summary
#1 | Gauloises | No | 0 | 0, 20, 40, 60, 80 | High Sp MF, No response
#2 | Gauloises | No | 0.5% | 0, 20, 40, 60, 80 | High Sp MF, No response
#3 | Export A Full Flavor | 60 min | 0.5% | 0, 60 | Good Sp MF, No response
#4 | Export Full Flavor | 15, 30, 60 min | 0.5% | 0, 40, 60, 80, 100 | Good Sp MF, No response
#5 | Players Light | 60 min | 0.5% | 0-150 | No dose response
#6 | Gauloise | 60 min | 0.5, 1, 2, 4% | 100 | No response
#7 | Players Special | No | 0, 1, 2, 4 | 100, 150, 200 | No dose response
#8 | Players Plain | No | No | 20-120 | No response
45. Final Decision for Design of Microarray Study
2 time points (early response/late response)
Early = after 6hr exposure
Late = after 4hr recovery (10hr total)
3 doses (control, low, high)
Low = 45 μg TPM/mL, High = 90 μg TPM/mL
5 replicates/dose (required to obtain statistical significance)
Agilent 22k toxicology arrays
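The design above can be enumerated to see how many samples it implies. The sketch assumes a full factorial layout (every dose at every time point, 5 replicates each) for a single brand; the slides do not state how controls were shared across brands, so the total number of arrays in the study is not derived here.

```python
from itertools import product

time_points = ("6 h exposure", "6 h exposure + 4 h recovery")
doses_ug_tpm_per_ml = (0, 45, 90)   # control, low, high
replicates = range(1, 6)            # 5 replicates per dose

# One tuple per sample: (time point, dose, replicate number).
# Assumption: full factorial layout for one brand.
design = list(product(time_points, doses_ug_tpm_per_ml, replicates))
samples_per_brand = len(design)
```

Under this assumption each brand requires 2 x 3 x 5 = 30 samples.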
46. Microarray Analyses – Main Experiment
Generated 1,365,000 data points
MAANOVA to Identify Significant Changes
Clustering and pathway analyses
Affected genes, biomarkers and
brand-specific signatures
47. Summary of gene expression findings
296 known genes were up- or down-regulated relative to solvent
6 hours: 115 genes (54 down-regulated, 61 up-regulated)
10 hours: 254 genes (172 down-regulated, 82 up-regulated)
Yauk et al., manuscript in preparation
49. Large overlap among the brands
e.g., 10 hours, 90 μg/ml
[Figure: three-way Venn diagram of affected genes for Export A, Player’s Light and Gauloises; region counts as extracted: 8, 93, 2, 87, 49, 3, 2]
Yauk et al., manuscript in preparation
50. Up-regulated in Exposed (Higher in Dose 90) / Down-regulated in Exposed (Lower in Dose 90)
[Figure: gene expression clusters; sample labels: brand (Control, Export A, Gauloises Blonde, Player’s Light King Size), dose (Control, 45 μg/ml, 90 μg/ml), time (6 hours, 10 hours)]
Gene list, first column:
aurora kinase A
cell division cycle 20 homolog
cell division cycle 2 homolog A
cell division cycle associated 5
DEP domain containing 1B
F-box only protein 5
histone 1, H1b
inner centromere protein
karyopherin (importin) alpha 2
polo-like kinase 1
protein regulator of cytokinesis 1
Gene list, second column:
guanine nucleotide binding protein, beta 4
sulfiredoxin 1 homolog
tetraspanin 33
DNA-damage inducible transcript 3
serine peptidase inhibitor, clade E, member 1
glutathione synthetase
zinc finger protein 330
cytochrome P450, family 1, subfamily b, polypeptide 1
Yauk et al., manuscript in preparation
51. 10 hour Gene Ontology Analysis
Term, Benjamini p-value (two columns of terms)
cell division 0.000; p53 signaling 0.015
mitosis 0.000; metabolic processes 0.017
regulation of
cell cycle 0.000; cell death 0.019
cell cycle process 0.000; DNA damage response 0.027
regulation of
cell division 0.000; apoptosis 0.047
mitosis 0.000
Yauk et al., manuscript in preparation
53. Dose Trends between 6 and 10 hrs
[Figure: normalized intensity at 6 hrs and 10 hrs]
77 genes decreasing in expression with dose
66 genes increasing in expression with dose
54. Screening in genetic toxicology
[Figure: timeline marked at 1 year and 2 year]
Endpoint | Assay | Cost | Time
Cancer | 2 year rodent cancer bioassay | $2M/cmpd | 3 years
Mutation | Dominant Lethal Test | $60K/cmpd | 6-12 months
Mutation | In vitro mammalian mutation; In vivo mutation; Salmonella bacteria assays | $60K/cmpd | 3 months
Genomics | Gene expression analysis | $10K/cmpd | 1 month
HIGH CONTENT!
55. Predictive toxicogenomics: medium throughput MOA analysis
Ellinger-Ziegelbauer H, et al., Toxicology Letters 186 (2009) 36-44.
56. Increasing number of papers analyzing gene expression to support observed endpoints:
Focussed quantitative real-time PCR arrays
57. Concluding remarks on gene expression technologies
• Technologies have come a long way over the past decade
• Appropriate experimental design in combination with correct data handling generates reproducible and reliable data
• Improved annotation and bioinformatics tools are leading to a better ability to interpret findings
• Expression technologies are highly useful for:
• The identification of mechanisms of action
• Biomarker discovery
• Exploring potentially novel health effects
• Chemical categorization
58. Needs for application to identify MOA
• Identification and validation of adverse outcome genes/pathways (differentiating adaptive versus adverse effects)
• Increasing the database of chemicals analyzed
• Identification of low-dose effects
• Improved bioinformatics tools for data interpretation
• Guidelines for use of expression data in regulatory assessments