2. Outline
DNA methylation
The fundamentals of bisulfite sequencing technology
Current status of published BS-Seq resources
Information that can be presented in a BS-Seq study
Published tools for analyzing BS-Seq data
A comprehensive BS-Seq analysis tool: MethPipe
5. DNA Methylation Pathway
Moore, L.D., Le, T. & Fan, G. DNA methylation and its basic function. Neuropsychopharmacology 38, 23-38 (2013).
Singal, R. & Ginder, G.D. DNA Methylation. Blood Journal 93, 4059-4070 (1999).
6. DNA Demethylation Pathway
Moore, L.D., Le, T. & Fan, G. DNA methylation and its basic function. Neuropsychopharmacology 38, 23-38 (2013).
• Tet: Ten-eleven translocation enzymes
• AID/APOBEC: activation-induced cytidine deaminase/apolipoprotein B mRNA-editing enzyme complex
• TDG: Thymine DNA glycosylase
• SMUG1: Single-strand-selective monofunctional uracil-DNA glycosylase 1
• 5mC: 5-methylcytosine
• 5hmC: 5-hydroxymethylcytosine
• 5hmU: 5-hydroxymethyluracil
• 5fC: 5-formylcytosine
• 5caC: 5-carboxylcytosine
7. Timeline of DNA Methylation Analysis
Harrison, A. & Parle-McDermott, A. DNA methylation: a timeline of methods and applications. Front Genet 2, 74 (2011).
[Figure labels: MS-HRM, MeDIP-Seq, BS-Seq, MethylC-Seq, TAB-Seq]
9. Steps for Determining the Methylation Status of Cytosine in a Known DNA Sequence by the Bisulfite Conversion Method
Singal, R. & Ginder, G.D. DNA Methylation. Blood Journal 93, 4059-4070 (1999).
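The principle behind this slide can be sketched in a few lines of Python. This is a minimal illustration, not a method from the cited paper: bisulfite converts unmethylated C to U (read as T after PCR), while methylated C is protected and still read as C. The function names and the toy sequence are assumptions made for the example.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus PCR on one DNA strand.

    Unmethylated C is deaminated to U and read as T after PCR;
    methylated C is protected and is still read as C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted):
    """Compare a converted sequence with the known reference, base by base."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, converted)):
        if ref_base != "C":
            continue  # only reference cytosines are informative
        if read_base == "C":
            calls[i] = "methylated"
        elif read_base == "T":
            calls[i] = "unmethylated"
    return calls


# Hypothetical known sequence with a single methylated C at index 1.
reference = "ACGTCCGATC"
converted = bisulfite_convert(reference, methylated_positions=[1])
print(converted)
print(call_methylation(reference, converted))
```

The comparison only works because the original sequence is known, which is why the method is described for a known DNA sequence.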
10. Techniques for Enrichment of Methylated or Target Regions Prior to BS Sequencing
Lister, R. & Ecker, J.R. Finding the fifth base: genome-wide sequencing of cytosine methylation. Genome Res 19, 959-66 (2009).
Harrison, A. & Parle-McDermott, A. DNA methylation: a timeline of methods and applications. Front Genet 2, 74 (2011).
[Figure labels: Genomic DNA, Deep Sequencing]
11. Techniques for Genome-Wide Sequencing of Cytosine Methylation Sites
Lister, R. & Ecker, J.R. Finding the fifth base: genome-wide sequencing of cytosine methylation. Genome Res 19, 959-66 (2009).
[Figure labels: Genomic DNA, Deep Sequencing]
TAB-Seq: Tet-assisted BS-Seq
Yu, M. et al. Tet-assisted bisulfite sequencing of 5-hydroxymethylcytosine. Nat Protoc 7, 2159-70 (2012).
Yu, M. et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell 149, 1368-80 (2012).
12. Genomic Coverage of MeDIP-seq, MethylCap-seq, RRBS and Infinium
Bock, C. et al. Quantitative comparison of genome-wide DNA methylation mapping technologies. Nat Biotechnol 28, 1106-14 (2010).
MeDIP-seq and MethylCap-seq provide broad coverage of the genome, whereas RRBS and Infinium are more restricted to CpG islands and promoter regions.
13. Key Metrics of the Technology Comparison
Beck, S. Taking the measure of the methylome. Nat Biotechnol 28, 1026-8 (2010).
14. Sequencing Coverage of NGS Platforms
Sims, D., Sudbery, I., Ilott, N.E., Heger, A. & Ponting, C.P. Sequencing depth and coverage: key considerations in genomic analyses. Nat Rev Genet 15, 121-32 (2014).
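As a rough aid for reading coverage figures, the usual mean-depth estimate (number of reads times read length, divided by genome size) can be sketched as follows. The function names and the example numbers are illustrative assumptions, not values from the slide or the cited review.

```python
import math


def mean_coverage(num_reads, read_length, genome_size):
    """Expected mean depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: 3 Gb genome, 150 sequenced bases per read.
print(mean_coverage(400_000_000, 150, 3_000_000_000))
print(reads_needed(30, 150, 3_000_000_000))
```

This estimate assumes every read maps uniquely and uniformly, which bisulfite-converted reads generally do not, so real usable depth is lower.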
15. Whole Genome Bisulfite Sequencing Library Construction
Library preparation using PE sample prep kit
• Purified gDNA, 5 μg
• Fragmentation
• End repair
• 3’ end adenylation
• Methylated adapter ligation
• Purify ligation product
• Fragment size selection, 200-400 bp, giving 3 libraries: 200-250 bp, 250-300 bp, 300-350 bp
• Bisulfite conversion with Zymo EZ DNA Methylation Kit (or Qiagen EpiTect Kit): methylated C stays C, unmethylated C becomes U
• Purify
• PCR, 4 to 8 cycles, with PfuTurbo Cx Hotstart DNA polymerase; 3 separate tubes for each library
• Purify
• Validate library
VYM Genome Research Center, National Yang-Ming University
16. Whole Genome Bisulfite Sequencing Library Construction
Library preparation using PE sample prep kit
• Purify 3-5 μg of genomic DNA
• DNA fragmentation
• End repair
• 3’ end adenylation
• C-methylated adapter ligation
• Purify ligation product
• Recover 200-400 bp fragments, giving 3 libraries: 200-250 bp, 250-300 bp, 300-350 bp
• Bisulfite conversion with Zymo EZ DNA Methylation Kit (or Qiagen EpiTect Kit): methylated C stays C, unmethylated C becomes U
• Purify
• PCR, 4 to 8 cycles, with PfuTurbo Cx Hotstart DNA polymerase; 3 separate tubes for each library
• Purify
• Validate library
• Sequencing
17. IVC (Intensity versus Cycle) Plot of Bisulfite Sequencing
[Figure: % Base and % Intensity per cycle for Read 1 and Read 2. Panels: PhiX control, library size 250 bp, library size 350 bp, library size 430 bp. GC content labels: 45%, 29%, 40%, 22%. Annotation: reading into adapter.]
18. IVC (Intensity versus Cycle) Plot of Bisulfite Sequencing
[Figure: % Base and % Intensity per cycle for Read 1 and Read 2. Panels: PhiX control, library size 250 bp, library size 350 bp, library size 430 bp. GC content labels: 45%, 29%, 40%, 22%. Annotation: reading into adapter.]
19. Fragment Size Effects
[Figure panels: PhiX control, library size 300 bp, library size 400 bp, library size 500 bp. Read length 2x75.]
• Reading into adapter
• Genomic coverage will be uneven
• Amplification bias, bisulfite conversion bias, sequencing bias
• DNA fragment size < 250 bp, library size < 350 bp (insert + 121 bp)
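The relationship between library size, insert size and read length can be made concrete with a small sketch. The 121 bp adapter total and the 2x75 read length come from the slide; the function names are assumptions for illustration, and the model ignores the spread of fragment sizes within a real library.

```python
ADAPTER_TOTAL_BP = 121  # from the slide: library size = insert + 121 bp


def insert_size(library_size, adapter_total=ADAPTER_TOTAL_BP):
    """Insert (genomic fragment) length implied by a measured library size."""
    return library_size - adapter_total


def adapter_bases_read(insert_bp, read_length):
    """Cycles of a single read that run past the insert into the adapter."""
    return max(0, read_length - insert_bp)


def paired_read_overlap(insert_bp, read_length):
    """Insert bases covered by both mates, which add no new genomic coverage."""
    return max(0, min(insert_bp, 2 * read_length - insert_bp))


for library in (181, 221, 350):
    ins = insert_size(library)
    print(library, ins, adapter_bases_read(ins, 75), paired_read_overlap(ins, 75))
```

Short inserts waste cycles twice over: part of each read is adapter sequence, and the two mates of a pair cover the same bases.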
26. Information That Can Be Presented in a BS-Seq Study
• Sequencing depth
• Coverage of genome length and CpG sites
• Bisulfite conversion rates (lambda phage DNA)
• CHG and CHH sites (H = not G, i.e. A, C, or T)
• Statistics of methylation ratios of CpG, CHG, CHH
• Methylation ratios of gene structures
• Association with regulatory elements
• Differential methylation region (DMR)
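Several of the quantities listed on this slide reduce to simple counts. The sketch below shows one way to classify a cytosine as CpG, CHG or CHH, to compute a per-site methylation ratio, and to estimate the bisulfite conversion rate from unmethylated control DNA such as lambda. All function names and counts are hypothetical; real pipelines also handle the reverse strand, ambiguous bases and sequencing errors.

```python
def cytosine_context(seq, i):
    """Classify the C at index i of a forward-strand sequence (A/C/G/T only).

    Returns "CpG", "CHG" or "CHH", or None when the sequence ends
    before the context can be determined.
    """
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    following = seq[i + 1:i + 3]
    if following[:1] == "G":
        return "CpG"
    if len(following) < 2:
        return None
    return "CHG" if following[1] == "G" else "CHH"


def methylation_ratio(c_reads, t_reads):
    """Fraction of reads at a cytosine that still show C after conversion."""
    total = c_reads + t_reads
    return c_reads / total if total else None


def conversion_rate(converted_c, unconverted_c):
    """Conversion rate from control DNA in which every C is unmethylated."""
    total = converted_c + unconverted_c
    if total == 0:
        raise ValueError("no cytosines observed in the control")
    return converted_c / total


print(cytosine_context("ACGT", 1), cytosine_context("ACAG", 1), cytosine_context("ACAT", 1))
print(methylation_ratio(c_reads=8, t_reads=2))
print(conversion_rate(converted_c=995, unconverted_c=5))
```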
27. DNA Methylome Studies
Lister, R. et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315-22 (2009).
Cokus, S.J. et al. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature 452, 215-9 (2008).
Methylome only Methylome/Transcriptome
28. Contrast Studies
Hon, G.C. et al. Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat Genet 45, 1198-206 (2013).
Lister, R. et al. Global epigenomic reconfiguration during mammalian brain development. Science 341, 1237905 (2013).
17 Tissues
Human/Mouse Brain Development
29. Association with Regulatory Elements
Lister, R. et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315-22 (2009).
Hon, G.C. et al. Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat Genet 45, 1198-206 (2013).
30. Differential methylation region (DMR)
Hon, G.C. et al. Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat Genet 45, 1198-206 (2013).
32. Effect and Problems of Bisulfite Treatment of DNA
Krueger, F., Kreck, B., Franke, A. & Andrews, S.R. DNA methylome analysis using short bisulfite sequencing data. Nat Methods 9, 145-51 (2012).
Xi, Y. & Li, W. BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics 10, 232 (2009).
Mapping bisulfite reads to the 4 possible bisulfite strands (BSW/BSWR/BSC/BSCR) is equivalent to mapping the bisulfite read and its reverse complement to both the Watson and Crick strands of the original reference sequence.
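The four strands named above can be generated from a reference with a short sketch. It assumes a fully unmethylated toy sequence and takes BSWR and BSCR to be the reverse complements of the converted Watson (BSW) and Crick (BSC) strands; the function names are illustrative and not taken from BSMAP.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def bisulfite_strands(watson):
    """Four strands present after bisulfite conversion and PCR,
    assuming every C in the input is unmethylated."""
    crick = reverse_complement(watson)
    bsw = watson.replace("C", "T")  # converted Watson strand
    bsc = crick.replace("C", "T")   # converted Crick strand
    return {
        "BSW": bsw,
        "BSWR": reverse_complement(bsw),
        "BSC": bsc,
        "BSCR": reverse_complement(bsc),
    }


watson = "ACGTCG"
for name, seq in bisulfite_strands(watson).items():
    print(name, seq)
```

In this example BSCR reads like the Watson strand with G replaced by A, and BSWR like the Crick strand with G replaced by A, which is why aligning a read and its reverse complement against both reference strands covers all four cases.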
33. How to Align BS Reads Against a Reference Genome?
Krueger, F. & Andrews, S.R. Bismark: A flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics (2011).
Bock, C. Analysing and interpreting DNA methylation data. Nat Rev Genet 13, 705-19 (2012).
Y = C or T
[Figure labels: TCGA, TCGT, ACGT, ATGA (multiple hits); TTGT, ATGT (multiple hits)]
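The two alignment strategies that the tool descriptions in this deck call "wild-card" and "three-letter" can be contrasted with a toy exact-match search. This is a sketch under simplifying assumptions (forward strand only, no mismatches, hypothetical function names and genome), not the algorithm of any listed aligner.

```python
def three_letter(seq):
    """Collapse C into T so reads and reference share a three-letter alphabet."""
    return seq.replace("C", "T")


def three_letter_match(read, window):
    """Three-letter strategy: compare after converting both sequences."""
    return three_letter(read) == three_letter(window)


def wildcard_match(read, window):
    """Wild-card strategy: a reference C behaves like Y and accepts C or T."""
    return len(read) == len(window) and all(
        r == g or (g == "C" and r == "T") for r, g in zip(read, window)
    )


def find_hits(read, genome, matcher):
    """Start positions of all genome windows accepted by the matcher."""
    n = len(read)
    return [i for i in range(len(genome) - n + 1) if matcher(read, genome[i:i + n])]


genome = "AATCGTAATTGTAA"  # contains TCGT at index 2 and TTGT at index 8
for read in ("TTGT", "TCGT"):
    print(read, find_hits(read, genome, wildcard_match), find_hits(read, genome, three_letter_match))
```

A read that keeps its C (methylated) is placed uniquely by the wild-card matcher but hits both loci after three-letter conversion, while a fully converted read is ambiguous under both strategies.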
34. Recommended Workflow for the Primary Analysis of BS-Seq Data
Krueger, F., Kreck, B., Franke, A. & Andrews, S.R. DNA methylome analysis using short bisulfite sequencing data. Nat Methods 9, 145-51 (2012).
http://omictools.com/bisulfite-seq/
35. Published Tools
Bock, C. et al. Quantitative comparison of genome-wide DNA methylation mapping technologies. Nat Biotechnol 28, 1106-14 (2010).
Krueger, F., Kreck, B., Franke, A. & Andrews, S.R. DNA methylome analysis using short bisulfite sequencing data. Nat Methods 9, 145-51 (2012).
http://omictools.com/bisulfite-seq/
• B-SOLANA: Bisulphite aligner for processing bisulphite-sequencing color space data (http://code.google.com/p/bsolana)
• BatMeth: Base and color space data (http://code.google.com/p/batmeth)
• Bicycle: Lister et al. 2009 workflow (http://sing.ei.uvigo.es/bicycle/howitworks.html)
• BiQ Analyzer HT: Locus-specific analysis and visualization of high-throughput bisulfite sequencing data (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de)
• BiSeq: DMR for RRBS data (R/Bioconductor package BiSeq)
• BISMA: Support analysis of repetitive sequences (http://biochem.jacobs-university.de/BDPC/BISMA)
• Bismark: Probably the most widely used three-letter bisulphite aligner; supports both Bowtie (fast, gap-free alignment) and Bowtie 2.0 (sensitive, gapped alignment) (http://www.bioinformatics.babraham.ac.uk/projects/bismark)
• Bis-SNP: Variant caller for inferring DNA methylation levels and genomic variants from BS-Seq reads that have been aligned by other tools (http://epigenome.usc.edu/publicationdata/bissnp2011)
• Bisulfighter: Using Last for mapping, HMM for DMR detection (http://epigenome.cbrc.jp/bisulfighter)
• BRAT: Highly configurable and well-documented three-letter BS-Seq aligner (http://compbio.cs.ucr.edu/brat)
• BS-Seeker, BS-Seeker 2: Three-letter BS-Seq aligner based on Bowtie (http://pellegrini.mcdb.ucla.edu/BS_Seeker/BS_Seeker.html)
• BSMAP: Probably the most widely used wild-card BS-Seq aligner (http://code.google.com/p/bsmap)
• BSmooth: Mapping, quality control and DMR analysis pipeline (http://rafalab.jhsph.edu/bsmooth)
• COHCAP: Integration with gene expression data (https://sourceforge.net/projects/cohcap/)
• CpG_MPs: Methylation patterns of genomic regions (http://202.97.205.78/CpG_MPs/)
• DMAP: DMR for BS-Seq and RRBS data (http://biochem.otago.ac.nz/research/databases-software/)
• DSS: Bayesian hierarchical model to detect differentially methylated loci (DML) (R/Bioconductor package DSS)
• Epidiff: DMR detection (http://bioinfo.hrbmu.edu.cn/epidiff)
36. Published Tools (cont.)
• GSNAP: Wild-card BS-Seq aligner included in a widely used general-purpose alignment tool (http://share.gene.com/gmap)
• GBSA: Analysis pipeline for gene-centric or gene-independent focus (http://ctrad-csi.nus.edu.sg/gbsa)
• FadE: Mapping for base and color space (http://code.google.com/p/fade)
• Kismeth: Designed to be used with plants (http://katahdin.mssm.edu/kismeth)
• Last: Recent and well-validated wild-card BS aligner included in a general-purpose alignment tool (http://last.cbrc.jp)
• MethPipe: Mapping, BS conversion rate, HMR, DMR pipeline (http://smithlabresearch.org/software/methpipe)
• Methyl-MAPS, Methyl-Analyzer: Base and color space data + post analysis (http://epigenomicspub.columbia.edu/methylanalyzer_data.html)
• MethylCoder: Three-letter BS-Seq aligner that can be used with either Bowtie (high speed) or GSNAP (high sensitivity) (https://github.com/brentp/methylcode)
• MethylExtract: Detects variation (http://bioinfo2.ugr.es/MethylExtract)
• MethylSig: R package pipeline for BS-Seq and RRBS (http://sartorlab.ccmb.med.umich.edu/software)
• MOABS: DMR detection (http://code.google.com/p/moabs)
• Pash: Wild-card BS aligner included in a general-purpose alignment tool (http://brl.bcm.tmc.edu/pash)
• RMAP, RMAPBS: Wild-card BS aligner included in a general-purpose alignment tool (http://www.cmb.usc.edu/people/andrewds/rmap, http://smithlabresearch.org/software/methpipe)
• RRBSMAP: Variant of BSMAP that is specialized for reduced-representation bisulphite sequencing (RRBS) data (http://rrbsmap.computational-epigenetics.org)
• SAAP-RRBS: RRBS mapping (http://ndc.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm)
• segemehl: Wild-card bisulphite aligner included in a general-purpose alignment tool (http://www.bioinf.uni-leipzig.de/Software/segemehl)
• SOCS-B: Rabin-Karp hashing, color space data (http://solidsoftwaretools.com/gf/project/socs)
Bock, C. et al. Quantitative comparison of genome-wide DNA methylation mapping technologies. Nat Biotechnol 28, 1106-14 (2010).
Krueger, F., Kreck, B., Franke, A. & Andrews, S.R. DNA methylome analysis using short bisulfite sequencing data. Nat Methods 9, 145-51 (2012).
http://omictools.com/bisulfite-seq/
37. How to Select a BS-Seq Analysis Tool?
• Actively updated, with good support from authors or communities: BS-Seeker 2, Bismark
• Post-analysis tools: MethPipe
Kunde-Ramamoorthy, G. et al. Comparison and quantitative verification of mapping algorithms for whole-genome bisulfite sequencing. Nucleic Acids Res 42, e43 (2014).
• Bismark (balanced speed and genome coverage)
• BSMAP (low genome coverage)
• Pash (high genome coverage, slow)
38. MethPipe
• Quality Trimming: fastq_masker
• Mapping: rmapbs, rmapbs-pe
• Remove Duplicate Reads: duplicate-remover
• Bisulfite Conversion Rate: bsrate
• Methylation Calling: methcounts
• Hypo/Hyper-Methylation Regions: hmr, hmr_plant, pmr
• Large Hypo/Hyper-Methylation Domains: pmd
• Differential Methylation Region: dmr
• Allele-specific Methylated Regions: amrfinder, allelicmeth
• Cross-species Comparison of Methylomes: liftOver
• Calculating Methylation Ratio for Genomic Regions: bigWigAverageOverBed, roimethstat, bwtool
• Generate Methylation BED file: Bedtools, bedGraphToBigWig
fastx toolkit: http://hannonlab.cshl.edu/fastx_toolkit/
MethPipe: http://smithlabresearch.org/software/methpipe/
Bedtools: https://github.com/arq5x/bedtools2
Programs from UCSC Genome Browser: http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64
bwtool: https://github.com/CRG-Barcelona/bwtool/wiki
40. Analysis of High-throughput DNA Methylation Profiling
DNA methylation
The fundamentals of bisulfite sequencing technology
Current status of published BS-Seq resources
Information that can be presented in a BS-Seq study
Published tools for analyzing BS-Seq data
A comprehensive BS-Seq analysis tool: MethPipe
Questions?