Next-Generation Sequencing
Sequencing
• The process of determining the primary structure of an unbranched
biopolymer.
• Usually used to refer to DNA/RNA sequencing.
DNA Sequencing
• Is the process of determining the sequence of nucleotides (As, Ts, Cs,
and Gs) in a piece of DNA.
• Methods:
 First generation methods
 Sequencing by Chemical modification.
 Sequencing by Chain termination.
RNA Sequencing
• Need for RNA-seq: Sequencing DNA gives a genetic profile of an organism,
whereas sequencing RNA reflects only the sequences that are actively expressed in the
cells.
• Limitation for direct sequencing: RNA is less stable in the cell, and also more
prone to nuclease attack experimentally.
• Method:
Extraction of RNA
Reverse transcription to form cDNA
Sequencing of cDNA
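The reverse transcription step can be illustrated with a minimal sketch: the first-strand cDNA is the reverse complement of the RNA template, written with T in place of U. The function name `first_strand_cdna` is a hypothetical helper for illustration, not part of any library.

```python
# Illustrative sketch only: first-strand cDNA is complementary to the RNA
# template, so reading it 5'->3' gives the reverse complement (T instead of U).
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def first_strand_cdna(rna: str) -> str:
    """Return the first-strand cDNA (5'->3') for an RNA written 5'->3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))

print(first_strand_cdna("AUGGCC"))
```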
Sequencing by Chemical modification
• Developed by Allan Maxam and Walter Gilbert in 1976–1977.
• Also known as Maxam and Gilbert Method.
• Based on nucleobase-specific partial chemical modification of DNA and
subsequent cleavage of the DNA backbone at sites adjacent to the modified
nucleotides.
Base Specific Modification
G: Methylation of N7 with dimethylsulphate at pH 8.0
renders the C8–N9 bond specifically susceptible to
cleavage by base
A+G: Piperidine formate (pH 2) weakens the
glycosidic bonds of A and G residues by protonating
N atoms in the purine rings, resulting in depurination
C+T: Hydrazine opens pyrimidine rings, which
recyclize in a five-membered form that is susceptible
to removal
C: In the presence of 1.5 M NaCl, only cytosine
reacts appreciably with hydrazine
Sequencing by Chemical modification
STEPS (in order)
Purification of the DNA
Radioactive labeling at 5′ end (typically by a kinase reaction using
gamma-32P ATP)
Chemical treatment to modify a small proportion of
one or two of the four nucleotide bases in each of four reactions
Modified DNAs are then cleaved by hot piperidine, (CH2)5NH,
at the position of the modified base.
Fragments in the four reactions are electrophoresed side by side in
denaturing acrylamide gels for size separation
Note: In all cases reactions are carried out under carefully controlled conditions to
ensure that on average only one of the target bases in each DNA molecule is modified
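How the four lanes (G, A+G, C+T, C) translate into a sequence can be sketched as below. This is an idealized illustration, assuming every fragment length gives a clean band pattern; the function names and the input layout (fragment length mapped to the set of lanes showing a band) are assumptions made for the example.

```python
def call_base(lanes):
    """Infer the base from the set of lanes that show a band at one length."""
    if "G" in lanes and "A+G" in lanes:
        return "G"          # band in both purine lanes
    if "A+G" in lanes:
        return "A"          # band only in the A+G lane
    if "C" in lanes and "C+T" in lanes:
        return "C"          # band in both pyrimidine lanes
    if "C+T" in lanes:
        return "T"          # band only in the C+T lane
    raise ValueError(f"uninterpretable band pattern: {sorted(lanes)}")

def read_ladder(bands_by_length):
    """Read the sequence from shortest to longest labelled fragment."""
    return "".join(call_base(bands_by_length[n]) for n in sorted(bands_by_length))

bands = {1: {"G", "A+G"}, 2: {"A+G"}, 3: {"C+T"}, 4: {"C", "C+T"}}
print(read_ladder(bands))
```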
Sequencing by Chain termination
• Also known as Sanger sequencing.
• Developed by Frederick Sanger and colleagues in 1977.
• Based on chain-terminating dideoxynucleotide triphosphates (ddNTPs).
Sequencing by Chain termination
Polymerase Chain Reaction (PCR) to amplify the template
DNA sample is divided into four separate sequencing
reactions, containing all four of the standard deoxynucleotides
(dATP, dGTP, dCTP and dTTP) and the DNA polymerase.
To each reaction is added only one of the four
dideoxynucleotides (ddATP, ddGTP, ddCTP, or ddTTP), which
terminates chain elongation wherever it is incorporated
DNA fragments are heat denatured by the snap-chill method
Separated by size using gel electrophoresis
The DNA bands may then be visualized by autoradiography
or UV light and the DNA sequence can be directly read off
the X-ray film or gel image.
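Reading the sequence off the gel can be sketched as follows. This is a simplified model, assuming each ddNTP lane yields one fragment per occurrence of that base and that fragment lengths are counted from the first synthesized base (primer length ignored); `sanger_lanes` and `read_gel` are hypothetical names.

```python
def sanger_lanes(synthesized):
    """For each ddNTP lane, list the lengths of the chain-terminated fragments."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Order all bands by fragment length; the lane of each band gives the base."""
    base_at_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(base_at_length[n] for n in sorted(base_at_length))

lanes = sanger_lanes("GATTACA")
print(lanes["A"])       # fragment lengths seen in the ddATP lane
print(read_gel(lanes))  # sequence of the synthesized strand
```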
Next-Generation DNA Sequencing
• Also known as high-throughput sequencing.
• NGS is the general term used to describe a number of different modern sequencing technologies
including:
 Illumina (Solexa) sequencing
 Roche 454 sequencing
 Ion torrent: Proton / PGM sequencing
 SOLiD sequencing
... and a few others
Advantages:
• Sequencing is quicker and cheaper.
Disadvantages:
• The associated error rates (~0.1–15%) are higher.
• The read lengths are generally shorter (35–700 bp for
short-read approaches).
NGS approaches
• Short-read sequencing
   Sequencing by ligation (SBL)
   Sequencing by synthesis (SBS)
      Cyclic Reversible Termination (CRT)
      Single-Nucleotide Addition (SNA)
• Long-read sequencing
   Real-time long-read sequencing
   Synthetic approaches
Sequencing by ligation (SBL)
• Involves the hybridization and ligation of labelled probe and anchor sequences to a DNA
strand.
• The probes encode one or two known bases and a series of degenerate or universal bases,
driving complementary binding between the probe and template.
• The anchor fragment encodes a known sequence that is complementary to an adapter
sequence and provides a site to initiate ligation.
• After ligation, the template is imaged and the known base or bases in the probe are
identified.
• A new cycle begins after complete removal of the anchor–probe complex or through
cleavage to remove the fluorophore and to regenerate the ligation site.
• Platforms: SOLiD and Complete Genomics
SBL: SOLiD
 Sequencing by Oligonucleotide Ligation and
Detection utilizes two-base-encoded probes, in which
each fluorometric signal represents a dinucleotide.
 The raw output is not directly associated with the
incorporation of a known nucleotide. Because the 16
possible dinucleotide combinations cannot be
individually associated with spectrally resolvable
fluorophores, four fluorescent signals are used, each
representing a subset of four dinucleotide
combinations.
 Each ligation signal represents one of several possible
dinucleotides, leading to the term colour-space (rather
than base-space), which must be deconvoluted
during data analysis.
 The SOLiD sequencing procedure is composed of a
series of probe–anchor binding, ligation, imaging and
cleavage cycles to elongate the complementary strand.
 Over the course of the cycles, single-nucleotide
offsets are introduced to ensure every base in the
template strand is sequenced.
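The colour-space idea can be sketched with the commonly described two-base code, in which each colour is the XOR of 2-bit base codes (A=0, C=1, G=2, T=3), so each of the four colours stands for four dinucleotides. The numbering of colours 0 to 3 and the function names are assumptions for illustration; decoding also assumes the first base is known.

```python
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"

def encode_colour_space(sequence):
    """One colour per adjacent base pair (dinucleotide)."""
    return [BASE_TO_CODE[a] ^ BASE_TO_CODE[b] for a, b in zip(sequence, sequence[1:])]

def decode_colour_space(first_base, colours):
    """Deconvolute colours back to bases, starting from a known first base."""
    bases = [first_base]
    for colour in colours:
        bases.append(CODE_TO_BASE[BASE_TO_CODE[bases[-1]] ^ colour])
    return "".join(bases)

colours = encode_colour_space("ATGGC")
print(colours)
print(decode_colour_space("A", colours))
```

Because every base is decoded from the previous one, a single wrong colour call changes all downstream bases, which is one reason the deconvolution is done during data analysis rather than read off directly.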
SBL: Complete Genomics
Complete Genomics performs DNA sequencing using combinatorial
probe–anchor ligation (cPAL) or combinatorial
probe–anchor synthesis (cPAS).
In cPAL, an anchor sequence (complementary to one
of the four adaptor sequences) and a probe hybridize
to a DNA nanoball at several locations.
In each cycle, the hybridizing probe is a member of a
pool of one-base-encoded probes, in which each
probe contains a known base in a constant position
and a corresponding fluorophore.
After imaging, the entire probe–anchor complex is
removed and a new probe–anchor combination is
hybridized.
Each subsequent cycle utilizes a probe set with the
known base in the n + 1 position.
Further cycles in the process also use adaptors of
variable lengths and chemistries, allowing
sequencing to occur upstream and downstream of the
adaptor sequence.
The cPAS approach is a modification of cPAL
intended to increase read lengths of Complete
Genomics’ chemistry.
Sequencing by synthesis (SBS)
• SBS is a term used to describe numerous DNA-polymerase-dependent
methods.
• SBS approaches can be classified as:
 Cyclic reversible termination (CRT)
 Single-nucleotide addition (SNA)
 Platforms: Illumina, Qiagen, 454, Ion Torrent
SBS: CRT
 CRT approaches are defined by their use of terminator molecules that are similar to those used in Sanger
sequencing, in which the ribose 3ʹ‐OH group is blocked, thus preventing elongation.
 To begin the process, a DNA template is primed by a sequence that is complementary to an adapter region,
which will initiate polymerase binding to this double-stranded DNA (dsDNA) region.
 During each cycle, a mixture of all four individually labelled and 3ʹ‐blocked deoxynucleotides (dNTPs) is
added. After the incorporation of a single dNTP to each elongating complementary strand, unbound dNTPs are
removed and the surface is imaged to identify which dNTP was incorporated at each cluster.
 The fluorophore and blocking group can then be removed and a new cycle can begin.
 Platform: Illumina, Qiagen
SBS: CRT [Illumina]
• Accounts for the largest market share for sequencing instruments.
• Illumina’s suite of instruments for short-read sequencing ranges from small, low-throughput benchtop units to
large ultra-high throughput instruments dedicated to population-level whole-genome sequencing (WGS).
• dNTP identification is achieved through total internal reflection fluorescence (TIRF) microscopy using either two
or four laser channels.
• In most Illumina platforms, each dNTP is bound to a single fluorophore that is specific to that base type and
requires four different imaging channels, whereas the NextSeq and MiniSeq systems use a two-fluorophore
system.
TIRF microscopy (TIRFM) is a technique with which a thin region of a
specimen, usually less than 200 nanometers deep, can be observed.
A video of Illumina Sequencing by Synthesis is
available at https://youtu.be/fCd6B5HRaZ8
SBS: CRT [Qiagen]
• GeneReader is intended to be an all‐in‐one NGS platform, from sample preparation to analysis.
• To accomplish this, the GeneReader system is bundled with the QIAcube sample preparation system and the
Qiagen Clinical Insight platform for variant analysis.
• The GeneReader uses virtually the same approach as that used by Illumina; however, it does not aim to ensure
that each template incorporates a fluorophore labelled dNTP.
• Rather, GeneReader aims to ensure that just enough labelled dNTPs are incorporated to achieve identification.
SBS: SNA
 SNA approaches rely on a single signal to mark the incorporation of a dNTP into an elongating strand.
 Each of the four nucleotides must be added iteratively to a sequencing reaction to ensure only one dNTP is
responsible for the signal.
 This does not require the dNTPs to be blocked, as the absence of the next nucleotide in the sequencing reaction
prevents elongation.
 The exception to this is homopolymer regions where identical dNTPs are added, with sequence identification
relying on a proportional increase in the signal as multiple dNTPs are incorporated.
 Platforms: 454, Ion Torrent
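The iterative nucleotide flows and the homopolymer behaviour can be sketched as below. This is an idealized model, assuming a noise-free signal exactly equal to the number of bases incorporated per flow; the flow order "TACG" and the function names are illustrative assumptions.

```python
def flowgram(synthesized, flow_order="TACG"):
    """Simulate one signal per nucleotide flow for the strand being synthesized."""
    if set(synthesized) - set(flow_order):
        raise ValueError("sequence contains a base that is never flowed")
    signals = []
    position = 0
    flow_index = 0
    while position < len(synthesized):
        base = flow_order[flow_index % len(flow_order)]
        run_length = 0
        # A homopolymer run is incorporated within a single flow.
        while position < len(synthesized) and synthesized[position] == base:
            run_length += 1
            position += 1
        signals.append((base, run_length))
        flow_index += 1
    return signals

def decode_flowgram(signals):
    """Rebuild the sequence: each signal gives the base and its run length."""
    return "".join(base * run_length for base, run_length in signals)

signals = flowgram("TTAGG")
print(signals)
print(decode_flowgram(signals))
```

In a real instrument the signal is only approximately proportional to the run length, so long homopolymers are the main source of insertion and deletion errors.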
SBS: SNA [454]
• The first NGS instrument developed was the 454 pyrosequencing device.
• This SNA system distributes template-bound beads into a PicoTiterPlate along with beads containing an enzyme
cocktail. As a dNTP is incorporated into a strand, an enzymatic cascade occurs, resulting in a bioluminescence
signal. Each burst of light, detected by a charge-coupled device (CCD) camera, can be attributed to the
incorporation of one or more identical dNTPs at a particular bead.
SBS: SNA [Ion Torrent]
• The Ion Torrent was the first NGS platform without optical sensing.
• Does not depend on an enzymatic cascade to generate a signal.
• It detects the H+ ions that are released as each dNTP is incorporated.
• The resulting change in pH is detected by an integrated complementary metal-oxide-semiconductor (CMOS) and
an ion-sensitive field-effect transistor (ISFET).
• The pH change detected by the sensor is imperfectly proportional to the number of nucleotides detected, allowing
for limited accuracy in measuring homopolymer lengths.
Long-read sequencing
• Genomes are highly complex with many long repetitive elements, copy number alterations and
structural variations that are relevant to evolution, adaptation and disease.
• Many of these complex elements are so long that short-read paired-end technologies are
insufficient to resolve them.
• Long-read sequencing delivers reads in excess of several kilobases, allowing for the resolution of
these large structural features.
• Long reads can span complex or repetitive regions with a single continuous read, thus eliminating
ambiguity in the positions or size of genomic elements.
• Long reads can also be useful for transcriptomic research, as they are capable of spanning entire
mRNA transcripts, allowing researchers to identify the precise connectivity of exons and discern
gene isoforms.
• Two main technologies:
 Real-time long-read sequencing
 Synthetic approaches
Real-time long-read sequencing
• The single-molecule approaches differ from short-read approaches in
that they do not rely on a clonal population of amplified DNA
fragments to generate detectable signal, nor do they require chemical
cycling for each dNTP added.
• Platforms: SMRT-PacBio, MinION (Oxford Nanopore Technologies)
Real-time long-read sequencing: [SMRT-PacBio]
• Uses a specialized flow cell with many thousands of individual picolitre wells
with transparent bottoms — zero-mode waveguides (ZMW).
• Polymerase is fixed at the bottom of the well and allows the DNA strand to
progress through the ZMW.
• By having a constant location of incorporation owing to the stationary enzyme, the
system can focus on a single molecule.
• dNTP incorporation on each single-molecule template per well is continuously
visualized with a laser and camera system that records the colour and duration of
emitted light as the labelled nucleotide momentarily pauses during incorporation at
the bottom of the ZMW.
• The polymerase cleaves the dNTP-bound fluorophore during incorporation,
allowing it to diffuse away from the sensor area before the next labelled dNTP is
incorporated.
• SMRT uses a unique circular template that allows each template to be sequenced
multiple times as the polymerase repeatedly traverses the circular molecule.
• Although it is difficult for DNA templates longer than ~3 kb to be sequenced
multiple times, shorter DNA templates can be sequenced many times as a function
of template length.
• These multiple passes are used to generate a consensus read of insert, known as a
circular consensus sequence (CCS).
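The idea behind a consensus over multiple passes can be sketched with a per-position majority vote. This is a deliberate simplification, assuming the passes are already aligned, have equal length and contain only substitution errors; it is not the actual CCS algorithm, and `consensus` is a hypothetical helper.

```python
from collections import Counter

def consensus(passes):
    """Majority vote at each position across aligned passes of one insert."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("this sketch assumes aligned passes of equal length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))

passes = ["ACGTAC", "ACGTTC", "ACCTAC"]  # three passes, two with one error each
print(consensus(passes))
```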
Real-time long-read sequencing: [MinION]
• The first consumer prototype of the nanopore sequencer was made available in 2014 by Oxford Nanopore Technologies
(ONT).
• Nanopore sequencers do not monitor incorporations or hybridizations of nucleotides guided by a template DNA strand.
• Whereas other platforms use a secondary signal, light, colour or pH, nanopore sequencers directly detect the DNA
composition of a native ssDNA molecule.
• To carry out sequencing, DNA is passed through a protein pore as current is passed through the pore.
• As the DNA translocates through the action of a secondary motor protein, a voltage blockade occurs that modulates the
current passing through the pore.
• The temporal tracing of these charges is called squiggle space, and shifts in voltage are characteristic of the particular
DNA sequence in the pore, which can then be interpreted as a k‐mer.
• Rather than having 1–4 possible signals, the instrument has more than 1,000 — one for each possible k‐mer, especially
when modified bases present on native DNA are taken into account.
• The ONT MinION uses a leader hairpin library structure.
• This allows the forward DNA strand to pass through the pore, followed by a hairpin that links the two strands, and
finally the reverse strand. This generates 1D and 2D reads in which both ‘1D’ strands can be aligned to create a
consensus sequence ‘2D’ read.
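The "more than 1,000" possible signals follow from counting k-mers: with four bases there are 4^k of them. The sketch below assumes k = 5 purely for illustration (the slides do not state k), and uses "M" as a hypothetical symbol for one modified base.

```python
from itertools import product

def count_kmers(k, alphabet="ACGT"):
    """Number of distinct k-mers over the given alphabet."""
    return len(alphabet) ** k

def all_kmers(k, alphabet="ACGT"):
    """Enumerate every k-mer (only practical for small k)."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]

print(count_kmers(5))                    # four canonical bases
print(count_kmers(5, alphabet="ACGTM"))  # one extra (hypothetical) modified base
print(all_kmers(2)[:4])
```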
Synthetic approaches
• The synthetic approaches do not generate actual long-reads; rather,
they are an approach to library preparation that leverages barcodes to
allow computational assembly of a larger fragment.
• Platforms: Illumina synthetic long-read sequencing, 10X Genomics
emulsion-based system
Synthetic approaches [Illumina synthetic long-read sequencing]
• The Illumina system (formerly Moleculo) partitions DNA into a microtitre plate and does not require specialized
instrumentation.
Synthetic approaches [10X Genomics]
• 10X Genomics instruments (GemCode and Chromium) use
emulsion to partition DNA and require the use of a microfluidic
instrument to perform pre-sequencing reactions.
• With as little as 1 ng of starting material, the 10X Genomics
instruments can partition arbitrarily large DNA fragments, up to
~100 kb, into micelles called ‘GEMs’, which typically contain
≤0.3× copies of the genome and one unique barcode.
• Within each GEM, a gel bead dissolves and smaller fragments of
DNA are amplified from the original large fragments, each with a
barcode identifying the source GEM.
• After sequencing, the reads are aligned and linked together to
form a series of anchored fragments across the span of the
original fragment.
Applications
• Rapidly sequence whole genomes.
• Zoom in to deeply sequence target regions.
• Utilize RNA sequencing (RNA-Seq) to discover novel RNA variants
and splice sites, or precisely quantify mRNAs for gene expression
analysis.
• Analyze epigenetic factors such as genome-wide DNA methylation
and DNA-protein interactions.
• Sequence cancer samples to study rare somatic variants, tumor
subclones, and more.
• Study microbial diversity.
NGS Experimental Considerations
• Paired-End vs. Single-Read Sequencing
• Multiplex Sequencing
• Mate Pair Sequencing
• Deep Sequencing
• Sequencing Coverage
Paired-End vs. Single-Read Sequencing
• In single-read sequencing, the sequencer reads a fragment from only one end, generating the
sequence of bases.
• In paired-end sequencing, it starts at one end, finishes this direction at the specified read length, and then starts
another round of reading from the opposite end of the fragment.
• Paired-end sequencing facilitates detection of genomic rearrangements and repetitive sequence elements, as
well as gene fusions and novel transcripts.
• Since paired-end reads are more likely to align to a reference, the quality of the entire data set improves.
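The relationship between the two reads of a pair can be sketched as follows, assuming an error-free fragment longer than the read length: read 1 is the start of the fragment, and read 2 is the start of its reverse complement (the opposite end, read from the other strand). The function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    return sequence.translate(COMPLEMENT)[::-1]

def single_read(fragment, read_length):
    """Single-read sequencing: one read from one end of the fragment."""
    return fragment[:read_length]

def paired_end_reads(fragment, read_length):
    """Paired-end sequencing: one read from each end of the same fragment."""
    read_1 = fragment[:read_length]
    read_2 = reverse_complement(fragment)[:read_length]
    return read_1, read_2

print(paired_end_reads("AACCGGTTAC", 4))
```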
Multiplex Sequencing
• Process a large number of samples with multiplex sequencing on a high-throughput instrument.
• Sample multiplexing is a useful technique when targeting specific genomic regions or working with smaller
genomes.
• To accomplish this, individual "barcode" sequences are added to each sample so they can be distinguished
and sorted during data analysis.
• Pooling samples greatly increases the number of samples analyzed in a single run, without drastically
increasing cost or time.
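Sorting pooled reads by barcode during data analysis can be sketched as below. The sketch assumes the barcode is the first few bases of each read and must match exactly; on real instruments the index is often read separately and mismatches may be tolerated. `demultiplex` and the sample names are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample, barcode_length):
    """Assign each read to a sample by its leading barcode and strip the barcode."""
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)

barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTTTGACC", "TGCAGGATCC", "ACGTCCCAAA", "NNNNAAAA"]
print(demultiplex(reads, barcodes, barcode_length=4))
```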
Mate Pair Sequencing
• Mate pair sequencing involves generating long-insert paired-end DNA libraries
useful for a number of sequencing applications, including:
 De novo sequencing
 Genome finishing
 Structural variant detection
 Identification of complex genomic rearrangements
• Difference from paired-end sequencing: Longer insert size (>800 bp between the paired reads) rather than
longer read length
• Method: First DNA is fragmented and fragments of a desired length (2-5 kb)
are isolated. The ends of the DNA fragments are then biotinylated (biotin is
added) and the fragments are circularized, joining the two labelled ends. Then
the DNA circle is sheared into smaller fragments (400-600 bp). Biotinylated
fragments are enriched (via the biotin tag) and adapters are ligated. They are then
ready for cluster generation and sequencing. The trick here is that the enriched
fragment (400-600 bp) contains both ends of the original long fragment (2-5 kb),
which can now be sequenced. Sequencing therefore yields information
about the original long fragment.
• Combining data from mate pair sequencing with that from short-insert paired-
end reads provides increased information for maximising sequencing coverage
across a genome
Deep Sequencing
• Deep sequencing refers to sequencing a genomic region multiple times, sometimes hundreds or even
thousands of times.
• This NGS approach allows researchers to detect rare clonal types, cells, or microbes comprising as little as
1% of the original sample.
• Deep sequencing is useful for studies in oncology, microbial genomics, and other research involving
analysis of rare cell populations.
• For example, deep sequencing is required to identify mutations within tumors, because normal cell
contamination is common in cancer samples, and the tumors themselves likely contain multiple sub-clones
of cancer cells.
Sequencing Coverage
• Sequencing coverage describes the average number of reads that align to, or "cover," known reference
bases.
• The next-generation sequencing (NGS) coverage level often determines whether variant discovery can be
made with a certain degree of confidence at particular base positions.
• Sequencing coverage requirements vary by application as well as by other factors such as size of reference
genome, gene expression level, published literature, and best practices defined by the scientific community.
• At higher levels of coverage, each base is covered by a greater number of aligned sequence reads, so base
calls can be made with a higher degree of confidence.
• Examples of sequencing coverage recommendations for some common applications include:
• For detecting human genome mutations, SNPs, and rearrangements, publications often recommend from 10× to 30× depth
of coverage, depending on the application and statistical model.
• For RNA sequencing, researchers usually think in terms of numbers of millions of reads to be sampled. Detecting rarely
expressed genes often requires an increase in the depth of coverage.
• For ChIP-Seq (chromatin immunoprecipitation sequencing), publications often recommend coverage of around 100x.
Average coverage = N * L / G
Where: G is length of the original genome, N is the number of reads, and L
is the average read length
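A worked example of the formula, as a small sketch: the numbers (1,000,000 reads of 150 bp on a 5 Mb genome) are made up for illustration, and `reads_needed` simply rearranges the same formula to N = C * G / L.

```python
import math

def average_coverage(n_reads, read_length, genome_length):
    """Average coverage = N * L / G."""
    return n_reads * read_length / genome_length

def reads_needed(target_coverage, read_length, genome_length):
    """Rearranged formula: N = C * G / L, rounded up to whole reads."""
    return math.ceil(target_coverage * genome_length / read_length)

print(average_coverage(1_000_000, 150, 5_000_000))  # 30.0, i.e. 30x
print(reads_needed(30, 150, 5_000_000))             # 1000000
```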
Sequencing Coverage Histograms
• Coverage histograms are commonly used to depict the range and uniformity of sequencing coverage for an
entire data set.
• They illustrate the overall coverage distribution by displaying the number of reference bases that are
covered by mapped sequencing reads at various depths.
• Mapped read depth refers to the total number of bases sequenced and aligned at a given reference base
position (note that "mapped" and "aligned" are used interchangeably in the sequencing community).
• In a sequencing coverage histogram, the read depths are binned and displayed on the x-axis, while the total
numbers of reference bases that occupy each read depth bin are displayed on the y-axis. These can also be
written as percentages of reference bases.
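Building such a histogram from per-base read depths can be sketched as below. The input (a list with one mapped read depth per reference base) and the bin width are assumptions for illustration; the keys are the lower edges of the depth bins (x-axis) and the values are the numbers of reference bases (y-axis).

```python
from collections import Counter

def coverage_histogram(depths, bin_width=1):
    """Count reference bases per read-depth bin."""
    counts = Counter((depth // bin_width) * bin_width for depth in depths)
    return dict(sorted(counts.items()))

def as_percentages(histogram):
    """Express each bin as a percentage of all reference bases."""
    total = sum(histogram.values())
    return {depth_bin: 100 * count / total for depth_bin, count in histogram.items()}

depths = [0, 3, 12, 15, 18, 27, 29, 31]  # made-up depths for eight bases
histogram = coverage_histogram(depths, bin_width=10)
print(histogram)
print(as_percentages(histogram))
```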
Evaluating NGS Coverage
• Inter-Quartile Range (IQR): The IQR is the difference in sequencing
coverage between the 75th and 25th percentiles of the histogram. This value
is a measure of statistical variability, reflecting the non-uniformity of
coverage across the entire data set. A high IQR indicates high variation in
coverage across the genome, while a low IQR reflects more uniform
sequence coverage. In the shown histograms, the lower IQR indicates that
the histogram on the left has better sequencing coverage uniformity than that
on the right.
• Mean (Mapped) Read Depth: The mean mapped read depth (or mean read
depth) is the sum of the mapped read depths at each reference base position,
divided by the number of known bases in the reference. The mean read depth
metric indicates how many reads, on average, are likely to be aligned at a
given reference base position.
• Raw Read Depth: This is the total amount of sequence data produced by the
instrument (pre-alignment), divided by the reference genome size. Although
raw read depth is often provided by sequencing instrument vendors as a
specification, it does not take into account the efficiency of the alignment
process. If a large fraction of the raw sequencing reads are discarded during
the alignment process, the post-alignment mapped read depth can be
significantly smaller than the raw read depth.
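The three metrics above can be computed from per-base depths as in this sketch. The depth values are made up, the function names are hypothetical, and the IQR uses the "inclusive" quartile convention of Python's `statistics.quantiles`; other percentile conventions give slightly different values.

```python
import statistics

def coverage_iqr(depths):
    """Difference between the 75th and 25th percentiles of per-base depth."""
    q1, _median, q3 = statistics.quantiles(depths, n=4, method="inclusive")
    return q3 - q1

def mean_mapped_read_depth(depths):
    """Sum of mapped read depths divided by the number of reference bases."""
    return sum(depths) / len(depths)

def raw_read_depth(total_bases_sequenced, genome_size):
    """Pre-alignment sequence output divided by the reference genome size."""
    return total_bases_sequenced / genome_size

depths = [10, 20, 30, 40, 50]
print(coverage_iqr(depths))            # 20.0
print(mean_mapped_read_depth(depths))  # 30.0
print(raw_read_depth(200, 5))          # 40.0, higher than the mapped depth
```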
References:
• Coming of age: ten years of next-generation sequencing technologies. Sara Goodwin, John D.
McPherson and W. Richard McCombie. Nature Reviews: Genetics. doi:10.1038/nrg.2016.49
• Illumina: https://www.illumina.com
• ecSeq Bioinformatics: https://www.ecseq.com
• Wikipedia: https://en.wikipedia.org/
• Columbia Genome Centre: https://systemsbiology.columbia.edu
Next Generation Sequencing Glossary can be found at:
 http://sabiosciences.com/NGS_Glossary.php
 http://deeptools.readthedocs.io/en/latest/content/help_glossary.html
 https://www.nextgenerationsequencing.info/ngs-introduction/ngs-glossary
 http://www.nslc.wustl.edu/elgin/genomics/tour_nextgen/glossary.pdf

Cedrus of Himachal PradeshCedrus of Himachal Pradesh
Cedrus of Himachal Pradesh
 
MySQL and bioinformatics
MySQL and bioinformatics MySQL and bioinformatics
MySQL and bioinformatics
 
Protein sorting in mitochondria
Protein sorting in mitochondriaProtein sorting in mitochondria
Protein sorting in mitochondria
 
Survey of softwares for phylogenetic analysis
Survey of softwares for phylogenetic analysisSurvey of softwares for phylogenetic analysis
Survey of softwares for phylogenetic analysis
 
Publicly available tools and open resources in Bioinformatics
Publicly available  tools and open resources in BioinformaticsPublicly available  tools and open resources in Bioinformatics
Publicly available tools and open resources in Bioinformatics
 

Kürzlich hochgeladen

Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdfMr Bounab Samir
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptxJonalynLegaspi2
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...Nguyen Thanh Tu Collection
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...DhatriParmar
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...DhatriParmar
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsPooky Knightsmith
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1GloryAnnCastre1
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseCeline George
 

Kürzlich hochgeladen (20)

Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptx
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
prashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Professionprashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Profession
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young minds
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 

Next Generation Sequencing

  • 2. Sequencing • The process of determining the primary structure of an unbranched biopolymer. • The term usually refers to DNA/RNA sequencing.
  • 3. DNA Sequencing • The process of determining the sequence of nucleotides (As, Ts, Cs, and Gs) in a piece of DNA. • First-generation methods: sequencing by chemical modification; sequencing by chain termination.
  • 4. RNA Sequencing • Need for RNA-seq: sequencing DNA gives a genetic profile of an organism, whereas sequencing RNA reflects only the sequences that are actively expressed in the cells. • Limitation for direct sequencing: RNA is less stable in the cell and is also more prone to nuclease attack during experiments. • Method: (1) extraction of RNA; (2) reverse transcription to form cDNA; (3) sequencing of the cDNA.
  • 5. Sequencing by Chemical modification • Developed by Allan Maxam and Walter Gilbert in 1976–1977. • Also known as the Maxam–Gilbert method. • Based on nucleobase-specific partial chemical modification of DNA and subsequent cleavage of the DNA backbone at sites adjacent to the modified nucleotides. • Base-specific modification: • G: Methylation of N7 with dimethyl sulphate at pH 8.0 renders the C8–N9 bond specifically susceptible to cleavage by base. • A+G: Piperidine formate (pH 2) weakens the glycosidic bonds of A and G residues by protonating N atoms in the purine rings, resulting in depurination. • C+T: Hydrazine opens pyrimidine rings, which recyclize in a five-membered form that is susceptible to removal. • C: In the presence of 1.5 M NaCl, only cytosine reacts appreciably with hydrazine.
  • 6. Sequencing by Chemical modification: Steps • Purification of the DNA. • Radioactive labelling at the 5′ end (typically by a kinase reaction using gamma-32P ATP). • Chemical treatment to generate breaks at a small proportion of one or two of the four nucleotide bases in each of four reactions. • The modified DNAs are then cleaved by hot piperidine, (CH2)5NH, at the position of the modified base. • Fragments from the four reactions are electrophoresed side by side in denaturing acrylamide gels for size separation. • Note: In all cases the reactions are carried out under carefully controlled conditions to ensure that, on average, only one of the target bases in each DNA molecule is modified.
  • 7. Sequencing by Chain termination • Also known as Sanger sequencing. • Developed by Frederick Sanger and colleagues in 1977. • Based on dideoxynucleotide triphosphates (ddNTPs), which terminate strand elongation once incorporated.
  • 8. Sequencing by Chain termination • The DNA sample is divided into four separate sequencing reactions, each containing all four of the standard deoxynucleotides (dATP, dGTP, dCTP and dTTP) and the DNA polymerase. • To each reaction is added only one of the four dideoxynucleotides (ddATP, ddGTP, ddCTP, or ddTTP). • Polymerase Chain Reaction (PCR) is used to amplify the template. • The DNA fragments are heat denatured by the snap-chill method. • The fragments are separated by size using gel electrophoresis. • The DNA bands may then be visualized by autoradiography or UV light, and the DNA sequence can be read directly off the X-ray film or gel image.
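The readout logic of the four-lane gel can be illustrated with a small idealized sketch. It assumes every possible terminated fragment is present and perfectly resolved by size; the function names are hypothetical and this is not a model of the actual chemistry.

```python
def chain_termination_fragments(synthesized):
    """Return, per ddNTP lane, the lengths of fragments ending in that base.

    Idealized assumption: termination happens at every position of the
    newly synthesized strand with equal chance, so every length appears.
    """
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the gel from the shortest to the longest fragment."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


lanes = chain_termination_fragments("GATTACA")
print(lanes)
print(read_gel(lanes))
```

Reading the lanes in order of increasing fragment length recovers the synthesized strand, which is the complement of the template.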
  • 9. Next-Generation DNA Sequencing • Also known as high-throughput sequencing. • NGS is the general term used to describe a number of different modern sequencing technologies, including: Illumina (Solexa) sequencing; Roche 454 sequencing; Ion Torrent (Proton / PGM) sequencing; SOLiD sequencing; and a few others. • Advantages: sequencing is quicker and cheaper than with first-generation methods. • Disadvantages: the associated error rates (~0.1–15%) are higher, and the read lengths are generally shorter (35–700 bp for short-read approaches).
  • 10. NGS approaches • Short-read sequencing: sequencing by ligation (SBL); sequencing by synthesis (SBS), which comprises cyclic reversible termination (CRT) and single-nucleotide addition (SNA). • Long-read sequencing: real-time long-read sequencing; synthetic approaches.
  • 11. Sequencing by ligation (SBL) • Involves the hybridization and ligation of labelled probe and anchor sequences to a DNA strand. • The probes encode one or two known bases and a series of degenerate or universal bases, driving complementary binding between the probe and template. • The anchor fragment encodes a known sequence that is complementary to an adapter sequence and provides a site to initiate ligation. • After ligation, the template is imaged and the known base or bases in the probe are identified. • A new cycle begins after complete removal of the anchor–probe complex or through cleavage to remove the fluorophore and to regenerate the ligation site. • Platforms: SOLiD and Complete Genomics
  • 13. SBL: SOLiD • Sequencing by Oligonucleotide Ligation and Detection (SOLiD) utilizes two-base-encoded probes, in which each fluorometric signal represents a dinucleotide. • The raw output is not directly associated with the incorporation of a known nucleotide. Because the 16 possible dinucleotide combinations cannot be individually associated with spectrally resolvable fluorophores, four fluorescent signals are used, each representing a subset of four dinucleotide combinations. • Each ligation signal represents one of several possible dinucleotides, leading to the term colour-space (rather than base-space), which must be deconvoluted during data analysis. • The SOLiD sequencing procedure is composed of a series of probe–anchor binding, ligation, imaging and cleavage cycles to elongate the complementary strand. • Over the course of the cycles, single-nucleotide offsets are introduced to ensure every base in the template strand is sequenced.
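A minimal sketch of colour-space encoding and deconvolution follows. It assumes the commonly described two-base code in which four colours (0 to 3) each cover four dinucleotides, which can be expressed as an XOR over base indices A=0, C=1, G=2, T=3; the slides do not give the table, so treat the mapping and the function names as illustrative assumptions. Decoding needs one known starting base, which in practice comes from the adapter.

```python
BASES = "ACGT"  # assumed index order: A=0, C=1, G=2, T=3


def encode_colours(sequence):
    """Translate a base-space sequence into colour-space (one colour per dinucleotide)."""
    return [BASES.index(a) ^ BASES.index(b) for a, b in zip(sequence, sequence[1:])]


def decode_colours(first_base, colours):
    """Deconvolute colour-space back to base-space from a known first base."""
    sequence = [first_base]
    for colour in colours:
        previous = BASES.index(sequence[-1])
        sequence.append(BASES[previous ^ colour])
    return "".join(sequence)


colours = encode_colours("ATGGC")
print(colours)
print(decode_colours("A", colours))
```

Because each colour depends on two adjacent bases, a single wrong colour call changes every downstream base when decoding naively, which is why the deconvolution step is usually done against a reference.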
  • 14. SBL: Complete Genomics • Complete Genomics performs DNA sequencing using combinatorial probe–anchor ligation (cPAL) or combinatorial probe–anchor synthesis (cPAS). • In cPAL, an anchor sequence (complementary to one of the four adaptor sequences) and a probe hybridize to a DNA nanoball at several locations. • In each cycle, the hybridizing probe is a member of a pool of one-base-encoded probes, in which each probe contains a known base in a constant position and a corresponding fluorophore. • After imaging, the entire probe–anchor complex is removed and a new probe–anchor combination is hybridized. • Each subsequent cycle utilizes a probe set with the known base in the n + 1 position. • Further cycles in the process also use adaptors of variable lengths and chemistries, allowing sequencing to occur upstream and downstream of the adaptor sequence. • The cPAS approach is a modification of cPAL intended to increase the read lengths of Complete Genomics’ chemistry.
  • 15. Sequencing by synthesis (SBS) • SBS is a term used to describe numerous DNA-polymerase-dependent methods. • SBS approaches can be classified as cyclic reversible termination (CRT) or single-nucleotide addition (SNA). • Platforms: Illumina, Qiagen, 454, Ion Torrent
  • 16. SBS: CRT • CRT approaches are defined by their use of terminator molecules that are similar to those used in Sanger sequencing, in which the ribose 3ʹ‐OH group is blocked, thus preventing elongation. • To begin the process, a DNA template is primed by a sequence that is complementary to an adapter region, which will initiate polymerase binding to this double-stranded DNA (dsDNA) region. • During each cycle, a mixture of all four individually labelled and 3ʹ‐blocked deoxynucleotides (dNTPs) is added. After the incorporation of a single dNTP into each elongating complementary strand, unbound dNTPs are removed and the surface is imaged to identify which dNTP was incorporated at each cluster. • The fluorophore and blocking group can then be removed and a new cycle can begin. • Platforms: Illumina, Qiagen
  • 17. SBS: CRT [Illumina] • Accounts for the largest market share for sequencing instruments. • Illumina’s suite of instruments for short-read sequencing ranges from small, low-throughput benchtop units to large ultra-high-throughput instruments dedicated to population-level whole-genome sequencing (WGS). • dNTP identification is achieved through total internal reflection fluorescence (TIRF) microscopy using either two or four laser channels. • In most Illumina platforms, each dNTP is bound to a single fluorophore that is specific to that base type and requires four different imaging channels, whereas the NextSeq and MiniSeq systems use a two-fluorophore system. • TIRF microscopy allows a thin region of a specimen, usually less than 200 nanometers deep, to be observed. • A video of Illumina Sequencing by Synthesis is available at https://youtu.be/fCd6B5HRaZ8
  • 18. SBS: CRT [Qiagen] • GeneReader is intended to be an all‐in‐one NGS platform, from sample preparation to analysis. • To accomplish this, the GeneReader system is bundled with the QIAcube sample preparation system and the Qiagen Clinical Insight platform for variant analysis. • The GeneReader uses virtually the same approach as that used by Illumina; however, it does not aim to ensure that each template incorporates a fluorophore-labelled dNTP. • Rather, GeneReader aims to ensure that just enough labelled dNTPs are incorporated to achieve identification.
  • 19. SBS: SNA • SNA approaches rely on a single signal to mark the incorporation of a dNTP into an elongating strand. • Each of the four nucleotides must be added iteratively to a sequencing reaction to ensure only one dNTP is responsible for the signal. • This does not require the dNTPs to be blocked, as the absence of the next nucleotide in the sequencing reaction prevents elongation. • The exception to this is homopolymer regions, where identical dNTPs are added in a single step, with sequence identification relying on a proportional increase in the signal as multiple dNTPs are incorporated. • Platforms: 454, Ion Torrent
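The single-nucleotide-addition cycle, including the homopolymer case, can be sketched as a noise-free "flowgram". The flow order `TACG`, the function names and the assumption of a perfectly proportional signal are illustrative choices, not details given in the slides.

```python
def flowgram(synthesized, flow_order="TACG", max_flows=400):
    """Simulate ideal SNA flows: one nucleotide species is offered per flow.

    The signal of each flow is the number of identical bases incorporated,
    so a homopolymer run gives a proportionally larger signal and an
    unmatched nucleotide gives a signal of 0.
    """
    signals = []
    position = 0
    for flow in range(max_flows):
        if position >= len(synthesized):
            break
        base = flow_order[flow % len(flow_order)]
        run = 0
        while position < len(synthesized) and synthesized[position] == base:
            run += 1
            position += 1
        signals.append((base, run))
    return signals


def call_bases(signals):
    """Convert (nucleotide, signal) pairs back into a sequence."""
    return "".join(base * run for base, run in signals)


signals = flowgram("TTAGGG")
print(signals)
print(call_bases(signals))
```

On real instruments the signal is only approximately proportional to the run length, so long homopolymers are the main source of insertion and deletion errors for these platforms.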
  • 20. SBS: SNA [454] • The first NGS instrument developed was the 454 pyrosequencing device. • This SNA system distributes template-bound beads into a PicoTiterPlate along with beads containing an enzyme cocktail. As a dNTP is incorporated into a strand, an enzymatic cascade occurs, resulting in a bioluminescence signal. Each burst of light, detected by a charge-coupled device (CCD) camera, can be attributed to the incorporation of one or more identical dNTPs at a particular bead.
  • 21. SBS: SNA [Ion Torrent] • The Ion Torrent was the first NGS platform without optical sensing. • It does not depend on an enzymatic cascade to generate a signal. • Instead, it detects the H+ ions that are released as each dNTP is incorporated. • The resulting change in pH is detected by an integrated complementary metal-oxide-semiconductor (CMOS) and an ion-sensitive field-effect transistor (ISFET). • The pH change detected by the sensor is imperfectly proportional to the number of nucleotides incorporated, which limits the accuracy of homopolymer length measurements.
  • 22. Long-read sequencing • Genomes are highly complex, with many long repetitive elements, copy number alterations and structural variations that are relevant to evolution, adaptation and disease. • Many of these complex elements are so long that short-read paired-end technologies are insufficient to resolve them. • Long-read sequencing delivers reads in excess of several kilobases, allowing for the resolution of these large structural features. • Long reads can span complex or repetitive regions with a single continuous read, thus eliminating ambiguity in the positions or size of genomic elements. • Long reads can also be useful for transcriptomic research, as they are capable of spanning entire mRNA transcripts, allowing researchers to identify the precise connectivity of exons and discern gene isoforms. • Two main technologies: real-time long-read sequencing; synthetic approaches.
  • 23. Real-time long-read sequencing • The single-molecule approaches differ from short-read approaches in that they do not rely on a clonal population of amplified DNA fragments to generate a detectable signal, nor do they require chemical cycling for each dNTP added. • Platforms: SMRT (PacBio), MinION (Oxford Nanopore Technologies)
  • 24. Real-time long-read sequencing: [SMRT-PacBio] • Uses a specialized flow cell with many thousands of individual picolitre wells with transparent bottoms, called zero-mode waveguides (ZMWs). • The polymerase is fixed at the bottom of the well and allows the DNA strand to progress through the ZMW. • By having a constant location of incorporation owing to the stationary enzyme, the system can focus on a single molecule. • dNTP incorporation on each single-molecule template per well is continuously visualized with a laser and camera system that records the colour and duration of emitted light as the labelled nucleotide momentarily pauses during incorporation at the bottom of the ZMW. • The polymerase cleaves the dNTP-bound fluorophore during incorporation, allowing it to diffuse away from the sensor area before the next labelled dNTP is incorporated. • SMRT uses a unique circular template that allows each template to be sequenced multiple times as the polymerase repeatedly traverses the circular molecule. • Although it is difficult for DNA templates longer than ~3 kb to be sequenced multiple times, shorter DNA templates can be sequenced many times, with the number of passes depending on template length. • These multiple passes are used to generate a consensus read of the insert, known as a circular consensus sequence (CCS).
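The idea behind a circular consensus sequence can be sketched with a per-position majority vote. This is a deliberate simplification: it assumes the passes are already aligned, have equal length and contain substitution errors only, whereas real CCS generation must handle insertions and deletions. The function name is hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Build a consensus by majority vote at each aligned position.

    Assumes pre-aligned passes of equal length (substitution errors only).
    """
    if len({len(p) for p in passes}) != 1:
        raise ValueError("all passes must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )


# Three passes over the same insert, each with a different single error.
passes = ["ACGTAC", "ACGAAC", "ACCTAC"]
print(majority_consensus(passes))
```

Since the errors in independent passes rarely fall on the same position, more passes give a more accurate consensus, which is why short templates that can be traversed many times yield higher-quality CCS reads.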
  • 25. Real-time long-read sequencing: [MinION] • The first consumer prototype of the nanopore sequencer was made available in 2014 by Oxford Nanopore Technologies (ONT). • Nanopore sequencers do not monitor incorporations or hybridizations of nucleotides guided by a template DNA strand. • Whereas other platforms use a secondary signal (light, colour or pH), nanopore sequencers directly detect the DNA composition of a native ssDNA molecule. • To carry out sequencing, DNA is passed through a protein pore as current is passed through the pore. • As the DNA translocates through the action of a secondary motor protein, a voltage blockade occurs that modulates the current passing through the pore. • The temporal tracing of these charges is called squiggle space, and shifts in voltage are characteristic of the particular DNA sequence in the pore, which can then be interpreted as a k‐mer. • Rather than having 1–4 possible signals, the instrument has more than 1,000, one for each possible k‐mer, especially when modified bases present on native DNA are taken into account. • The ONT MinION uses a leader–hairpin library structure. • This allows the forward DNA strand to pass through the pore, followed by a hairpin that links the two strands, and finally the reverse strand. This generates 1D and 2D reads, in which both ‘1D’ strands can be aligned to create a consensus sequence ‘2D’ read.
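The figure of "more than 1,000" signals is consistent with counting k‐mers: with four bases there are 4^k possible k‐mers. The sketch below assumes k = 5 purely for illustration (the slides do not state the k‐mer length), and the extra symbol `M` for a modified base is likewise a hypothetical placeholder.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Enumerate every possible k-mer over the given alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


canonical = all_kmers(5)
with_modified_base = all_kmers(5, alphabet="ACGTM")  # "M": hypothetical modified base

print(len(canonical))           # 4 ** 5
print(len(with_modified_base))  # 5 ** 5
```

Each additional distinguishable base enlarges the signal space sharply, which is why base-calling from squiggle space is treated as a decoding problem rather than a simple lookup of four signals.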
  • 27. Synthetic approaches • The synthetic approaches do not generate actual long reads; rather, they are an approach to library preparation that leverages barcodes to allow computational assembly of a larger fragment. • Platforms: Illumina synthetic long-read sequencing, 10X Genomics emulsion-based system
  • 28. Synthetic approaches [Illumina synthetic long-read sequencing] • The Illumina system (formerly Moleculo) partitions DNA into a microtitre plate and does not require specialized instrumentation.
  • 29. Synthetic approaches [10X Genomics] • 10X Genomics instruments (GemCode and Chromium) use emulsion to partition DNA and require the use of a microfluidic instrument to perform pre-sequencing reactions. • With as little as 1 ng of starting material, the 10X Genomics instruments can partition arbitrarily large DNA fragments, up to ~100 kb, into micelles called ‘GEMs’, which typically contain ≤0.3× copies of the genome and one unique barcode. • Within each GEM, a gel bead dissolves and smaller fragments of DNA are amplified from the original large fragments, each with a barcode identifying the source GEM. • After sequencing, the reads are aligned and linked together to form a series of anchored fragments across the span of the original fragment.
  • 30. Applications • Rapidly sequence whole genomes. • Zoom in to deeply sequence target regions. • Utilize RNA sequencing (RNA-Seq) to discover novel RNA variants and splice sites, or precisely quantify mRNAs for gene expression analysis. • Analyze epigenetic factors such as genome-wide DNA methylation and DNA-protein interactions. • Sequence cancer samples to study rare somatic variants, tumor subclones, and more. • Study microbial diversity.
  • 31. NGS Experimental Considerations • Paired-End vs. Single-Read Sequencing • Multiplex Sequencing • Mate Pair Sequencing • Deep Sequencing • Sequencing Coverage
  • 32. Paired-End vs. Single-Read Sequencing • In single-read sequencing, the sequencer reads a fragment from one end only, generating the sequence of bases in one direction. • In paired-end sequencing, it starts at one end, finishes this direction at the specified read length, and then starts another round of reading from the opposite end of the fragment. • Paired-end sequencing facilitates detection of genomic rearrangements and repetitive sequence elements, as well as gene fusions and novel transcripts. • Since paired-end reads are more likely to align to a reference, the quality of the entire data set improves.
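A minimal sketch of how the two reads of a pair relate to the fragment: the first read is taken from one end, and the second is read from the opposite end on the complementary strand, so it appears as the reverse complement of the fragment's tail. The function names are hypothetical and sequencing errors are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def single_read(fragment, read_length):
    """Single-read sequencing: read from one end only."""
    return fragment[:read_length]


def paired_end_reads(fragment, read_length):
    """Paired-end sequencing: one read from each end of the fragment."""
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    return read1, read2


fragment = "AACCGGTTACGA"
print(single_read(fragment, 4))
print(paired_end_reads(fragment, 4))
```

Because both reads come from the same fragment, an aligner knows their expected orientation and approximate separation, which is what makes rearrangements and repeats easier to detect.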
  • 33. Multiplex Sequencing • Multiplex sequencing allows a large number of samples to be processed together on a high-throughput instrument. • Sample multiplexing is a useful technique when targeting specific genomic regions or working with smaller genomes. • To accomplish this, individual "barcode" sequences are added to each sample so they can be distinguished and sorted during data analysis. • Pooling samples greatly increases the number of samples analyzed in a single run, without drastically increasing cost or time.
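Sorting pooled reads by barcode (demultiplexing) can be sketched as below. The sketch assumes, for simplicity, that the barcode is the first few bases of each read and must match exactly; real pipelines often read the index separately and tolerate mismatches. The sample names and barcodes are made up for illustration.

```python
from collections import defaultdict


def demultiplex(reads, sample_by_barcode):
    """Group reads by sample using an exact-match barcode prefix.

    Assumes all barcodes have the same length and sit at the read start.
    """
    barcode_length = len(next(iter(sample_by_barcode)))
    bins = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        sample = sample_by_barcode.get(barcode, "undetermined")
        bins[sample].append(insert)
    return dict(bins)


sample_by_barcode = {"ACGT": "sample_1", "TGCA": "sample_2"}  # hypothetical
reads = ["ACGTGGGATC", "TGCATTTAGC", "ACGTCCCTGA", "NNNNAAAAAA"]
print(demultiplex(reads, sample_by_barcode))
```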
  • 34. Mate Pair Sequencing • Mate pair sequencing involves generating long-insert paired-end DNA libraries useful for a number of sequencing applications, including: de novo sequencing; genome finishing; structural variant detection; identification of complex genomic rearrangements. • Difference from paired-end sequencing: longer insert size (>800 bp). • Method: First, the DNA is fragmented and fragments of a desired length (2–5 kb) are isolated. The ends of the DNA fragments are then biotinylated (labelled with biotin). The fragments are circularized, which joins the two biotinylated ends. The circular DNA is then sheared into smaller fragments (400–600 bp). Biotinylated fragments are enriched (via the biotin tag) and adapters are ligated. They are then ready for cluster generation and sequencing. The trick is that the enriched fragment (400–600 bp) contains both ends of the original long fragment (2–5 kb) and can now be sequenced, so the reads carry information about the original fragment. • Combining data from mate pair sequencing with that from short-insert paired-end reads provides increased information for maximising sequencing coverage across a genome.
  • 35. Deep Sequencing • Deep sequencing refers to sequencing a genomic region multiple times, sometimes hundreds or even thousands of times. • This NGS approach allows researchers to detect rare clonal types, cells, or microbes comprising as little as 1% of the original sample. • Deep sequencing is useful for studies in oncology, microbial genomics, and other research involving analysis of rare cell populations. • For example, deep sequencing is required to identify mutations within tumors, because normal cell contamination is common in cancer samples, and the tumors themselves likely contain multiple sub-clones of cancer cells.
  • 36. Sequencing Coverage • Sequencing coverage describes the average number of reads that align to, or "cover," known reference bases. • The next-generation sequencing (NGS) coverage level often determines whether variant discovery can be made with a certain degree of confidence at particular base positions. • Sequencing coverage requirements depend on the application as well as on other factors such as the size of the reference genome, gene expression level, published literature, and best practices defined by the scientific community. • At higher levels of coverage, each base is covered by a greater number of aligned sequence reads, so base calls can be made with a higher degree of confidence. • Examples of sequencing coverage recommendations for some common applications include: • For detecting human genome mutations, SNPs, and rearrangements, publications often recommend from 10× to 30× depth of coverage, depending on the application and statistical model. • For RNA sequencing, researchers usually think in terms of numbers of millions of reads to be sampled. Detecting rarely expressed genes often requires an increase in the depth of coverage. • For ChIP-Seq (chromatin immunoprecipitation sequencing), publications often recommend coverage of around 100×. • Average coverage = (N × L) / G, where G is the length of the original genome, N is the number of reads, and L is the average read length.
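The coverage formula can be applied directly, and rearranged to estimate how many reads a target coverage requires. The numbers below (150 bp reads, a 3,000,000,000 bp genome) are illustrative assumptions rather than values from the slides.

```python
import math


def average_coverage(num_reads, read_length, genome_length):
    """Average coverage = N * L / G."""
    return num_reads * read_length / genome_length


def reads_needed(target_coverage, read_length, genome_length):
    """Rearranged formula: N = coverage * G / L, rounded up."""
    return math.ceil(target_coverage * genome_length / read_length)


GENOME_LENGTH = 3_000_000_000  # assumed genome size in bp
READ_LENGTH = 150              # assumed average read length in bp

print(average_coverage(600_000_000, READ_LENGTH, GENOME_LENGTH))
print(reads_needed(30, READ_LENGTH, GENOME_LENGTH))
```

Note that this is the raw average over the whole genome; it says nothing about how evenly the reads are distributed or how many of them actually align.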
  • 37. Sequencing Coverage Histograms • Coverage histograms are commonly used to depict the range and uniformity of sequencing coverage for an entire data set. • They illustrate the overall coverage distribution by displaying the number of reference bases that are covered by mapped sequencing reads at various depths. • Mapped read depth refers to the total number of bases sequenced and aligned at a given reference base position (note that "mapped" and "aligned" are used interchangeably in the sequencing community). • In a sequencing coverage histogram, the read depths are binned and displayed on the x-axis, while the total numbers of reference bases that occupy each read depth bin are displayed on the y-axis. These can also be written as percentages of reference bases.
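A coverage histogram of the kind described above can be built from a list of per-base mapped read depths. The sketch below assumes the depths have already been extracted from an alignment (how that is done is outside its scope); the function names, the bin width, and the toy depth values are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def coverage_histogram(depths: Iterable[int], bin_width: int = 1) -> Dict[int, int]:
    """Map each read-depth bin (x-axis) to the number of reference bases in it (y-axis).

    Each bin is labeled by its lower bound, e.g. with bin_width=10 the bin
    labeled 20 holds positions with depth 20 to 29.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter((d // bin_width) * bin_width for d in depths)
    return dict(sorted(counts.items()))


def as_percentages(histogram: Dict[int, int]) -> Dict[int, float]:
    """Express the y-axis as a percentage of reference bases instead of a count."""
    total = sum(histogram.values())
    return {bin_start: 100.0 * n / total for bin_start, n in histogram.items()}


if __name__ == "__main__":
    # Toy per-base depths for a 10-base reference (illustrative only).
    depths = [0, 3, 5, 5, 7, 12, 12, 14, 21, 30]
    hist = coverage_histogram(depths, bin_width=10)
    print(hist)                  # {0: 5, 10: 3, 20: 1, 30: 1}
    print(as_percentages(hist))  # {0: 50.0, 10: 30.0, 20: 10.0, 30: 10.0}
```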
  • 38. Evaluating NGS Coverage • Inter-Quartile Range (IQR): The IQR is the difference in sequencing coverage between the 75th and 25th percentiles of the histogram. This value is a measure of statistical variability, reflecting the non-uniformity of coverage across the entire data set. A high IQR indicates high variation in coverage across the genome, while a low IQR reflects more uniform sequence coverage. In the histograms shown on the slide, the lower IQR indicates that the histogram on the left has better sequencing coverage uniformity than the one on the right. • Mean (Mapped) Read Depth: The mean mapped read depth (or mean read depth) is the sum of the mapped read depths at each reference base position, divided by the number of known bases in the reference. The mean read depth metric indicates how many reads, on average, are likely to be aligned at a given reference base position. • Raw Read Depth: This is the total amount of sequence data produced by the instrument (pre-alignment), divided by the reference genome size. Although raw read depth is often provided by sequencing instrument vendors as a specification, it does not take into account the efficiency of the alignment process. If a large fraction of the raw sequencing reads is discarded during alignment, the post-alignment mapped read depth can be significantly smaller than the raw read depth.
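The three metrics on this slide can be computed directly from their definitions. The sketch below is a minimal illustration with toy numbers: the function names are hypothetical, the per-base depths are invented, and the quartiles use the "inclusive" interpolation method of Python's `statistics.quantiles`, which is one of several common conventions.

```python
from statistics import quantiles
from typing import Sequence


def coverage_iqr(depths: Sequence[int]) -> float:
    """Difference between the 75th and 25th percentiles of per-base depth."""
    q1, _median, q3 = quantiles(depths, n=4, method="inclusive")
    return q3 - q1


def mean_mapped_read_depth(depths: Sequence[int]) -> float:
    """Sum of mapped read depths divided by the number of reference bases."""
    return sum(depths) / len(depths)


def raw_read_depth(total_bases_sequenced: int, genome_size: int) -> float:
    """Pre-alignment sequence output divided by the reference genome size."""
    return total_bases_sequenced / genome_size


if __name__ == "__main__":
    # Two toy data sets with the same mean depth but different uniformity.
    uniform = [28, 29, 30, 30, 30, 30, 31, 32]
    uneven = [2, 10, 20, 30, 30, 40, 60, 48]
    for name, depths in (("uniform", uniform), ("uneven", uneven)):
        print(name, mean_mapped_read_depth(depths), coverage_iqr(depths))

    # Assumed example: 120 Gb of raw output against a 3 Gb reference.
    raw = raw_read_depth(120_000_000_000, 3_000_000_000)
    print(raw)  # 40.0
```

Both toy data sets have a mean depth of 30, yet the uneven one has a much larger IQR, which is why the mean alone does not describe coverage quality. Comparing a raw read depth of 40 with a mapped mean depth of 30 would likewise indicate that about a quarter of the raw data did not contribute to coverage.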
  • 39. References: • Coming of age: ten years of next-generation sequencing technologies. Sara Goodwin, John D. McPherson and W. Richard McCombie. Nature Reviews Genetics (2016). doi:10.1038/nrg.2016.49 • Illumina: https://www.illumina.com • ecSeq Bioinformatics: https://www.ecseq.com • Wikipedia: https://en.wikipedia.org/ • Columbia Genome Centre: https://systemsbiology.columbia.edu
  • 40. Next-Generation Sequencing glossaries can be found at: • http://sabiosciences.com/NGS_Glossary.php • http://deeptools.readthedocs.io/en/latest/content/help_glossary.html • https://www.nextgenerationsequencing.info/ngs-introduction/ngs-glossary • http://www.nslc.wustl.edu/elgin/genomics/tour_nextgen/glossary.pdf