Harrison Leong, Edgar Schreiber, Stephan Berosik, Shiaw-Min Chen, Wallace George, Jeffrey Marks, Stephanie Schneider
Thermo Fisher Scientific, Genetic Sciences Division, 200 Oyster Point Blvd., South San Francisco, CA 94080
RESULTS
Table 1 shows the sensitivity and specificity for allele frequencies of 5%
and 10%. Although we were able to detect variants at allele frequencies of
0.6125%, 1%, 1.25%, 2%, and 2.5%, the algorithm did not meet the LOD
criteria of 95% sensitivity and 99% specificity at these extremely low levels.
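The LOD criteria can be written out as a short check. The sketch below uses the standard definitions of sensitivity and specificity; the function names and the confusion-matrix counts are hypothetical illustrations, not figures from this study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of true variants that were called as variants."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of true non-variants that were called as non-variants."""
    return true_neg / (true_neg + false_pos)


def meets_lod(true_pos: int, false_neg: int, true_neg: int, false_pos: int,
              min_sens: float = 0.95, min_spec: float = 0.99) -> bool:
    """Apply the 95% sensitivity / 99% specificity criteria quoted in the text."""
    return (sensitivity(true_pos, false_neg) >= min_sens
            and specificity(true_neg, false_pos) >= min_spec)


# Hypothetical counts, for illustration only (not data from the poster).
print(meets_lod(true_pos=96, false_neg=4, true_neg=9990, false_pos=10))   # True
print(meets_lod(true_pos=80, false_neg=20, true_neg=9990, false_pos=10))  # False
```

Both criteria must hold at a given allele frequency for that frequency to count as meeting the LOD.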
INTRODUCTION
Detecting minor genetic variants has become essential to cancer
and infectious disease management. Many have turned to
next-generation sequencing to fill this need, given the common
perception that the limit of detection (LOD) for Sanger sequencing
is somewhere between 15% and 25% [1,2,3]. We have discovered a
software algorithmic solution that reduces this detection limit to 5%
and have demonstrated detection at even lower allele frequencies.
Standard Sanger sequencing protocols can be used, and the
method can generate the familiar electropherogram data display
with noise substantially reduced. This opens up an alternative for
detecting low-level somatic variants.
The key observation that enabled this development is that the noise
underlying Sanger sequencing fluorescence data (traces) appears
to be highly correlated to the primary sequence in the data. Figure
1 shows the electropherograms from two different samples: the
control sample has the same primary sequence as the test sample,
which also contains a few minor variants.
CONCLUSIONS
It should now be possible to achieve a reference-based limit of detection of
5% allelic proportion with standard Sanger sequencing protocols. Existing
protocols for visually reviewing the results can also be used and are
enhanced because the algorithm generates results in the form of familiar
electropherograms for which the noise has been substantially diminished.
These two features of the algorithm may give Sanger sequencing
performance and/or economic advantages in some molecular diagnostic
applications that require finding minor genetic variants.
NOTE: Results on clinical samples can be found at
www.thermofisher.com/sangeroncology. The algorithm has been
embedded within Thermo Fisher Scientific’s Minor Variant Finder software
(www.thermofisher.com/mvf).
REFERENCES
1.  Lin, M.T. et al. (2014), American Journal of Clinical Pathology, June 2014; 141:856-866.
2.  Jancik, S. et al. (2012), Journal of Experimental & Clinical Cancer Research, 2012; 31:79:1-13.
3.  Tsiatis, A.C. et al. (2010), Journal of Molecular Diagnostics, July 2010; 12:4:425-432.
4.  Wang, G. and Guo, L. (2013), Journal of Applied Mathematics, 2013; article 696491.
High Sensitivity Sanger Sequencing for Minor Variant Detection
Thermo Fisher Scientific • 5781 Van Allen Way • Carlsbad, CA 92008 • thermofisher.com
TT27
These are the key steps in the noise minimization algorithm:
a)  for each of the control and test traces, find the range of base
positions where the sequence data quality is consistently high;
b)  find the intersection of the high-sequence-quality ranges between
the control and test sample traces; do the following within that
intersection:
c)  remove the trace components associated with the primary bases
leaving the non-primary traces;
d)  locally expand or contract and/or strengthen or weaken the non-
primary traces of the control sample to maximize correspondence
between the non-primary traces of the control and test samples;
e)  subtract the manipulated non-primary traces from the test sample
traces;
f)  suppress non-primary peaks that are obviously not variant peaks
(set them to zero) based on several peak characteristics such as
amplitude, width, alignment with the primary trace peak, etc.
This process is applied to traces from both forward and reverse
sequencing reactions. The outcome is a pair of noise-minimized traces
(forward and reverse) for the test sample that can be displayed for
review in the familiar electropherogram format. These traces are
passed into the second stage of the algorithm to automatically detect
variants.
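Steps (b), (d), and (e) above can be illustrated with a small sketch. It is a deliberate simplification and not the implementation used in the software: the local expand/contract and strengthen/weaken operations of step (d) are replaced by one global integer shift and one least-squares gain, steps (c) and (f) are omitted, and all function names and trace values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def intersect(a: Tuple[int, int], b: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """Step (b): overlap of two half-open high-quality ranges, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def shift(trace: Sequence[float], lag: int) -> List[float]:
    """Delay a trace by `lag` samples (negative = advance), padding with zeros."""
    n = len(trace)
    return [float(trace[i - lag]) if 0 <= i - lag < n else 0.0 for i in range(n)]


def best_gain(control: Sequence[float], test: Sequence[float]) -> float:
    """Least-squares gain g that minimizes sum((test - g * control) ** 2)."""
    denom = sum(c * c for c in control)
    return sum(c * t for c, t in zip(control, test)) / denom if denom else 0.0


def minimize_noise(control: Sequence[float], test: Sequence[float],
                   max_lag: int = 2) -> List[float]:
    """Steps (d)-(e), simplified: align and scale the control's non-primary
    trace, subtract it from the test's non-primary trace, clip negatives."""
    best_err, best_resid = None, None
    for lag in range(-max_lag, max_lag + 1):
        shifted = shift(control, lag)
        gain = best_gain(shifted, test)
        resid = [t - gain * c for t, c in zip(test, shifted)]
        err = sum(r * r for r in resid)
        if best_err is None or err < best_err:
            best_err, best_resid = err, resid
    return [max(r, 0.0) for r in best_resid]


# Hypothetical non-primary traces in arbitrary units (not real data).
control = [0, 2, 5, 2, 0, 1, 3, 1, 0, 0, 0, 0]
# Test noise: the control pattern delayed by one sample and 1.5x stronger,
# plus one extra peak of height 4 at index 10 standing in for a minor variant.
test = [1.5 * c for c in shift(control, 1)]
test[10] += 4.0

residual = minimize_noise(control, test)
print(residual)  # only the peak at index 10 remains
```

Because the noise in the two traces follows the same pattern, subtracting the aligned control leaves only the peak that the control does not share.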
AUTOMATED VARIANT DETECTION
For variant detection, the forward and reverse noise-minimized traces of
the test sample are examined for any remaining peaks. These peaks
are scrutinized by a set of five interconnected multivariate classification
functions to decide whether there is a bona fide variant at a given
base position and, if so, its base identity. The final thresholds of four of
these functions are optimized for classification accuracy using an
algorithm based on swarm theory [4].
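To make the threshold optimization concrete, here is a minimal sketch of a generic particle swarm optimizer tuning one classifier threshold for accuracy. It is a stand-in for "an algorithm based on swarm theory" and is not necessarily the algorithm of reference 4; the function names, hyperparameters, scores, and labels are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def accuracy(threshold: float, scores: Sequence[float],
             labels: Sequence[bool]) -> float:
    """Fraction of peaks classified correctly when score >= threshold means variant."""
    correct = sum((s >= threshold) == y for s, y in zip(scores, labels))
    return correct / len(scores)


def pso_threshold(scores: Sequence[float], labels: Sequence[bool],
                  n_particles: int = 12, n_iter: int = 40, seed: int = 0,
                  inertia: float = 0.6, c_self: float = 1.5,
                  c_swarm: float = 1.5) -> Tuple[float, float]:
    """Search for the threshold with the highest accuracy; returns (threshold, accuracy)."""
    rng = random.Random(seed)
    lo, hi = min(scores), max(scores)
    # Particles start evenly spaced over the score range, with zero velocity.
    pos: List[float] = [lo + (hi - lo) * i / (n_particles - 1)
                        for i in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                                   # per-particle best position
    best_fit = [accuracy(p, scores, labels) for p in pos]
    g = max(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g], best_fit[g]             # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            # Pull each particle toward its own best and the swarm's best.
            vel[i] = (inertia * vel[i]
                      + c_self * rng.random() * (best_pos[i] - pos[i])
                      + c_swarm * rng.random() * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            fit = accuracy(pos[i], scores, labels)
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], fit
            if fit > g_fit:
                g_pos, g_fit = pos[i], fit
    return g_pos, g_fit


# Hypothetical pre-thresholded classifier scores and true labels (True = variant).
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [False] * 5 + [True] * 5

threshold, acc = pso_threshold(scores, labels)
print(threshold, acc)
```

A swarm method suits this problem because classification accuracy is a step function of the threshold, so gradient-based optimizers have nothing to follow.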
DATA FOR DEVELOPING AND TESTING THE METHOD
Samples came from 22 amplicons associated with eight different
genes: TP53, KRAS, BRAF, EGFR, FLT3, RB1, CDH1, and
ERBB2. Many of the samples were extracted from formalin-fixed,
paraffin-embedded samples. Some were commercially available
reference standards (Acrometrix); others were quantified using the
RNase-P quantitative polymerase chain reaction assay and serially
diluted. Allelic proportions spanned 0.6125% to 50%. These
samples were amplified, sequenced, and pre-processed using
standard protocols and tools for fluorescent dye terminator Sanger
sequencing from Applied Biosystems™.
A third of these data were used for developing the algorithms. Two
thirds were used for evaluating the performance characteristics of
the method.
Figure 1. Noise underlying two different samples looks very
similar when their primary sequences are the same.
[Figure 1 panels: Control Sample; Test Sample with Variants]
Figure 1: Electropherograms from two different physical samples showing the underlying
noise; note the close similarity between the two. Only the bottom 200 relative fluorescence
units (RFUs) are shown. The primary peaks are up at around 1000 RFUs.
A two-part algorithm has been developed to exploit this
observation. The first part minimizes the noise that underlies the
traces. The second part detects variants, if any, in the noise-
minimized traces. This communication describes the algorithmic
details and shows test results.
MATERIALS AND METHODS
NOISE MINIMIZATION
For noise minimization, a model of the noise in the traces of the test
sample is made from traces of the control sample and this model is
subtracted from the traces of the test sample.
The key steps of the variant detection algorithm, including construction
of the classification engine, are as follows:
a)  Compute metrics on trace peaks such as the location of a peak
relative to that of the nearest primary base, symmetry of the peak,
sharpness of a peak relative to that of its nearest primary base,
etc.;
b)  Classify the largest non-primary peaks based on each peak metric
alone to the degree that a peak can be unambiguously classified
in this manner;
c)  For the peaks that cannot be classified in step (b), construct two
discriminant functions, one based on peak metrics that combine
forward and reverse information (x-strand), one based on peak
metrics that do not combine the two (s-strand). Within each of
these two categories, generate discriminant functions for all
possible combinations of metrics belonging to the category and
choose the function with the highest performance;
d)  Use the s-strand classifier to generate additional peak metrics;
e.g., the probability ratio between variant and non-variant peaks
based on s-strand pre-thresholded output;
e)  Make two additional discriminant functions: one for peaks
categorized as variants by the x-strand classifier and the second
for peaks categorized as non-variants by the x-strand classifier.
Metrics of steps (c) and (d) are used to create these discriminant
functions using the feature selection process of (c);
f)  Use a global optimization algorithm (one based on swarm
theory was used) to find optimum final threshold values for the
four discriminant functions of steps (c) and (e).
The classifiers of step (e) deliver the final judgment on whether a
peak is associated with a variant or non-variant. Figure 2
summarizes the complete classification engine.
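The exhaustive feature selection of step (c) can be sketched as follows. The text does not specify the form of the discriminant function, so this sketch assumes a very simple linear one (weights equal to the difference of class means, threshold at the midpoint). The metric names, peak values, and function names are hypothetical, and accuracy is scored on the same peaks used for fitting only to keep the example short.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Peak = Dict[str, float]


def fit_discriminant(peaks: Sequence[Peak], labels: Sequence[bool],
                     metrics: Sequence[str]) -> Tuple[Dict[str, float], float]:
    """Fit a simple linear discriminant on the chosen metrics.
    Assumes both classes (variant and non-variant) are present."""
    variants = [p for p, y in zip(peaks, labels) if y]
    others = [p for p, y in zip(peaks, labels) if not y]

    def mean(group: List[Peak], metric: str) -> float:
        return sum(p[metric] for p in group) / len(group)

    weights = {m: mean(variants, m) - mean(others, m) for m in metrics}
    # Threshold: projection of the midpoint between the two class means.
    threshold = sum(w * (mean(variants, m) + mean(others, m)) / 2
                    for m, w in weights.items())
    return weights, threshold


def predict(peak: Peak, weights: Dict[str, float], threshold: float) -> bool:
    """True means the peak is classified as a variant."""
    return sum(w * peak[m] for m, w in weights.items()) > threshold


def best_subset(peaks: Sequence[Peak], labels: Sequence[bool],
                metrics: Sequence[str]) -> Tuple[float, Tuple[str, ...]]:
    """Try every non-empty combination of metrics; keep the most accurate one
    (ties go to the combination tried first, i.e. the smallest)."""
    best: Tuple[float, Tuple[str, ...]] = (-1.0, ())
    for k in range(1, len(metrics) + 1):
        for subset in combinations(metrics, k):
            weights, threshold = fit_discriminant(peaks, labels, subset)
            hits = sum(predict(p, weights, threshold) == y
                       for p, y in zip(peaks, labels))
            acc = hits / len(peaks)
            if acc > best[0]:
                best = (acc, subset)
    return best


# Hypothetical peak metrics (not real data); True = variant peak.
peaks = [
    {"symmetry": 0.90, "sharpness": 0.5, "offset": 0.1},
    {"symmetry": 0.80, "sharpness": 0.2, "offset": 0.4},
    {"symmetry": 0.85, "sharpness": 0.7, "offset": 0.2},
    {"symmetry": 0.30, "sharpness": 0.6, "offset": 0.3},
    {"symmetry": 0.20, "sharpness": 0.3, "offset": 0.1},
    {"symmetry": 0.40, "sharpness": 0.5, "offset": 0.3},
]
labels = [True, True, True, False, False, False]

print(best_subset(peaks, labels, ["symmetry", "sharpness", "offset"]))
# (1.0, ('symmetry',))
```

With n metrics there are 2^n - 1 combinations to try, so an exhaustive search like this is practical only when each category holds a small number of metrics.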
Figure 2. The classification engine for variant detection.
[Figure 2 flowchart boxes: INPUT DATA (forward control, forward test, reverse control,
reverse test); forward and reverse test traces, noise minimized; single-strand metrics
(peak height, width, sharpness, symmetry, signal to noise, etc.); cross-strand metrics,
which combine forward and reverse information (base complementarity, relative peak
amplitude, relative width, etc.); signal-to-noise outliers are variant candidates;
classifier for clear-cut cases; meta-metrics from the classifier based on single-strand
metrics (variant/non-variant probability ratio, pre-thresholded score, etc.); classifier
based on cross-strand metrics; classifier to override cross-strand variant calls (all
metrics); classifier to override cross-strand non-variant calls (all metrics); OUTPUT
RESULTS (variant locations and base identities)]
Figure 2: Trace data enters at the upper left and detected variants, if any, are reported out at
the bottom. The figure illustrates that the decision-making process is layered so that easy
decisions are made first and only the trace peaks that cannot be clearly classified are
funneled down into the deeper levels of analysis. This allows the classifier at each level to
concentrate on a smaller set of the data which may have a simpler statistical structure.
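The layered decision process can be sketched as a small cascade: a clear-cut rule decides the easy peaks, and only undecided peaks reach the cross-strand classifier and its two override classifiers. Every rule, threshold, metric name, and peak value below is a hypothetical placeholder; the actual classifiers are multivariate discriminant functions.

```python
from typing import Dict, List, Optional

Peak = Dict[str, float]


def clear_cut(peak: Peak) -> Optional[bool]:
    """Level 1: decide only the unambiguous peaks; None means undecided."""
    if peak["snr"] >= 10.0:
        return True
    if peak["snr"] <= 2.0:
        return False
    return None


def cross_strand_call(peak: Peak) -> bool:
    """Level 2: call a variant when forward and reverse amplitudes agree."""
    larger = max(peak["fwd_amp"], peak["rev_amp"])
    return abs(peak["fwd_amp"] - peak["rev_amp"]) <= 0.2 * larger


def confirm_variant_call(peak: Peak) -> bool:
    """Level 3a: may override a cross-strand variant call (False = overridden)."""
    return peak["symmetry"] >= 0.5


def rescue_non_variant_call(peak: Peak) -> bool:
    """Level 3b: may override a cross-strand non-variant call (True = overridden)."""
    return peak["snr"] >= 8.0 and peak["symmetry"] >= 0.8


def classify(peak: Peak) -> bool:
    """Run the cascade; True means the peak is reported as a variant."""
    decision = clear_cut(peak)
    if decision is not None:
        return decision
    if cross_strand_call(peak):
        return confirm_variant_call(peak)
    return rescue_non_variant_call(peak)


# Hypothetical peaks (not real data).
peaks: List[Peak] = [
    {"snr": 15.0, "fwd_amp": 50.0, "rev_amp": 48.0, "symmetry": 0.9},  # clear variant
    {"snr": 1.0, "fwd_amp": 5.0, "rev_amp": 1.0, "symmetry": 0.3},     # clear noise
    {"snr": 5.0, "fwd_amp": 40.0, "rev_amp": 38.0, "symmetry": 0.9},   # strands agree
    {"snr": 5.0, "fwd_amp": 40.0, "rev_amp": 10.0, "symmetry": 0.9},   # strands disagree
]
print([classify(p) for p in peaks])  # [True, False, True, False]
```

Each deeper level sees only the peaks the earlier levels could not settle, which is the funneling the figure caption describes.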
Figure 3 shows results of applying noise minimization to the
forward sequencing orientation of a sample with three variants at
an allele frequency of 1.25%. The central panel shows the traces
before minimization. The process clearly reveals the three variant
peaks. The red marks in the bottom panel indicate where the
automated variant detection algorithm called out variants.
Figure 3. Noise minimized trace example (bottom panel).
[Figure 3 panels: Control Sample; 1.25% Variant Test Sample; 1.25% Test Sample, Noise
Minimized. Annotations: "KB Basecaller misses the variants"; "algorithm finds the variants"]
Figure 3: Noise minimization reveals 1.25% minor variants deeply embedded in the noise underlying
Sanger trace data. The high similarity in the noise between the control (top panel) and test (middle
panel) traces allows much of the noise to be removed (bottom panel).
TABLE 1: Algorithm performance for allele frequencies meeting LOD criteria
Variant Level   Sensitivity   Specificity   Datasets   Total True Variants   Total True Non-variants
5%              95.9%         99.8%         704        785                   229623
10%             98.8%         99.8%         454        503                   163037

FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Introduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxIntroduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptx
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Stages in the normal growth curve
Stages in the normal growth curveStages in the normal growth curve
Stages in the normal growth curve
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
An introduction on sequence tagged site mapping
An introduction on sequence tagged site mappingAn introduction on sequence tagged site mapping
An introduction on sequence tagged site mapping
 

High Sensitivity Sanger Sequencing for Minor Variant Detection

Harrison Leong, Edgar Schreiber, Stephan Berosik, Shiaw-Min Chen, Wallace George, Jeffrey Marks, Stephanie Schneider
ThermoFisher Scientific, Genetic Sciences Division, 200 Oyster Point Blvd., South San Francisco, CA, 94080

RESULTS
Table 1 shows the sensitivity and specificity for allele frequencies of 5% and 10%. Although we have been able to detect variants at allele frequencies of 0.6125%, 1%, 1.25%, 2%, and 2.5%, the algorithm did not meet the LOD criteria of 95% sensitivity and 99% specificity at these extremely low levels.

INTRODUCTION
Detecting minor genetic variants has become essential to cancer and infectious disease management. Many have turned to next generation sequencing to fill this need, given the common perception that the limit of detection (LOD) for Sanger sequencing is somewhere between 15% and 25% [1,2,3]. We have developed an algorithmic software solution that reduces this detection limit to 5% and have demonstrated detection at even lower allele frequencies. Standard Sanger sequencing protocols can be used, and the method can generate the familiar electropherogram data display with noise substantially reduced. This opens up an alternative for detecting low-level somatic variants.

The key observation that enabled this development is that the noise underlying Sanger sequencing fluorescence data (traces) appears to be highly correlated with the primary sequence in the data. Figure 1 shows the electropherograms from two different samples: the control sample has the same primary sequence as the test sample, which contains a few minor variants.

CONCLUSIONS
It should now be possible to achieve a reference-based limit of detection of 5% allelic proportion with standard Sanger sequencing protocols. Existing protocols for visually reviewing the results can also be used and are enhanced because the algorithm generates results in the form of familiar electropherograms in which the noise has been substantially diminished.
These two features of the algorithm may give Sanger sequencing performance and/or economic advantages in some molecular diagnostic applications that require finding minor genetic variants.

NOTE: Results on clinical samples can be found at www.thermofisher.com/sangeroncology. The algorithm has been embedded within ThermoFisher Scientific’s Minor Variant Finder software (www.thermofisher.com/mvf).

REFERENCES
1. Lin, M.T. et al. (2014), American Journal of Clinical Pathology, June 2014; 141:856-866.
2. Jancik, S. et al. (2012), Journal of Experimental & Clinical Cancer Research, 2012; 31:79:1-13.
3. Tsiatis, A.C. et al. (2010), Journal of Molecular Diagnostics, July 2010; 12:4:425-432.
4. Wang, G. and Guo, L. (2013), Journal of Applied Mathematics, 2013; article 696491.

Thermo Fisher Scientific • 5781 Van Allen Way • Carlsbad, CA 92008 • thermofisher.com TT27

These are the key steps in the noise minimization algorithm:
a) for each of the control and test traces, find the range of base positions where the sequence data quality is consistently high;
b) find the intersection of the high-sequence-quality ranges between the control and test sample traces; do the following within that intersection:
c) remove the trace components associated with the primary bases, leaving the non-primary traces;
d) locally expand or contract and/or strengthen or weaken the non-primary traces of the control sample to maximize correspondence between the non-primary traces of the control and test samples;
e) subtract the manipulated non-primary traces from the test sample traces;
f) suppress non-primary peaks that are obviously not variant peaks (set them to zero) based on several peak characteristics such as amplitude, width, alignment with the primary trace peak, etc.

This process is applied to traces from both forward and reverse sequencing reactions.
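As a rough illustration of steps (b), (d), and (e) above, the following Python sketch intersects two quality ranges, scales a control trace to match a test trace, and subtracts it. It is a simplified stand-in, not the poster's implementation: it assumes the inputs are already non-primary traces for a single dye channel, it replaces the local expand/contract and strengthen/weaken adjustment with one global least-squares gain, and all function names are hypothetical.

```python
def intersect_ranges(a, b):
    """Return the overlap of two inclusive (start, end) ranges, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def best_gain(control, test):
    """Least-squares scale factor g that minimizes sum((test - g * control) ** 2)."""
    denom = sum(c * c for c in control)
    if denom == 0:
        return 0.0
    return sum(c * t for c, t in zip(control, test)) / denom


def minimize_noise(control, test, control_range, test_range):
    """Subtract a scaled control trace from the test trace inside the shared range.

    Assumption: `control` and `test` are equally sampled non-primary traces;
    the real method also warps the control trace locally, which is omitted here.
    """
    shared = intersect_ranges(control_range, test_range)
    if shared is None:
        return []
    lo, hi = shared
    c = control[lo:hi + 1]
    t = test[lo:hi + 1]
    g = best_gain(c, t)
    # Negative residuals are clipped to zero in this sketch.
    return [max(0.0, ti - g * ci) for ci, ti in zip(c, t)]


# Toy data: the test trace is a scaled copy of the control plus one extra peak at index 3.
control = [10.0, 20.0, 10.0, 30.0, 10.0, 20.0]
test = [20.0, 40.0, 20.0, 110.0, 20.0, 40.0]
residual = minimize_noise(control, test, (0, 5), (0, 5))
print(residual.index(max(residual)))  # position of the surviving peak
```

A single global gain is the crudest possible noise model; it is used here only to show why a correlated control trace lets most of the background cancel while an uncorrelated peak survives.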
The outcome is noise-minimized forward and reverse traces of the test sample that can be displayed for review in the familiar electropherogram format. These traces are passed into the second stage of the algorithm to automatically detect variants.

AUTOMATED VARIANT DETECTION
For variant detection, the forward and reverse noise-minimized traces of the test sample are examined for any remaining peaks. These peaks are scrutinized by a set of five interconnected multivariate classification functions to decide whether or not there is a bona fide variant at a given base position and, if so, its base identity. The final thresholds of four of these functions are optimized for classification accuracy using an algorithm based on swarm theory [4].

DATA FOR DEVELOPING AND TESTING THE METHOD
Samples came from 22 amplicons associated with eight different genes: TP53, KRAS, BRAF, EGFR, FLT3, RB1, CDH1, and ERBB2. Many of these were extracted from formalin-fixed, paraffin-embedded samples. Some were commercially available reference standards (Acrometrix); others were quantified using the RNase-P quantitative polymerase chain reaction assay and serially diluted. Allelic proportions spanned 0.6125% to 50%. These samples were amplified, sequenced, and pre-processed using standard protocols and tools for fluorescent dye terminator Sanger sequencing from Applied Biosystems™. A third of these data were used for developing the algorithms. Two thirds were used for evaluating the performance characteristics of the method.

Figure 1. Noise underlying two different samples looks very similar when their primary sequences are the same.
[Panels: Control Sample; Test Sample with Variants]
Figure 1: Electropherograms from two different physical samples showing the underlying noise; note the close similarity between the two. Only the bottom 200 relative fluorescence units (RFUs) are shown. The primary peaks are up at around 1000 RFUs.

A two-part algorithm has been developed to exploit this observation.
The first part minimizes the noise that underlies the traces. The second part detects variants, if any, in the noise-minimized traces. This communication describes the algorithmic details and shows test results.

f) Use a global optimization algorithm (one based on swarm theory was used) to find optimum final threshold values for the four discriminant functions of steps (c) and (e).

The classifiers of step (e) deliver the final judgment on whether a peak is associated with a variant or non-variant. Figure 2 summarizes the complete classification engine.

MATERIALS AND METHODS

NOISE MINIMIZATION
For noise minimization, a model of the noise in the traces of the test sample is made from traces of the control sample and this model is subtracted from the traces of the test sample.

The key steps of the variant detection algorithm, including construction of the classification engine, are as follows:
a) Compute metrics on trace peaks such as the location of a peak relative to that of the nearest primary base, symmetry of the peak, sharpness of a peak relative to that of its nearest primary base, etc.;
b) Classify the largest non-primary peaks based on each peak metric alone to the degree that a peak can be unambiguously classified in this manner;
c) Of those that cannot be classified in step (b), construct two discriminant functions, one based on peak metrics that combine forward and reverse information (x-strand, i.e., cross-strand), one based on peak metrics that do not combine the two (s-strand, i.e., single-strand).
Within each of these two categories, generate discriminant functions for all possible combinations of metrics belonging to the category and choose the function with the highest performance;
d) Use the s-strand classifier to generate additional peak metrics; e.g., the probability ratio between variant and non-variant peaks based on s-strand pre-thresholded output;
e) Make two additional discriminant functions: one for peaks categorized as variants by the x-strand classifier and the second for peaks categorized as non-variants by the x-strand classifier. Metrics of steps (c) and (d) are used to create these discriminant functions using the feature selection process of (c).

Figure 2. The classification engine for variant detection.
[Flowchart box labels:]
INPUT DATA: Forward control, Forward test, Reverse control, Reverse test
Fwd and Rvs test, noise minimized
Single-strand metrics: peak height, width, sharpness, symmetry, signal to noise, etc.
Cross-strand metrics (combined fwd rvs information): base complementarity, relative peak amplitude, relative width, etc.
Signal to noise outliers are variant candidates
Classifier for clear cut cases
Classifier based on cross-strand metrics
Classifier to override cross-strand variant calls (all metrics)
Classifier to override cross-strand non-variant calls (all metrics)
OUTPUT RESULTS: Variant locations and base identities
Meta-metrics from classifier based on single-strand metrics: var/non-var probability ratio, pre-thresholded score, etc.

Figure 2: Trace data enters at the upper left and detected variants, if any, are reported out at the bottom. The figure illustrates that the decision-making process is layered so that easy decisions are made first and only the trace peaks that cannot be clearly classified are funneled down into the deeper levels of analysis. This allows the classifier at each level to concentrate on a smaller set of the data which may have a simpler statistical structure.
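The layered decision process described for Figure 2 can be sketched in Python as a small cascade. This is only an illustration of the control flow: the three metrics, the threshold values, and every name below are hypothetical placeholders, whereas the poster derives its discriminant functions from feature selection and tunes the thresholds with a swarm-based optimizer.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    snr: float             # signal-to-noise of the residual peak (hypothetical metric)
    cross_score: float     # output of the cross-strand discriminant (hypothetical)
    override_score: float  # output of the all-metrics override discriminant (hypothetical)


# Illustrative thresholds only; none of these values come from the poster.
CLEAR_NOISE_SNR = 2.0
CLEAR_VARIANT_SNR = 20.0
CROSS_THRESHOLD = 0.5
OVERRIDE_VARIANT_THRESHOLD = 0.2     # below this, a cross-strand variant call is overturned
OVERRIDE_NONVARIANT_THRESHOLD = 0.9  # above this, a cross-strand non-variant call is overturned


def classify(peak):
    """Return (is_variant, deciding_stage) for one candidate peak."""
    # Level 1: clear-cut cases are decided on a single metric.
    if peak.snr < CLEAR_NOISE_SNR:
        return False, "clear-cut"
    if peak.snr > CLEAR_VARIANT_SNR:
        return True, "clear-cut"
    # Level 2: the cross-strand discriminant makes a provisional call.
    provisional = peak.cross_score >= CROSS_THRESHOLD
    # Level 3: a separate override classifier exists for each provisional outcome.
    if provisional and peak.override_score < OVERRIDE_VARIANT_THRESHOLD:
        return False, "override-variant"
    if not provisional and peak.override_score > OVERRIDE_NONVARIANT_THRESHOLD:
        return True, "override-non-variant"
    return provisional, "cross-strand"


print(classify(Peak(snr=10.0, cross_score=0.7, override_score=0.1)))
```

Only peaks that survive the clear-cut stage reach the deeper classifiers, which mirrors the funnel the figure caption describes.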
Figure 3 shows results of applying noise minimization to the forward sequencing orientation of a sample with three variants at an allele frequency of 1.25%. The central panel shows the traces before minimization. The process clearly reveals the three variant peaks. The red marks in the bottom panel indicate where the automated variant detection algorithm called out variants.

Figure 3. Noise minimized trace example (bottom panel).
[Panel labels: Control Sample; 1.25% Variant Test Sample; 1.25% Test Sample, Noise Minimized. Annotations: "KB Basecaller misses the variants"; "algorithm finds the variants".]
Figure 3: Noise minimization reveals 1.25% minor variants deeply embedded in the noise underlying Sanger trace data. The high similarity in the noise between the control (top panel) and test (middle panel) traces allows much of the noise to be removed (bottom panel).

TABLE 1: Algorithm performance for allele frequencies meeting LOD criteria

Variant Level   Sensitivity   Specificity   Datasets   Total True Variants   Total True Non-variants
5%              95.9%         99.8%         704        785                   229623
10%             98.8%         99.8%         454        503                   163037
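To make the Table 1 percentages more concrete, the Python sketch below converts them into approximate counts and checks them against the stated LOD criteria (95% sensitivity, 99% specificity). The counts are back-calculated from rounded percentages, so they are estimates rather than figures reported on the poster, and the dictionary layout and function names are assumptions made for illustration.

```python
# Values copied from Table 1; the key names are illustrative.
TABLE_1 = {
    "5%": {"sensitivity": 0.959, "specificity": 0.998, "variants": 785, "non_variants": 229623},
    "10%": {"sensitivity": 0.988, "specificity": 0.998, "variants": 503, "non_variants": 163037},
}


def approximate_counts(row):
    """Estimate detected, missed, and falsely called positions from rounded rates."""
    detected = round(row["sensitivity"] * row["variants"])
    false_calls = round((1 - row["specificity"]) * row["non_variants"])
    return {
        "detected": detected,
        "missed": row["variants"] - detected,
        "false_calls": false_calls,
    }


def meets_lod(row, min_sensitivity=0.95, min_specificity=0.99):
    """Check a row against the LOD criteria stated in the RESULTS section."""
    return row["sensitivity"] >= min_sensitivity and row["specificity"] >= min_specificity


for level, row in TABLE_1.items():
    print(level, approximate_counts(row), meets_lod(row))
```

Because specificity is reported to only one decimal place, the estimated number of false calls could plausibly differ from the true value by a hundred or more positions.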