Benchmarking with Genome In A Bottle
GIAB Improves Confidence in Genome Sequencing and Variant Calling
• Reference Materials
• Characterizations (Benchmark Sets)
• Reference Data
• Benchmarking Methods
Genome Sequencing and Variant Calling
GIAB Reference Materials
GIAB has characterized variants in 7 human genomes
• HG001* (NA12878)
• AJ Trio: HG002*, HG003*, HG004*
• Chinese Trio: HG005*, HG006, HG007
*NIST RMs developed from large batches of DNA
GIAB Reference Data
Public Data Sources
• NIH Hosted FTP Site https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/
• NIH SRA https://www.ncbi.nlm.nih.gov/bioproject/200694
• HPRC S3 Bucket https://github.com/human-pangenomics/HG002_Data_Freeze_v1.0
GIAB Data Indexes on Github
https://github.com/genome-in-a-bottle/giab_data_indexes
Work In Progress - Data Registry
• Queryable database with pointers to publicly available GIAB data along with summary statistics
• Data types: Sample, FASTQs, BAMs, VCFs
• Capturing methods and linking datasets for data provenance
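The registry is described as work in progress, so its schema is not given in these slides. The following is only a minimal sketch of the idea: the `RegistryEntry` fields, the `find` and `provenance` helpers, and the example.org URLs are all hypothetical, chosen to show how pointers to data, a methods note, and links between datasets could be queried together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical registry record; field names are assumptions, not the GIAB schema."""
    sample: str                 # e.g. "HG002"
    data_type: str              # "FASTQ", "BAM" or "VCF"
    url: str                    # pointer to the publicly available file
    method: str                 # how the dataset was produced
    derived_from: tuple = ()    # URLs of the datasets this one was made from


def find(entries, sample=None, data_type=None):
    """Return the entries matching the given sample and/or data type."""
    return [
        e for e in entries
        if (sample is None or e.sample == sample)
        and (data_type is None or e.data_type == data_type)
    ]


def provenance(entry, entries):
    """Follow derived_from links back to the source datasets."""
    by_url = {e.url: e for e in entries}
    chain, stack = [], list(entry.derived_from)
    while stack:
        parent = by_url.get(stack.pop())
        if parent is not None:
            chain.append(parent)
            stack.extend(parent.derived_from)
    return chain


# Placeholder records; the URLs are not real GIAB locations.
fastq = RegistryEntry("HG002", "FASTQ", "https://example.org/hg002.fastq.gz", "sequencing run")
bam = RegistryEntry("HG002", "BAM", "https://example.org/hg002.bam", "read alignment", (fastq.url,))
vcf = RegistryEntry("HG002", "VCF", "https://example.org/hg002.vcf.gz", "variant calling", (bam.url,))
registry = [fastq, bam, vcf]

print([e.data_type for e in provenance(vcf, registry)])
```

Storing the parent links as URLs keeps each record flat, which would map onto a single database table, at the cost of a lookup when walking the provenance chain.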
GIAB Characterizations
Small Variant Integration Process
Design of GIAB benchmark
[Figure: variants from any method are compared with the Benchmark Variants inside the Benchmark Regions, for GRCh37 and GRCh38.]
• Matching variants assumed true positives
• Reliably identifies false positives
• Reliably identifies false negatives
• Variants not assessed (outside the Benchmark Regions)
Reliable IDentification of Errors (RIDE)
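The benchmark design above can be sketched in a few lines of code. This is an illustration under simplifying assumptions, not the comparison that benchmarking tools actually perform: variants are hypothetical `(chrom, pos, ref, alt)` tuples with 0-based positions, regions are half-open intervals, and matching is by exact tuple equality rather than by haplotype or genotype comparison.

```python
from bisect import bisect_right


def in_regions(regions, chrom, pos):
    """True if pos falls in a benchmark region.

    regions maps chrom -> sorted list of non-overlapping (start, end)
    intervals, 0-based and half-open.
    """
    intervals = regions.get(chrom, [])
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def classify(query, benchmark, regions):
    """Split query and benchmark variants into TP, FP, FN and not assessed."""
    q_in = {v for v in query if in_regions(regions, v[0], v[1])}
    b_in = {v for v in benchmark if in_regions(regions, v[0], v[1])}
    return {
        "TP": q_in & b_in,                 # matching variants, assumed true positives
        "FP": q_in - b_in,                 # query-only variants inside the regions
        "FN": b_in - q_in,                 # benchmark variants the query missed
        "not_assessed": set(query) - q_in, # query variants outside the regions
    }


# Made-up coordinates for illustration only.
regions = {"chr1": [(100, 200), (300, 400)]}
benchmark = [("chr1", 150, "A", "G"), ("chr1", 350, "C", "T")]
query = [("chr1", 150, "A", "G"), ("chr1", 180, "G", "C"), ("chr1", 250, "T", "A")]

for label, variants in classify(query, benchmark, regions).items():
    print(label, sorted(variants))
```

Restricting both call sets to the benchmark regions before comparing is what lets query-only calls be counted as false positives: outside those regions the benchmark makes no claim, so such calls are left unassessed.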
v4.2.1 Small Variant Benchmark used Long and Linked Reads
Reference Build  Benchmark Set  Reference Coverage (%)  SNVs       Indels   Base pairs in Seg Dups and low mappability
GRCh37           v3.3.2         87.8                    3,048,869  464,463  57,277,670
GRCh37           v4.2.1         94.1                    3,353,881  522,388  133,848,288
GRCh38           v3.3.2         85.4                    3,030,495  475,332  65,714,199
GRCh38           v4.2.1         92.2                    3,367,208  525,545  145,585,710
Wagner et al, https://doi.org/10.1101/2020.07.24.212712
Structural Variant Benchmark Set
Zook, J.M., Hansen, N.F., Olson, N.D. et al. A robust benchmark for detection of germline large deletions and
insertions. Nat Biotechnol 38, 1347–1355 (2020). https://doi.org/10.1038/s41587-020-0538-8
GIAB Benchmarking Methods
Small Variant Benchmarking Highlights (TLDR)
• Best practices for benchmarking germline variant calling: https://rdcu.be/bVtIF
  Supplemental Table 2 summarizes best practices
• Hap.py - best practices implementation
  Command line - https://github.com/Illumina/hap.py
  Graphical interface - https://precision.fda.gov/
• HappyR - R package for hap.py results
  Github https://github.com/Illumina/happyR
www.slideshare.net/genomeinabottle
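For the command-line route, a small wrapper can assemble the hap.py invocation. This is a sketch that assumes the commonly documented hap.py options (`-f` for the benchmark regions BED, `-r` for the reference FASTA, `-o` for the output prefix, `--stratification` for a stratification TSV); the function name and all file names are placeholders, and the options should be checked against `hap.py --help` for the installed version.

```python
import shlex


def happy_command(truth_vcf, query_vcf, benchmark_bed, reference_fasta,
                  out_prefix, stratification_tsv=None):
    """Build the argument list for a hap.py run (hypothetical helper)."""
    cmd = [
        "hap.py", truth_vcf, query_vcf,
        "-f", benchmark_bed,      # benchmark regions: calls outside are not assessed
        "-r", reference_fasta,    # reference the VCFs were called against
        "-o", out_prefix,         # prefix for the output files
    ]
    if stratification_tsv is not None:
        cmd += ["--stratification", stratification_tsv]
    return cmd


# Placeholder file names; substitute the benchmark files for the sample and reference in use.
cmd = happy_command("benchmark.vcf.gz", "query.vcf.gz", "benchmark.bed",
                    "reference.fa", "happy_out", "stratifications.tsv")
print(shlex.join(cmd))
# To run it where hap.py is installed: subprocess.run(cmd, check=True)
```

Returning a list rather than a shell string avoids quoting problems when paths contain spaces.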
Benchmarking Process
Best Practices Summary
• Benchmark Sets
• Stringency of variant comparison
• Variant comparison tools
• Manual Curation
• Metric Interpretation
• Stratifications
• Confidence Intervals
• Additional Benchmarking Approaches
Applying Best Practices
Best Practices for Benchmarking Small Variants
https://github.com/ga4gh/benchmarking-tools
Paper: https://rdcu.be/bqpDT
https://precision.fda.gov/
Stratified Performance Metrics
• Plot metric on a phred scale for better separation of metric values > 99%.
• Precision = TP/(TP + FP)
• Recall = TP/(TP + FN)
• Confidence intervals indicate uncertainty and help account for differences in number of variants per stratification.
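The metrics above can be computed as follows. This is a minimal sketch: the slides do not say which confidence interval method is used, so the Wilson score interval here is an assumption chosen for illustration, and the counts in the usage example are made up.

```python
import math


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def phred(metric: float) -> float:
    """Phred-scale a metric in [0, 1]: -10 * log10(1 - metric).

    0.99 maps to about 20 and 0.999 to about 30, which spreads out
    values above 99% that look identical on a linear axis.
    """
    if metric >= 1.0:
        return math.inf
    return -10.0 * math.log10(1.0 - metric)


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (assumed method, z=1.96 for ~95%)."""
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Two hypothetical stratifications with the same precision but different sizes.
for name, tp, fp in [("large stratum", 99900, 100), ("small stratum", 999, 1)]:
    p = precision(tp, fp)
    low, high = wilson_interval(tp, tp + fp)
    print(f"{name}: precision={p:.4f} phred={phred(p):.1f} CI=({low:.4f}, {high:.4f})")
```

Both strata have the same point estimate, but the interval for the small one is much wider, which is why the interval matters when comparing stratifications with very different variant counts.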
[Figure: Precision and Recall for INDEL and SNP by Genomic Context, plotted as Metric (% phred scale, ticks at 99, 99.9, 99.99). Legend: GIAB ID (HG003, HG004); Stratification Type (all, notin).]
Pairwise callset comparison
[Figure: Precision and Recall for INDEL and SNP, DeepVariant_PacBio versus DeepVariant_ILL, on phred-scaled axes (0 to 99.99). Points are labeled by stratification and colored by strat_group (All Diff, LowComplexity, Map and SegDups, mappability, Other Diff, SegDups, NA).]
(Optional) Optimization - Identifying biases responsible for performing stratifications.
Benchmarking Take Home Messages
• Krusche et al. (URL) is a great resource for germline small variant benchmarking.
• Appropriate data visualizations are critical to interpreting benchmarking results.
• Use manual curation to evaluate benchmarking results.
• Resources are available for benchmarking small and structural variants against GRCh37 and GRCh38.
Collaborating with FDA to use GIAB benchmark to inspire new methods
https://precision.fda.gov/challenges/10
Challenge Results
• Received 64 submissions from 20 participants
• Most submissions used deep-learning-based variant-calling methods
• Submissions using multiple technologies outperformed single technology submissions
• Submission performance varied by genomic stratification
[Figure: F1 % (phred scale, ticks at 0, 90, 99, 99.9) by Genomic Regions (Difficult-to-Map Regions, All Benchmark Regions, MHC) and Technology (ILLUMINA, MULTI, ONT, PACBIO). Labeled submitters: Sentieon, Roche Sequencing Solutions, The Genomics Team in Google Health, DRAGEN, Seven Bridges Genomics, The UCSC CGL and Google Health, Wang Genomics Lab.]
Results Cont'd
• Updated stratifications enable comparison of method strengths
• Graph-based variant calling enables high accuracy of short read variant calls in the difficult MHC region.
• Improved benchmark sets and stratifications reveal significant progress in DNA sequencing and variant calling since the 2016 challenge
Future of Genome In A Bottle
DEvelopment Framework for Assembly Based Benchmarks (DEFRABB)
Developing benchmarks on new references using assemblies
• Telomere-to-Telomere Consortium generated a new reference T2T-CHM13
• Developed CMRG benchmark on T2T-CHM13 using the diploid assembly of HG002 similar to benchmarks on GRCh37 and GRCh38
Assembly-Based Benchmark Process
• Minimap2 for Assembly-Assembly alignment
• Variants called and diploid assembled regions identified using dipcall v0.3
• VCF formatting and modifications for use in benchmarking.
• Exclude regions from dip.bed (assembled regions) that are problematic for small variant calling and comparison due to SVs and gaps in reference or alignment
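The exclusion step above amounts to subtracting one set of intervals from another. The sketch below shows that operation on in-memory intervals; it is not the DEFRABB implementation. The function names and coordinates are hypothetical, and intervals are assumed to be sorted, non-overlapping, and half-open as in BED files.

```python
def subtract_intervals(keep, exclude):
    """Remove exclude from keep on a single chromosome.

    Both inputs are sorted lists of non-overlapping (start, end) intervals,
    0-based and half-open.
    """
    result = []
    j = 0
    for start, end in keep:
        cur = start
        # Skip exclusions that end before this interval begins.
        while j < len(exclude) and exclude[j][1] <= cur:
            j += 1
        k = j
        # Walk the exclusions that overlap this interval.
        while k < len(exclude) and exclude[k][0] < end:
            ex_start, ex_end = exclude[k]
            if ex_start > cur:
                result.append((cur, ex_start))
            cur = max(cur, ex_end)
            k += 1
        if cur < end:
            result.append((cur, end))
    return result


def benchmark_regions(assembled, problematic):
    """Apply the subtraction per chromosome; both arguments map chrom -> intervals."""
    return {
        chrom: subtract_intervals(intervals, problematic.get(chrom, []))
        for chrom, intervals in assembled.items()
    }


# Made-up coordinates: assembled regions standing in for dip.bed, and
# regions to exclude (for example around SVs or gaps).
assembled = {"chr1": [(0, 100), (200, 300)], "chr2": [(0, 50)]}
problematic = {"chr1": [(10, 20), (90, 210), (250, 260)]}
print(benchmark_regions(assembled, problematic))
```

The two-pointer walk keeps the subtraction linear in the number of intervals, which matters little for this toy input but is the usual approach for genome-scale BED files.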
Take-home messages
• Reference materials available for 5 individuals
• Small variant benchmark sets for 7 individuals for GRCh37 and GRCh38, SV benchmark for one individual for GRCh37
• Best practices established for small variant benchmarking
• Current efforts focus on developing small variant and structural variant benchmark sets using diploid assemblies
Acknowledgment of many GIAB contributors
• Government
• Clinical Laboratories
• Academic Laboratories
• Bioinformatics developers
• NGS technology developers
• Reference samples
* Funders
Interested in getting involved?
• www.genomeinabottle.org - sign up for general GIAB and Analysis Team google groups
• GIAB slides: www.slideshare.net/genomeinabottle
• Public, Unembargoed Data: github.com/genome-in-a-bottle
• We are hiring! Data Manager, Machine learning, diploid assembly, cancer genomes, data science, other ‘omics, …

Benchmarking with GIAB 220907
