GIAB Analysis Team: SNP/indel update and SV comparisons
Justin Zook
January 28, 2016
Overview of SNP/indel integration process
• Find sensitive variant calls and callable regions for each dataset
• Find “consensus” calls with support from 2+ technologies (and no other technologies disagree)
• Use “consensus” calls to train a simple one-class model for each dataset and find “outliers” that are less trustworthy for each dataset
• Find high-confidence calls by using callable regions and “outliers” to arbitrate between datasets when they disagree
• Find high-confidence regions by taking the union of callable regions and subtracting uncertain variants and difficult regions
Not yet finalized
Most useful to others to make easier to use?
Not yet on DNAnexus
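The consensus and region steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GIAB pipeline itself: the technology names, the toy calls, the half-open interval representation, and the rule that a technology "disagrees" when it reports a different genotype or is callable at a site but makes no call are all hypothetical. The one-class outlier model is left out entirely.

```python
from collections import defaultdict

def in_regions(regions, pos):
    """True if pos falls in any half-open (start, end) interval."""
    return any(start <= pos < end for start, end in regions)

def consensus_calls(calls_by_tech, callable_by_tech, min_support=2):
    """Keep variants called identically by >= min_support technologies.

    Assumption: a technology disagrees if it reports another genotype,
    or if the site is callable for it and it reports no call there.
    """
    support = defaultdict(dict)  # variant -> {technology: genotype}
    for tech, calls in calls_by_tech.items():
        for variant, genotype in calls.items():
            support[variant][tech] = genotype
    consensus = {}
    for variant, by_tech in support.items():
        pos = variant[0]
        genotypes = set(by_tech.values())
        silent_but_callable = [
            tech for tech in calls_by_tech
            if tech not in by_tech and in_regions(callable_by_tech[tech], pos)
        ]
        if len(by_tech) >= min_support and len(genotypes) == 1 and not silent_but_callable:
            consensus[variant] = next(iter(genotypes))
    return consensus

def merge(intervals):
    """Union of half-open intervals, returned sorted and non-overlapping."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]

def subtract(intervals, excluded):
    """Remove the excluded intervals from the union of intervals."""
    excluded = merge(excluded)
    result = []
    for start, end in merge(intervals):
        cursor = start
        for ex_start, ex_end in excluded:
            if ex_end <= cursor or ex_start >= end:
                continue
            if ex_start > cursor:
                result.append((cursor, ex_start))
            cursor = max(cursor, ex_end)
        if cursor < end:
            result.append((cursor, end))
    return result

# Toy single-chromosome data; variants are (pos, ref, alt) -> genotype.
calls_by_tech = {
    "techA": {(100, "A", "G"): "0/1", (200, "C", "T"): "1/1", (300, "G", "A"): "0/1"},
    "techB": {(100, "A", "G"): "0/1", (200, "C", "T"): "0/1"},
    "techC": {(100, "A", "G"): "0/1"},
}
callable_by_tech = {
    "techA": [(0, 400)],
    "techB": [(0, 250)],
    "techC": [(50, 150), (280, 350)],
}

consensus = consensus_calls(calls_by_tech, callable_by_tech)

# Union of callable regions minus uncertain variants and a difficult region.
all_callable = [iv for regions in callable_by_tech.values() for iv in regions]
uncertain_sites = [(200, 201), (300, 301)]
difficult_regions = [(350, 400)]
high_conf_regions = subtract(all_callable, uncertain_sites + difficult_regions)
```

In this toy case only the variant at position 100 survives as a consensus call: the call at 200 has conflicting genotypes, and the call at 300 has a single supporting technology.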
Preliminary comparisons to 2.19 on chr20 for NA12878
V2.19
• Bases in 2.19 bed: 51.8 Mbp
• Total calls: 73412 (9k indels; 2k in homopolymers >10)
• Total calls in 2.19 bed: 73412 (9k indels)
• Concordant calls: 71669
• Concordant in 3.0 bed: 67657
• FPs in both beds: 0
• FNs in both beds: 9
• Genotype errors: 0
• Allele errors: 1
• Extra calls outside 3.0 bed: 1674 (708 SNPs)
V3.0
• Bases in 3.0 bed: 51.1 Mbp
• Total calls: 86886 (13k indels; 4k in homopolymers >10)
• Total calls in 3.0 bed: 70202 (7.4k indels); 2 platforms: 64617 (5k indels)
• Concordant calls: 71669
• Concordant in 3.0 bed: 67657
• FPs in both beds: 0
• FNs in both beds: 0
• Genotype errors: 3
• Allele errors: 0
• Extra calls outside 2.19 bed: 2463 (1285 SNPs)
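For readers unfamiliar with this kind of tally, the sketch below shows one way categories like those above might be counted when two callsets are compared only at sites inside both BED files. It is an assumption-laden illustration, not the comparison tool used for the slide: the function name, the one-call-per-position simplification, the neutral `only_a` / `only_b` labels, and the toy data are all hypothetical.

```python
def in_regions(regions, pos):
    """True if pos falls in any half-open (start, end) interval."""
    return any(start <= pos < end for start, end in regions)

def compare_callsets(calls_a, calls_b, bed_a, bed_b):
    """Classify each called position; calls map pos -> (ref, alt, genotype)."""
    counts = {
        "concordant": 0,
        "genotype_errors": 0,
        "allele_errors": 0,
        "only_a": 0,
        "only_b": 0,
        "outside_shared_beds": 0,
    }
    for pos in sorted(set(calls_a) | set(calls_b)):
        # Only sites covered by both BEDs are compared.
        if not (in_regions(bed_a, pos) and in_regions(bed_b, pos)):
            counts["outside_shared_beds"] += 1
            continue
        a, b = calls_a.get(pos), calls_b.get(pos)
        if a is None:
            counts["only_b"] += 1
        elif b is None:
            counts["only_a"] += 1
        elif a[:2] != b[:2]:
            counts["allele_errors"] += 1    # same site, different alleles
        elif a[2] != b[2]:
            counts["genotype_errors"] += 1  # same alleles, different genotype
        else:
            counts["concordant"] += 1
    return counts

# Toy data on a single chromosome (hypothetical values).
calls_a = {
    100: ("A", "G", "0/1"),
    150: ("T", "A", "0/1"),
    200: ("C", "T", "1/1"),
    300: ("G", "A", "0/1"),
    500: ("T", "C", "0/1"),
}
calls_b = {
    100: ("A", "G", "0/1"),
    200: ("C", "T", "0/1"),
    300: ("G", "C", "0/1"),
    400: ("A", "T", "1/1"),
}
bed_a = [(0, 450)]
bed_b = [(50, 700)]

counts = compare_callsets(calls_a, calls_b, bed_a, bed_b)
```

Whether a call seen in only one callset counts as a false positive or a false negative depends on which callset is treated as the reference, which is why the sketch keeps the labels neutral.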
How can we add more difficult calls/regions to our high-confidence set?
• Develop new method
• Confirm subset of calls (e.g., manual inspection of multiple datasets/individuals or targeted experimental validation)
• Make calls available for community curation
• Others compare and submit feedback about curation results
• NIST critically evaluates results for integration into the GIAB callsets
Potential Breakout discussions
• Benchmarking SVs
– Can we use existing tools?
– What should the performance metrics be?
– How stringent is the matching (e.g., correct type, correct size, correct breakpoints, correct sequence)?
• Confirmation/Validation of SVs
– Design questions for manual inspectors
– When is targeted experimental validation needed/useful?
– Randomly selected vs. stratified by size/type/difficulty…
• How to establish benchmark SVs?
– How many levels of confidence?
• Can we establish confident regions that do not have SVs in order to assess FP rates?
• Ideas for a hackathon adjacent to August workshop
– Manual curation of SVs and other difficult variants/regions
– SV benchmarking tools
– Manual curation tools for SNPs, indels, and/or SVs
– …
• Other ideas?
Actual Breakout discussions
• What criteria should we use to decide when 2 SVs should be considered to be the “same” and merged?
– For establishing the benchmark
– When comparing to the benchmark
• How should we confirm/validate candidate SV calls and establish benchmark SVs?
– Design questions for manual inspectors
– When is targeted experimental validation needed/useful?
– Randomly selected vs. stratified by size/type/difficulty…
– How many levels of confidence?
• How can we utilize new sophisticated variant comparison tools to improve our benchmark SNP/indel callsets, and how can we develop high-confidence calls for GRCh38?
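
To make the first question concrete, here is a hedged sketch of what a “same SV” test could look like if it checked type, size, and breakpoints. The `SV` record, the `same_sv` function, and the default thresholds are hypothetical choices for discussion, not criteria agreed in the breakout; sequence-level matching is not attempted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SV:
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "INS", "DUP"
    size: int    # length of the affected or inserted sequence

def same_sv(a, b, max_breakpoint_dist=100, min_size_ratio=0.8, require_type=True):
    """Return True if two SV calls match under the given (hypothetical) thresholds."""
    if a.chrom != b.chrom:
        return False
    if require_type and a.svtype != b.svtype:
        return False
    # Both breakpoints must lie within the allowed distance of each other.
    if abs(a.start - b.start) > max_breakpoint_dist:
        return False
    if abs(a.end - b.end) > max_breakpoint_dist:
        return False
    # Sizes must be similar: smaller / larger >= min_size_ratio.
    smaller, larger = sorted((a.size, b.size))
    return larger > 0 and smaller / larger >= min_size_ratio

call_1 = SV("chr20", 1000, 1500, "DEL", 500)
call_2 = SV("chr20", 1030, 1520, "DEL", 490)   # nearby breakpoints, similar size
call_3 = SV("chr20", 1000, 1500, "DUP", 500)   # same coordinates, different type
call_4 = SV("chr20", 1000, 1900, "DEL", 900)   # far end breakpoint, larger size

strict_match = same_sv(call_1, call_2)
type_mismatch = same_sv(call_1, call_3)
type_ignored = same_sv(call_1, call_3, require_type=False)
size_mismatch = same_sv(call_1, call_4)
```

Exposing the thresholds as parameters reflects the two sub-questions: merging calls to establish the benchmark and comparing a query callset to the benchmark could reasonably use different stringency.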
