GIAB SV Data Jamboree @ NIST
PBHoney Spots Update
Will Salerno
9.15.2016
PBSuite
http://sourceforge.net/projects/pb-jelly/
Honey Spots
● Honey Spots is the “indel” caller for Long-Read SV detection
○ Tails is “split read”
● Designed for smaller SVs
○ 50 bp to 2 Kbp
● Two components
○ SpotCaller: Discover putative SVs
○ ConsensusCaller: Evaluate SVs
Workflow: Align Reads & Create Error Channels → Process Signal & Create “Spots” → Identify Alt Reads & Create Consensus Sequence → Remap Consensus & Call SV
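The first two workflow steps (error channels, then spots) can be pictured with a toy example. This is a minimal sketch and not PBHoney's implementation: the read representation, the names `error_channels` and `find_spots`, and the simple rate threshold are assumptions made for illustration, and the real signal processing is likely more involved.

```python
from typing import List, Tuple

# Hypothetical read representation: (reference start, CIGAR as (op, length) pairs).
Read = Tuple[int, List[Tuple[str, int]]]


def error_channels(reads: List[Read], ref_len: int):
    """Accumulate per-position coverage, inserted bases, and deleted bases."""
    cov = [0] * ref_len
    ins = [0] * ref_len
    dele = [0] * ref_len
    for start, cigar in reads:
        pos = start
        for op, length in cigar:
            if op in ("M", "D"):
                # Both ops consume reference; D also marks a deletion error.
                for p in range(pos, min(pos + length, ref_len)):
                    cov[p] += 1
                    if op == "D":
                        dele[p] += 1
                pos += length
            elif op == "I" and 0 <= pos < ref_len:
                # Insertions consume no reference; charge them to the next base.
                ins[pos] += length
    return cov, ins, dele


def find_spots(channel, cov, min_rate=0.5, min_cov=2):
    """Return half-open (start, end) runs where the error rate is high."""
    spots, start = [], None
    for p, (err, depth) in enumerate(zip(channel, cov)):
        hot = depth > 0 and depth >= min_cov and err / depth >= min_rate
        if hot and start is None:
            start = p
        elif not hot and start is not None:
            spots.append((start, p))
            start = None
    if start is not None:
        spots.append((start, len(channel)))
    return spots


# Toy data: two of three reads carry a 4 bp deletion at positions 8..11.
toy_reads = [
    (0, [("M", 8), ("D", 4), ("M", 8)]),
    (0, [("M", 8), ("D", 4), ("M", 8)]),
    (0, [("M", 20)]),
]
cov, ins, dele = error_channels(toy_reads, ref_len=20)
deletion_spots = find_spots(dele, cov)
```

The toy deletion is far below the 50 bp lower bound quoted above; it is kept small only so the arrays are easy to inspect by hand.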
Honey Spots Update
● Performance Optimizations
○ NA12878 (41x): runtime reduced from 2 days to 6 hours
● Less restrictive filtering
○ More sensitive calling
● “Tails” can contribute to Spots signal
Workflow: Align Reads & Create Error Channels → Process Signal & Create “Spots” → Identify Alt Reads & Create Consensus Sequence → Remap Consensus & Report All Spots
Tails Calls <10 Kbp
Existing Data Sets
Sample      AJ Proband  AJ Mother  AJ Father  NA12878  HS1011
Coverage    45x         19x        21x        41x      23x
● Eight short-read SV detection methods
● PBHoney (old version)
● 10x PacBio, 48x Short-Read, BioNano, aCGH
Honey Spots Performance
Sample   SVType  SizeDist      Count  TruthSet Calls  TruthSet Recovered  Recovery Rate
HS1011   INS     (50, 100)     6,459  113             74                  65.49%
HS1011   INS     (101, 500)    6,277  2,850           2,383               83.61%
HS1011   INS     (501, 1000)   673    305             241                 79.02%
HS1011   INS     (1000, 2000)  103    259             192                 74.13%
HS1011   DEL     (50, 100)     5,405  25              19                  76.00%
HS1011   DEL     (101, 500)    4,067  3,159           2,582               81.73%
HS1011   DEL     (501, 1000)   600    226             170                 75.22%
HS1011   DEL     (1000, 2000)  536    46              15                  32.61%
NA12878  INS     (50, 100)     8,833  .               .                   .
NA12878  INS     (101, 500)    8,460  .               .                   .
NA12878  INS     (501, 1000)   676    .               .                   .
NA12878  INS     (1000, 2000)  66     .               .                   .
NA12878  DEL     (50, 100)     5,010  2               2                   100.00%
NA12878  DEL     (101, 500)    3,930  1,484           1,446               97.44%
NA12878  DEL     (501, 1000)   509    201             182                 90.55%
NA12878  DEL     (1000, 2000)  466    197             185                 93.91%
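The Recovery Rate column is consistent with TruthSet Recovered divided by TruthSet Calls. The short sketch below reproduces a few rows and pools the HS1011 deletion bins; the function name `recovery_rate` is illustrative and not part of PBSuite.

```python
def recovery_rate(recovered: int, truth_calls: int):
    """Percent of truth-set calls recovered, or None when there are no calls."""
    if truth_calls == 0:
        return None
    return 100.0 * recovered / truth_calls


# (TruthSet Calls, TruthSet Recovered) for the HS1011 DEL rows of the table.
hs1011_del = {
    "(50, 100)": (25, 19),
    "(101, 500)": (3159, 2582),
    "(501, 1000)": (226, 170),
    "(1000, 2000)": (46, 15),
}

per_bin = {
    size: round(recovery_rate(rec, calls), 2)
    for size, (calls, rec) in hs1011_del.items()
}

# Pooled rate across all HS1011 DEL size bins.
total_calls = sum(calls for calls, _ in hs1011_del.values())
total_recovered = sum(rec for _, rec in hs1011_del.values())
pooled = round(recovery_rate(total_recovered, total_calls), 2)
```

Pooling shows how strongly the (101, 500) bin dominates the total: the 32.61% rate in the (1000, 2000) bin rests on only 46 truth-set calls.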
AJ Trio Deletions: Trio Discovery
Workflow: Do discovery in Trio → Filter Proband to altZMWs >= 10 → Merge Trio with 50bp Bookends Distance → Remove loci with any sample represented more than once → Force Call Missing in Parents
Step                                       Proband  Father  Mother  Total
Discovery                                  10,753   7,994   7,448   26,195
Filter Proband altZMWs >= 10               8,137    .       .       23,579
50bp Merge                                 7,785    7,727   7,217   11,896
Single Sample Filter                       7,305    7,300   6,813   11,369
Present in Proband and Parent(s)           6,175    4,784   4,636   6,175
Missing in Parents Discovery but Forced    886      663     651     886
Total Proband with Parent Support          7,061    5,447   5,287   7,061
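The merge and single-sample filter steps can be sketched as follows. This is an illustration under stated assumptions, not the code used for the slides: "50bp bookends distance" is read here as both breakpoints lying within 50 bp of each other, the greedy comparison against the first call of each locus is a simplification, and the names `Call`, `bookend_match`, `merge_trio`, and `single_sample_filter` are hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple


class Call(NamedTuple):
    sample: str
    start: int
    end: int


def bookend_match(a: Call, b: Call, dist: int = 50) -> bool:
    """Assumed rule: start and end both agree to within `dist` bp."""
    return abs(a.start - b.start) <= dist and abs(a.end - b.end) <= dist


def merge_trio(calls: List[Call], dist: int = 50) -> List[List[Call]]:
    """Greedily group calls into loci, comparing against each locus's first call."""
    loci: List[List[Call]] = []
    for call in sorted(calls, key=lambda c: (c.start, c.end)):
        for locus in loci:
            if bookend_match(locus[0], call, dist):
                locus.append(call)
                break
        else:
            loci.append([call])
    return loci


def single_sample_filter(loci: List[List[Call]]) -> List[List[Call]]:
    """Drop loci in which any sample is represented more than once."""
    return [
        locus for locus in loci
        if max(Counter(c.sample for c in locus).values()) == 1
    ]


toy_calls = [
    Call("proband", 1000, 1200),
    Call("father", 1020, 1190),
    Call("mother", 5000, 5300),
    Call("proband", 5010, 5290),
    Call("proband", 5030, 5310),  # second proband call at the same locus
    Call("father", 9000, 9100),
]
merged = merge_trio(toy_calls)
kept = single_sample_filter(merged)
with_parent_support = [
    locus for locus in kept
    if {"proband"} < {c.sample for c in locus}  # proband plus at least one parent
]
```

In the toy data the locus near 5,000 is dropped because the proband appears twice, which mirrors the drop between the 50bp Merge and Single Sample Filter rows.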
Honey Force Calling
Workflow: Candidate Regions → Identify Matching Spots Reads Near Region; Identify Matching Tails Reads Near Region; Identify ‘Reference’ Supporting Reads Spanning Region → Output Evidence
● A Candidate Region is an SV’s location, type, and size.
● Reads are fetched within Region ±BUFFER.
● Matching Reads are those having a variant of the same type within ±SIZE and ±DISTANCE.
● Reference supporting Reads span Region and show no variant evidence.
● Looking for a minimum of one read.
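The force-calling rules above can be sketched in a few lines. Everything here is an assumption layered on the bullet points: the `Read` and `Region` types, the `force_call` function, the default values for BUFFER, SIZE, and DISTANCE, and the choice to measure DISTANCE from the region start are all hypothetical, and Spots and Tails evidence are not distinguished.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Read:
    start: int
    end: int
    # Variants seen in this read as (type, position, size); hypothetical layout.
    variants: List[Tuple[str, int, int]] = field(default_factory=list)


@dataclass
class Region:
    start: int
    end: int
    svtype: str
    size: int


def force_call(region, reads, buffer=100, size_tol=50, distance=50, min_reads=1):
    """Count matching (alt) and reference-supporting reads for one region."""
    lo, hi = region.start - buffer, region.end + buffer
    alt = ref = 0
    for read in reads:
        if read.end < lo or read.start > hi:
            continue  # read not fetched: outside Region ±BUFFER
        nearby = [v for v in read.variants if lo <= v[1] <= hi]
        matches = any(
            svtype == region.svtype
            and abs(size - region.size) <= size_tol
            and abs(pos - region.start) <= distance
            for svtype, pos, size in nearby
        )
        spans = read.start <= region.start and read.end >= region.end
        if matches:
            alt += 1
        elif spans and not nearby:
            ref += 1  # spans the region and shows no variant evidence
    return {"alt_reads": alt, "ref_reads": ref, "present": alt >= min_reads}


candidate = Region(start=1000, end=1200, svtype="DEL", size=200)
toy_reads = [
    Read(500, 2000, [("DEL", 1010, 190)]),  # matching read
    Read(400, 1800),                        # reference-supporting read
    Read(900, 1100),                        # does not span the region
    Read(600, 1900, [("INS", 1005, 200)]),  # wrong variant type
    Read(3000, 4000),                       # outside the fetch window
]
evidence = force_call(candidate, toy_reads)
```

Reads with non-matching variant evidence are counted as neither alt nor reference, so they do not inflate either tally.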
AJ Trio Dels: Proband Discovery, Parent Force Calling
Workflow: Do discovery in Proband → Filter Proband to altZMWs >= 10 → Force in Parents
Step                            Proband
Discovery                       10,753
Filter Proband altZMWs >= 10    8,137
Forced in Father                6,268
Forced in Mother                6,206
Forced in Parent(s)             7,565
AJ Trio Insertions: Trio Discovery
Workflow: Do discovery in Trio → Filter Proband to altZMWs >= 10 → Merge Trio with 50bp Bookends Distance → Remove loci with any sample represented more than once → Force Call Missing in Parents
Step                                       Proband  Father  Mother  Total
Discovery                                  24,585   11,758  10,633  46,976
Filter Proband altZMWs >= 10               13,134   .       .       35,525
50bp Merge                                 12,324   11,236  10,146  20,227
Single Sample Filter                       11,317   10,303  9,253   19,051
Present in Proband and Parent(s)           7,266    5,632   5,344   7,266
Missing in Parents Discovery but Forced    2,986    2,322   2,308   2,986
Total Proband with Parent Support          10,252   7,954   7,652   10,252
AJ Trio Ins: Proband Discovery, Parents Force Calling
Workflow: Do discovery in Proband → Filter Proband to altZMWs >= 10 → Force in Parents
Step                            Proband
Discovery                       24,585
Filter Proband altZMWs >= 10    13,134
Forced in Father                10,245
Forced in Mother                10,139
Forced in Parent(s)             11,839
Next-Gen Sequencing Informatics Group @ HGSC
● Bioinformatics Core for the Human Genome Sequencing Center
● Primary and Secondary Analysis for Production Pipelines
○ Illumina Fleet (X Ten, 2000/2500), PacBio (RS II and Sequel)
○ Research and CAP/CLIA
○ WGS, WES, Custom Capture, Clinical Panels
● Structural Variation
● Annotation
● Hadoop Data Warehouse
● EMR/EHR Integration
● 11 Members and Growing!
CHARGE
