SlideShare ist ein Scribd-Unternehmen logo
1 von 24
Downloaden Sie, um offline zu lesen
ASHG - GRC Workshop
Tina Lindsay
ASHG Oct 6, 2015
The Human Reference is Not Complete
• Reference has been found to not be optimal in some
regions
• Structural variation makes it difficult to assemble a truly
representative genome when using a diploid sample
• Some regions were recalcitrant to closure with technology
and resources available at the time
• Additional sequences are needed to capture the full range
of diversity in humans
AC074378.4
AC079749.5
AC134921.2
AC147055.2
AC140484.1
AC019173.4
AC093720.2
AC021146.7
NCBI36NC_000004.10 (chr4) Tiling Path
Xue Y et al, 2008
TMPRSS11E TMPRSS11E2
GRCh37NC_000004.11 (chr4) Tiling Path
AC074378.4
AC079749.5
AC134921.1
AC147055.2
AC093720.2
AC021146.7
TMPRSS11E
GRCh37: NT_167250.1 (UGT2B17 alternate locus)
AC074378.4
AC140484.1
AC019173.4
AC226496.2
AC021146.7
TMPRSS11E2
UGT2B17 – Conflicting Alleles
G
A
P
Allelic Diversity vs. Segmental Duplication
A
A
C
T
C
G
C
C
Repeat Copies (noted by color difference)
Allelic
Copies
Diploid Genome
With a diploid genome, there is significant ambiguity sorting allelic copies from repeat copies
A C C C
Haploid Genome
Repeat Copies (ONLY but noted by color difference)
With a haploid genome, allelic differences are eliminated, and base differences are likely
indicative of repeat copies
Initial Use Of CHM1 Source
• CHORI-17 BAC Library
• CHORI-17 BAC end sequences (n=325,659)
• CHORI-17 multiple enzyme fingerprint map (1560 fpc contigs)
• CHORI-17 BACs
• > 750 have been sequenced
• 664 of them in Genbank as phase 3
SRGAP2 Homology between genes
Shows nearly identical segments between SRGAP2A and SRGAP2 paralogs
Shows homology between SRGAP2B and SRGAP2C
Dennis, et.al. 2012
SRGAP2A
SRGAP2B
SRGAP2C
1q21
1q21 patch alignment to chromosome 1
1q32 1q21 1p21
Williams-Beuren Syndrome region
Slide courtesy of Megan Dennis
Current status of CHM1 resources
• CHORI-17 BAC Library (created from CHM1 cell line)
• CHORI-17 BAC end sequences (n=325,659)
• CHORI-17 multiple enzyme fingerprint map (1560 fpc contigs)
• CHORI-17 BACs (>750 have been sequenced, with 664 of them in
Genbank as phase 3)
• Active cell line
• >100X coverage Illumina 100bp reads
• 300, 500bp, 3kb inserts
• Reference assisted assembly CHM1_1.1
• BioNano genome map
• >60X coverage of PacBio long read data (Both P5 and P6)
• Multiple whole genome assemblies
PacBio CHM1 Assembly Spans GRCh38 Gaps
GRCh38
PacBio CHM1
PacBio CHM1 Assembly Shows Data Not in GRCh38
GRCh38
PacBio CHM1
Second Pass Alignment
Sequencing Plan
Some of the Targeted Regions
CFHR1
SRGAP2/FAM72
BOLA2/CORO1A/SLX1
ARHGAP11
CHRNA7
GTF2IRD2/GTF2I/NCF1
FRMPD2/PTPN20
GPRIN2/PPYR1
DUSP22
HYDIN
IgH
IgK
IgL
TCRA/B
NBPF
DEFB
MUC5a/b/c
LILR
CCL
FCGR1/HIST2H2B
NOTCH2
Genomes Planned
Data Source Origin of Sample Coverage Level Status
CHM1 NA Platinum Assembly QC
CHM13 NA Platinum Assembly QC
NA19240 Yoruban Gold In Assembly
HG00733 Puerto Rican Gold Data Generation
NA12878 European Gold Not Started
HG00514 Han Chinese Gold Not Started
NA19434 Luhya Gold Not Started
CHM13 – 2nd Platinum Genome
• CHM13 – another hydatidiform mole sample
• PacBio data generated
• 60X data was generated using P5 and P6 Chemistry
• Avg read length ~11kb, longer than original CHM1 data
• Assembly Contig N50 ~13Mb
• Illumina coverage has been generated to use for assembly QC, SV
detection, and consensus base error correction
• Plan to use BACs to improve the assembly where needed
• Alignment of Assembly to BioNano Genome map
• Currently ~91% of CHM13 assembly aligns to BioNano map
contigs
CHM13 Mini-Assemblethon
Falcon MHAP
Default 5%
Error
MHAP
Conservative
2.5%
MHAP
Sensitive 5%
MHAP
Sensitive
2.5%
# of
Contigs
2873 15,538 10,430 11,138 13,500
Max
Contig
Length
63,148,543 81,522,549 34,039,925 80,601,297 58,311,553
Contig
N50
12,981,785 13,331,528 5,550,336 19,357,701 11,964,038
Total
Assembly
Size
2,851,367,788 3,061,261,250 2,996,416,935 3,028,933,694 3,086,573,229
Assembly Assessment Methods
• Assemblies will run through NCBI QA pipeline
• Assessed for contiguity, annotation, and concordance with the
finished BAC paths
• Assembly Assembly alignments will be generated between each PB
assembly as well as GRCh38
• BioNano Genome Map
• SV calls generated from comparing the BioNano data to each of the
assemblies
• Generating hybrid scaffolds using BioNano data and assembly data
• Alignment of the Illumina reads back to the each of the
assemblies
• Heterozygous calls are likely indicative of a collapse in the
assembly (for the single haplotype genomes)
BioNano SV Calls Identified a Assembly Problems
Collapse
Expansion
inAssembly
Gap in SequenceAssembly
BioNano Map
CHM13 Hybrid Scaffolds
BioNano Map PacBio Assmbly Hybrid Scaffold
# of Contigs 3593 1590 * 254
Min Contig Length 0.08 Mb 0 0.27 Mb
Median Contig Length 0.61 Mb 0.06 Mb 4.35 Mb
Mean Contig Length 0.78 Mb 1.78 Mb 9.68 Mb
Contig N50 1.02 Mb 13.46 Mb 20.79 Mb
Max Contig Length 5.27 Mb 63.15 Mb 82.83 Mb
Total Contig Length 2812 Mb 2824 Mb 2457.75 Mb
*Number of contigs used in hyrbid scaffolding
57 PacBio contigs and 67 BN contigs were identified as conflicts during this process
CHM13 Hybrid Scaffold
Hybrid Scaffold
PacBio Contigs
BioNano Contigs
NA19240 Initial Assembly Stats
Initial Assembly Stats
# Seq Contigs 3569
Max Contig Length 20,393,869bp
Total Assembly Size 2,745,634,789 bp
N50 6,003,115 bp
N90 848,151 bp
N95 345,457 bp
Future Directions
• Identification of best assembly for on CHM1 and CHM13
• Integration of targeted BACs into the whole genome assembly
• Improvement of the assemblies through scaffolding and making
breaks in the assemblies where needed
• Continue to add diversity to the reference by sequencing
new samples that provide additional diversity to GRCh38
• Additional collaborations with the community to develop
tools to more fully utilize the full reference assembly
(alternate haplotypes)
Acknowledgements
The Genome Institute at Washington
University in St. Louis
Rick Wilson
Bob Fulton
Wes Warren
Karyn Meltz Steinberg
Vince Magrini
Derek Albracht
Milinn Kremitzki
Susan Rock
Debbie Scheer
Aye Wollam
The Finishing and Bioinformatics Teams
at The Genome Institute
University of Washington
Evan Eichler
Megan Dennis
Xander Nuttler
NCBI
Richa Argwala
Valerie Schneider
University of Pittsburgh
School of Medicine (CHM1 cell line)
Urvashi Surti
Personalis
Deanna Church
BioNano Genomics
Pacific Biosciences
UCSF
Pui-Yan Kwok
Yvonne Lai
Chin Lin
Catherine ChuCHORI
Pieter de Jong
Ashg grc workshop2015_tg

Weitere ähnliche Inhalte

Was ist angesagt?

Previewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRCPreviewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRCGenome Reference Consortium
 
Creating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome AssembliesCreating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome AssembliesGenome Reference Consortium
 
hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)Shaojun Xie
 
Variation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copyVariation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copyGenome Reference Consortium
 

Was ist angesagt? (20)

TAGC2016 schneider
TAGC2016 schneiderTAGC2016 schneider
TAGC2016 schneider
 
Ashg grc workshop2014_tg
Ashg grc workshop2014_tgAshg grc workshop2014_tg
Ashg grc workshop2014_tg
 
GRCWorkshop_geval_1KG_slides
GRCWorkshop_geval_1KG_slidesGRCWorkshop_geval_1KG_slides
GRCWorkshop_geval_1KG_slides
 
AGBT2017 Reference Workshop: Schneider
AGBT2017 Reference Workshop: SchneiderAGBT2017 Reference Workshop: Schneider
AGBT2017 Reference Workshop: Schneider
 
Previewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRCPreviewing GRCm39: Assembly Updates from the GRC
Previewing GRCm39: Assembly Updates from the GRC
 
Grc workshop agbt2015_tg
Grc workshop agbt2015_tgGrc workshop agbt2015_tg
Grc workshop agbt2015_tg
 
Creating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome AssembliesCreating Reference-Grade Human Genome Assemblies
Creating Reference-Grade Human Genome Assemblies
 
Ashg2017 workshop tg
Ashg2017 workshop tgAshg2017 workshop tg
Ashg2017 workshop tg
 
agbt 2016 workshop church
agbt 2016 workshop churchagbt 2016 workshop church
agbt 2016 workshop church
 
Variant Calling II
Variant Calling IIVariant Calling II
Variant Calling II
 
agbt 2016 workshop lindsay
agbt 2016 workshop lindsayagbt 2016 workshop lindsay
agbt 2016 workshop lindsay
 
20181016 grc presentation-pa
20181016 grc presentation-pa20181016 grc presentation-pa
20181016 grc presentation-pa
 
Schneider grc workshop_final
Schneider grc workshop_finalSchneider grc workshop_final
Schneider grc workshop_final
 
Ashg2015 grc-pruitt
Ashg2015 grc-pruittAshg2015 grc-pruitt
Ashg2015 grc-pruitt
 
Alignment Approaches II: Long Reads
Alignment Approaches II: Long ReadsAlignment Approaches II: Long Reads
Alignment Approaches II: Long Reads
 
hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)hg19 (GRCh37) vs. hg38 (GRCh38)
hg19 (GRCh37) vs. hg38 (GRCh38)
 
Variation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copyVariation graphs and population assisted genome inference copy
Variation graphs and population assisted genome inference copy
 
Explaining the assembly model
Explaining the assembly modelExplaining the assembly model
Explaining the assembly model
 
Ashg2017 workshop schneider
Ashg2017 workshop schneiderAshg2017 workshop schneider
Ashg2017 workshop schneider
 
AGBT2017 Reference Workshop: Lindsay
AGBT2017 Reference Workshop: LindsayAGBT2017 Reference Workshop: Lindsay
AGBT2017 Reference Workshop: Lindsay
 

Ähnlich wie Ashg grc workshop2015_tg

Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013Deanna Church
 
Generating high-quality human reference genomes using PromethION nanopore seq...
Generating high-quality human reference genomes using PromethION nanopore seq...Generating high-quality human reference genomes using PromethION nanopore seq...
Generating high-quality human reference genomes using PromethION nanopore seq...Miten Jain
 
Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015
Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015
Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015Torsten Seemann
 
20110524zurichngs 1st pub
20110524zurichngs 1st pub20110524zurichngs 1st pub
20110524zurichngs 1st pubsesejun
 
Open pacbiomodelorgpaper j_landolin_20150121
Open pacbiomodelorgpaper j_landolin_20150121Open pacbiomodelorgpaper j_landolin_20150121
Open pacbiomodelorgpaper j_landolin_20150121Jane Landolin
 
Building a platinum human genome assembly from single haplotype human genomes...
Building a platinum human genome assembly from single haplotype human genomes...Building a platinum human genome assembly from single haplotype human genomes...
Building a platinum human genome assembly from single haplotype human genomes...kmsteinberg
 
Assembly and finishing
Assembly and finishingAssembly and finishing
Assembly and finishingNikolay Vyahhi
 
Benchmarking with GIAB 220907
Benchmarking with GIAB 220907Benchmarking with GIAB 220907
Benchmarking with GIAB 220907GenomeInABottle
 
New data from giab genomes promethion
New data from giab genomes   promethionNew data from giab genomes   promethion
New data from giab genomes promethionGenomeInABottle
 
Haplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long readsHaplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long readsGenome Reference Consortium
 
Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...
Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...
Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...Thermo Fisher Scientific
 
20150601 bio sb_assembly_course
20150601 bio sb_assembly_course20150601 bio sb_assembly_course
20150601 bio sb_assembly_coursehansjansen9999
 
CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...
CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...
CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...Laura Berry
 

Ähnlich wie Ashg grc workshop2015_tg (20)

Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013Church_GenomeAccess_2013_genome2013
Church_GenomeAccess_2013_genome2013
 
AGBT 2016 Workshop Magrini
AGBT 2016 Workshop MagriniAGBT 2016 Workshop Magrini
AGBT 2016 Workshop Magrini
 
Generating high-quality human reference genomes using PromethION nanopore seq...
Generating high-quality human reference genomes using PromethION nanopore seq...Generating high-quality human reference genomes using PromethION nanopore seq...
Generating high-quality human reference genomes using PromethION nanopore seq...
 
Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015
Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015
Snippy - Rapid bacterial variant calling - UK - tue 5 may 2015
 
20110524zurichngs 1st pub
20110524zurichngs 1st pub20110524zurichngs 1st pub
20110524zurichngs 1st pub
 
Open pacbiomodelorgpaper j_landolin_20150121
Open pacbiomodelorgpaper j_landolin_20150121Open pacbiomodelorgpaper j_landolin_20150121
Open pacbiomodelorgpaper j_landolin_20150121
 
Building a platinum human genome assembly from single haplotype human genomes...
Building a platinum human genome assembly from single haplotype human genomes...Building a platinum human genome assembly from single haplotype human genomes...
Building a platinum human genome assembly from single haplotype human genomes...
 
Introduction to bioinformatics
Introduction to bioinformaticsIntroduction to bioinformatics
Introduction to bioinformatics
 
Genome Assembly 2018
Genome Assembly 2018Genome Assembly 2018
Genome Assembly 2018
 
Assembly and finishing
Assembly and finishingAssembly and finishing
Assembly and finishing
 
Benchmarking with GIAB 220907
Benchmarking with GIAB 220907Benchmarking with GIAB 220907
Benchmarking with GIAB 220907
 
Church gmod2012 pt2
Church gmod2012 pt2Church gmod2012 pt2
Church gmod2012 pt2
 
New data from giab genomes promethion
New data from giab genomes   promethionNew data from giab genomes   promethion
New data from giab genomes promethion
 
BioSB meeting 2015
BioSB meeting 2015BioSB meeting 2015
BioSB meeting 2015
 
Haplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long readsHaplotype resolved structural variation assembly with long reads
Haplotype resolved structural variation assembly with long reads
 
Selection analysis using HyPhy
Selection analysis using HyPhySelection analysis using HyPhy
Selection analysis using HyPhy
 
Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...
Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...
Development of a Multi-Variant Frequency Ladder™ for Next Generation Sequenci...
 
26072016 uc davis_small
26072016 uc davis_small26072016 uc davis_small
26072016 uc davis_small
 
20150601 bio sb_assembly_course
20150601 bio sb_assembly_course20150601 bio sb_assembly_course
20150601 bio sb_assembly_course
 
CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...
CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...
CAP Trapper Technologies and Applications, CAP Analysis of Gene Expression (C...
 

Mehr von Genome Reference Consortium

What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?Genome Reference Consortium
 
Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)Genome Reference Consortium
 
Telomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomesTelomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomesGenome Reference Consortium
 
The Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) ProjectThe Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) ProjectGenome Reference Consortium
 
Why graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 amWhy graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 amGenome Reference Consortium
 
ClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materialsClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materialsGenome Reference Consortium
 
Graph and assembly strategies for the MHC and ribosomal DNA regions
Graph and assembly strategies for the MHC and ribosomal DNA regionsGraph and assembly strategies for the MHC and ribosomal DNA regions
Graph and assembly strategies for the MHC and ribosomal DNA regionsGenome Reference Consortium
 

Mehr von Genome Reference Consortium (17)

What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?What's new and what's next for the human reference assembly?
What's new and what's next for the human reference assembly?
 
Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)Advancements in the human genome reference assembly (GRCh38)
Advancements in the human genome reference assembly (GRCh38)
 
Telomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomesTelomere-to-telomere assembly of a complete human chromosomes
Telomere-to-telomere assembly of a complete human chromosomes
 
Genome variation graphs with the vg toolkit
Genome variation graphs with the vg toolkitGenome variation graphs with the vg toolkit
Genome variation graphs with the vg toolkit
 
The Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) ProjectThe Matched Annotation from NCBI and EMBL-EBI (MANE) Project
The Matched Annotation from NCBI and EMBL-EBI (MANE) Project
 
Why graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 amWhy graph genome storage and updating wakes me up at 4 am
Why graph genome storage and updating wakes me up at 4 am
 
Mane v2 final
Mane v2 finalMane v2 final
Mane v2 final
 
Lrg and mane 16 oct 2018
Lrg and mane   16 oct 2018Lrg and mane   16 oct 2018
Lrg and mane 16 oct 2018
 
2018 1016 trio_binning_ashg_arhie_final
2018 1016 trio_binning_ashg_arhie_final2018 1016 trio_binning_ashg_arhie_final
2018 1016 trio_binning_ashg_arhie_final
 
Ashg sedlazeck grc_share
Ashg sedlazeck grc_shareAshg sedlazeck grc_share
Ashg sedlazeck grc_share
 
171017 giab for giab grc workshop
171017 giab for giab grc workshop171017 giab for giab grc workshop
171017 giab for giab grc workshop
 
101717.kh miga ashg_grc
101717.kh miga ashg_grc101717.kh miga ashg_grc
101717.kh miga ashg_grc
 
AGBT2017 Reference Workshop: Fulton
AGBT2017 Reference Workshop: FultonAGBT2017 Reference Workshop: Fulton
AGBT2017 Reference Workshop: Fulton
 
Everyday de novo diploid assembly
Everyday de novo diploid assemblyEveryday de novo diploid assembly
Everyday de novo diploid assembly
 
Genome in a Bottle
Genome in a BottleGenome in a Bottle
Genome in a Bottle
 
ClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materialsClinVar: Getting the most from the reference assembly and reference materials
ClinVar: Getting the most from the reference assembly and reference materials
 
Graph and assembly strategies for the MHC and ribosomal DNA regions
Graph and assembly strategies for the MHC and ribosomal DNA regionsGraph and assembly strategies for the MHC and ribosomal DNA regions
Graph and assembly strategies for the MHC and ribosomal DNA regions
 

Kürzlich hochgeladen

Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Caco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionCaco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionPriyansha Singh
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 

Kürzlich hochgeladen (20)

CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Caco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionCaco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorption
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 

Ashg grc workshop2015_tg

  • 1. ASHG - GRC Workshop Tina Lindsay ASHG Oct 6, 2015
  • 2. The Human Reference is Not Complete • Reference has been found to not be optimal in some regions • Structural variation makes it difficult to assemble a truly representative genome when using a diploid sample • Some regions were recalcitrant to closure with technology and resources available at the time • Additional sequences are needed to capture the full range of diversity in humans
  • 3. AC074378.4 AC079749.5 AC134921.2 AC147055.2 AC140484.1 AC019173.4 AC093720.2 AC021146.7 NCBI36NC_000004.10 (chr4) Tiling Path Xue Y et al, 2008 TMPRSS11E TMPRSS11E2 GRCh37NC_000004.11 (chr4) Tiling Path AC074378.4 AC079749.5 AC134921.1 AC147055.2 AC093720.2 AC021146.7 TMPRSS11E GRCh37: NT_167250.1 (UGT2B17 alternate locus) AC074378.4 AC140484.1 AC019173.4 AC226496.2 AC021146.7 TMPRSS11E2 UGT2B17 – Conflicting Alleles G A P
  • 4. Allelic Diversity vs. Segmental Duplication A A C T C G C C Repeat Copies (noted by color difference) Allelic Copies Diploid Genome With a diploid genome, there is significant ambiguity sorting allelic copies from repeat copies A C C C Haploid Genome Repeat Copies (ONLY but noted by color difference) With a haploid genome, allelic differences are eliminated, and base differences are likely indicative of repeat copies
  • 5. Initial Use Of CHM1 Source • CHORI-17 BAC Library • CHORI-17 BAC end sequences (n=325,659) • CHORI-17 multiple enzyme fingerprint map (1560 fpc contigs) • CHORI-17 BACs • > 750 have been sequenced • 664 of them in Genbank as phase 3
  • 6. SRGAP2 Homology between genes Shows nearly identical segments between SRGAP2A and SRGAP2 paralogs Shows homology between SRGAP2B and SRGAP2C Dennis, et.al. 2012 SRGAP2A SRGAP2B SRGAP2C
  • 7. 1q21 1q21 patch alignment to chromosome 1 1q32 1q21 1p21
  • 8. Williams-Beuren Syndrome region Slide courtesy of Megan Dennis
  • 9. Current status of CHM1 resources • CHORI-17 BAC Library (created from CHM1 cell line) • CHORI-17 BAC end sequences (n=325,659) • CHORI-17 multiple enzyme fingerprint map (1560 fpc contigs) • CHORI-17 BACs (>750 have been sequenced, with 664 of them in Genbank as phase 3) • Active cell line • >100X coverage Illumina 100bp reads • 300, 500bp, 3kb inserts • Reference assisted assembly CHM1_1.1 • BioNano genome map • >60X coverage of PacBio long read data (Both P5 and P6) • Multiple whole genome assemblies
  • 10. PacBio CHM1 Assembly Spans GRCh38 Gaps GRCh38 PacBio CHM1
  • 11. PacBio CHM1 Assembly Shows Data Not in GRCh38 GRCh38 PacBio CHM1 Second Pass Alignment
  • 13. Some of the Targeted Regions CFHR1 SRGAP2/FAM72 BOLA2/CORO1A/SLX1 ARHGAP11 CHRNA7 GTF2IRD2/GTF2I/NCF1 FRMPD2/PTPN20 GPRIN2/PPYR1 DUSP22 HYDIN IgH IgK IgL TCRA/B NBPF DEFB MUC5a/b/c LILR CCL FCGR1/HIST2H2B NOTCH2
  • 14. Genomes Planned Data Source Origin of Sample Coverage Level Status CHM1 NA Platinum Assembly QC CHM13 NA Platinum Assembly QC NA19240 Yoruban Gold In Assembly HG00733 Puerto Rican Gold Data Generation NA12878 European Gold Not Started HG00514 Han Chinese Gold Not Started NA19434 Luhya Gold Not Started
  • 15. CHM13 – 2nd Platinum Genome • CHM13 – another hydatidiform mole sample • PacBio data generated • 60X data was generated using P5 and P6 Chemistry • Avg read length ~11kb, longer than original CHM1 data • Assembly Contig N50 ~13Mb • Illumina coverage has been generated to use for assembly QC, SV detection, and consensus base error correction • Plan to use BACs to improve the assembly where needed • Alignment of Assembly to BioNano Genome map • Currently ~91% of CHM13 assembly aligns to BioNano map contigs
  • 16. CHM13 Mini-Assemblethon Falcon MHAP Default 5% Error MHAP Conservative 2.5% MHAP Sensitive 5% MHAP Sensitive 2.5% # of Contigs 2873 15,538 10,430 11,138 13,500 Max Contig Length 63,148,543 81,522,549 34,039,925 80,601,297 58,311,553 Contig N50 12,981,785 13,331,528 5,550,336 19,357,701 11,964,038 Total Assembly Size 2,851,367,788 3,061,261,250 2,996,416,935 3,028,933,694 3,086,573,229
  • 17. Assembly Assessment Methods • Assemblies will run through NCBI QA pipeline • Assessed for contiguity, annotation, and concordance with the finished BAC paths • Assembly Assembly alignments will be generated between each PB assembly as well as GRCh38 • BioNano Genome Map • SV calls generated from comparing the BioNano data to each of the assemblies • Generating hybrid scaffolds using BioNano data and assembly data • Alignment of the Illumina reads back to the each of the assemblies • Heterozygous calls are likely indicative of a collapse in the assembly (for the single haplotype genomes)
  • 18. BioNano SV Calls Identified a Assembly Problems Collapse Expansion inAssembly Gap in SequenceAssembly BioNano Map
  • 19. CHM13 Hybrid Scaffolds BioNano Map PacBio Assmbly Hybrid Scaffold # of Contigs 3593 1590 * 254 Min Contig Length 0.08 Mb 0 0.27 Mb Median Contig Length 0.61 Mb 0.06 Mb 4.35 Mb Mean Contig Length 0.78 Mb 1.78 Mb 9.68 Mb Contig N50 1.02 Mb 13.46 Mb 20.79 Mb Max Contig Length 5.27 Mb 63.15 Mb 82.83 Mb Total Contig Length 2812 Mb 2824 Mb 2457.75 Mb *Number of contigs used in hyrbid scaffolding 57 PacBio contigs and 67 BN contigs were identified as conflicts during this process
  • 20. CHM13 Hybrid Scaffold Hybrid Scaffold PacBio Contigs BioNano Contigs
  • 21. NA19240 Initial Assembly Stats Initial Assembly Stats # Seq Contigs 3569 Max Contig Length 20,393,869bp Total Assembly Size 2,745,634,789 bp N50 6,003,115 bp N90 848,151 bp N95 345,457 bp
  • 22. Future Directions • Identification of best assembly for on CHM1 and CHM13 • Integration of targeted BACs into the whole genome assembly • Improvement of the assemblies through scaffolding and making breaks in the assemblies where needed • Continue to add diversity to the reference by sequencing new samples that provide additional diversity to GRCh38 • Additional collaborations with the community to develop tools to more fully utilize the full reference assembly (alternate haplotypes)
  • 23. Acknowledgements The Genome Institute at Washington University in St. Louis Rick Wilson Bob Fulton Wes Warren Karyn Meltz Steinberg Vince Magrini Derek Albracht Milinn Kremitzki Susan Rock Debbie Scheer Aye Wollam The Finishing and Bioinformatics Teams at The Genome Institute University of Washington Evan Eichler Megan Dennis Xander Nuttler NCBI Richa Argwala Valerie Schneider University of Pittsburgh School of Medicine (CHM1 cell line) Urvashi Surti Personalis Deanna Church BioNano Genomics Pacific Biosciences UCSF Pui-Yan Kwok Yvonne Lai Chin Lin Catherine ChuCHORI Pieter de Jong