gEVAL – A Genome Evaluation Browser for Improving Genome Assemblies
William Chow, Kim Brugger, Britt Kilian, James Torrance, Eduard Zuiderwijk, and Kerstin Howe
Wellcome Trust Sanger Institute, Cambridge, UK
http://geval.sanger.ac.uk
geval-help@sanger.ac.uk
gEVAL Punchlists and Issue Navigation
Automated lists are created to facilitate the identification of, and navigation to, issues or
regions of interest. In-browser menus also help to jump between issues.

Optical Maps
Optical map data are ordered restriction maps from single stained molecules of DNA that can be
aligned against assemblies. gEVAL hosts some of these data for human and mouse, which aid in
identifying genomic regions that require attention, such as rearrangements or misrepresentation
of sequence and haplotypes.
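
To make the idea of comparing an assembly with an ordered restriction map concrete, here is a minimal Python sketch. It is not gEVAL's method: the recognition site, the 10% tolerance, and both function names are assumptions chosen for illustration. It cuts at the first base of each site, and it compares fragments position by position instead of performing a real optical map alignment.

```python
def in_silico_map(sequence: str, site: str) -> list[int]:
    """Return ordered fragment lengths after cutting at each occurrence of `site`.

    Simplification: the cut is placed at the first base of the site.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = sequence.find(site, pos + 1)
    boundaries = [0] + cuts + [len(sequence)]
    # Drop zero-length fragments (e.g. a site at the very start).
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


def size_discrepancies(expected: list[int], observed: list[int],
                       tolerance: float = 0.1) -> list[int]:
    """Return indices of fragments whose sizes differ by more than `tolerance`.

    Assumes both maps have the same number of fragments; a real optical map
    aligner also has to handle missing and extra cut sites.
    """
    if len(expected) != len(observed):
        raise ValueError("maps must have the same number of fragments")
    return [i for i, (e, o) in enumerate(zip(expected, observed))
            if abs(e - o) > tolerance * o]


# Hypothetical toy data: the assembly fragment at index 1 is shorter than
# the optical map fragment, as would happen with a deletion in a clone.
assembly_map = in_silico_map("AAGAATTCAAAAGAATTCAA", "GAATTC")
flagged = size_discrepancies(assembly_map, [2, 14, 8])
```

A flagged index marks a fragment where the assembly and the optical map disagree in size, which is the kind of region that would merit a closer look in the browser.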

Introduction
The web-accessible gEVAL browser (http://geval.sanger.ac.uk) allows the evaluation of
genome assemblies through its tools and pre-computed analyses. The strength of this browser
is the ability to navigate an up-to-date assembly, identify problematic regions, and assist in
strategizing potential solutions for these issues. This facilitates the improvement of overall
assemblies to a “gold” standard for release as reference genomes.
Clone end mapping legend (problem categories): mapped multiple times; wrong direction (<<, <>, >>); wrong distance from partner.
Integration of GRC Review/Status Update System
As part of the GRC curation process, regions of interest that are to be evaluated are tagged
and tracked via the GRC review ticketing system.

Both resolved and unresolved tickets can be viewed as a track on the browser or as a
dedicated punchlist.

A summary of the features in the region associated with the ticket is also available
(right inset).
Visual Representation of Current Assembly State
Our build cycle is frequent, so the browser can represent a current snapshot of the
assembly. As part of the GRC, we also have first access to major GRC assembly releases.
Legend: component in sequencing pipeline; Phase 1, unfinished component; Phase 2/3, finished component.
Comparative Genomics
gEVAL includes comparative analyses of different assembly builds for each species. This
helps in identifying missing sequences, reference assembly errors and haplotypic variation.

A gap separates two clone components in a zebrafish build. Investigating the alignments
against two whole genome shotgun (wgs) assemblies reveals the size of the gap and the
missing sequence.
A region of the wgs is used to cover the gap in a later build (bonus: a clone is also in the
pipeline, grey box above).
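
As a rough illustration of how alignments against a wgs assembly can indicate a gap size, the following Python sketch counts the wgs bases that lie between the alignments of the two flanking clone components. The function name, the 1-based inclusive coordinates, and the example numbers are assumptions, not values from the poster.

```python
def estimate_gap_size(left_flank_end: int, right_flank_start: int) -> int:
    """Estimate the gap size from positions on a single wgs contig.

    left_flank_end:    last wgs base aligned to the component left of the gap
    right_flank_start: first wgs base aligned to the component right of the gap
    Coordinates are assumed to be 1-based and inclusive. Overlapping flanks
    give an estimate of 0.
    """
    return max(0, right_flank_start - left_flank_end - 1)


# Hypothetical coordinates from two different wgs assemblies.
estimates = [estimate_gap_size(5000, 7501), estimate_gap_size(12000, 14451)]
```

Agreement between the estimates from two independent wgs assemblies gives more confidence in the gap size than either estimate alone.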
The clone component AL596089 contains a deletion, which is highlighted by the 3 cell line
optical map analysis (right). This would not have been captured otherwise, because the
clone overlaps do not extend far enough to show it. The issue is tagged and reported in
GRC ticket HG-1482.

Optical map data provided by the D. Schwartz Lab (UW Madison).
Popup menus on tracks help to quickly navigate between the previous/next overlap between
components along a chr (below).
An example overview of punchlists available. Punchlists can be tailored for different
projects on request (above).
Components potentially placed on the wrong chr, according to marker evidence, are listed
per chr (below).
Current Species Available
Identify Problematic/Incomplete Transcript Mappings
GREEN: transcript mapped at or above the 98% coverage cutoff
ORANGE: incomplete or problematic transcript
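
The colour rule above can be sketched in a few lines of Python. How gEVAL actually computes coverage is not stated on the poster; here coverage is assumed to be aligned transcript bases divided by transcript length, and the function name is hypothetical.

```python
def transcript_colour(aligned_bases: int, transcript_length: int,
                      cutoff: float = 0.98) -> str:
    """Return "GREEN" if coverage meets the cutoff, otherwise "ORANGE".

    Assumption: coverage = aligned transcript bases / transcript length.
    """
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    coverage = aligned_bases / transcript_length
    return "GREEN" if coverage >= cutoff else "ORANGE"


# Hypothetical transcripts: one almost fully aligned, one split by an
# assembly problem so that only part of it aligns.
colours = [transcript_colour(2970, 3000), transcript_colour(1800, 3000)]
```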

• This example shows a region in which 2 clones (dark/light blue boxes on the contig
track) have incorrect orientation.
• The overlapping gene ryr1b therefore appeared to be split on opposite strands.
• The incorrect orientation of 2 gap-spanning fosmids confirmed the assertion that
CU138549 was in the wrong orientation.

The up-to-date path returns the correct gene structure and clone end mapping.
Figure panels: before, after.
Examine Large Region of Interest
View region windows of up to 2Mb, giving a wider view of possibly problematic areas.
The Region overview page provides a less detailed snapshot of larger windows, up to the
entire chromosome or top-level component.

Region overview can show, for example, the state of the assembly and how much of it is
unfinished, finished or in production. Above is a snapshot of a region just under 10Mb and
the clones in the path. The status of clones can be quickly scanned and regions prioritized.
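
The kind of summary a region overview conveys can be approximated as the fraction of bases in each clone status. This Python sketch assumes a simple list of (name, status, length) tuples; the clone names, status labels, and lengths are invented for illustration and do not come from gEVAL.

```python
from collections import Counter


def status_fractions(clones: list[tuple[str, str, int]]) -> dict[str, float]:
    """Return the fraction of total clone length in each status."""
    totals: Counter[str] = Counter()
    for _name, status, length in clones:
        totals[status] += length
    total = sum(totals.values())
    if total == 0:
        return {}
    return {status: bases / total for status, bases in totals.items()}


# Hypothetical clones in a region's path.
region = [
    ("clone_a", "finished", 150_000),
    ("clone_b", "unfinished", 50_000),
    ("clone_c", "finished", 50_000),
]
fractions = status_fractions(region)
```

Regions with a high unfinished fraction could then be ranked first for attention.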
Clone End Library Mappings

Legend: mapped 1 time; spanning partner in the vicinity.
Clone end mappings in gEVAL are unique in how they are displayed, which makes it easier
to identify concurrent clones or inconsistencies that point to a potential problem with
the assembly. Clones can be picked to close gap regions or to span regions of interest for
further interrogation.

Figure panels: before, after.
The above example illustrates using end placements to pick clones to cover gaps. In the
before image, there is a gap with a BAC clone spanning the gapped region according to its
end placements (orange). In the subsequent assembly (after image above), with the clone
sequenced, the unfinished clone places well in the region, as illustrated by the green
clone overlaps.
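
The legend categories for clone ends (mapped once with a spanning partner nearby, mapped multiple times, wrong direction, wrong distance from partner) can be expressed as a small classifier. This Python sketch is an illustration under stated assumptions, not gEVAL's implementation: the `EndHit` type, the "unmapped" category, the treatment of ends on different chromosomes as a distance problem, and the insert size limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EndHit:
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def classify_pair(hits_a: list[EndHit], hits_b: list[EndHit],
                  min_insert: int, max_insert: int) -> str:
    """Classify a clone from the mappings of its two end sequences."""
    if not hits_a or not hits_b:
        return "unmapped"  # assumption: category not shown in the poster legend
    if len(hits_a) > 1 or len(hits_b) > 1:
        return "mapped multiple times"
    a, b = hits_a[0], hits_b[0]
    if a.chrom != b.chrom:
        return "wrong distance from partner"  # assumption for split pairs
    left, right = sorted((a, b), key=lambda h: h.start)
    # Consistent ends point toward each other: left on "+", right on "-".
    if not (left.strand == "+" and right.strand == "-"):
        return "wrong direction"
    span = right.end - left.start
    if not min_insert <= span <= max_insert:
        return "wrong distance from partner"
    return "consistent"


# Hypothetical BAC with ends about 150 kb apart, pointing inward.
result = classify_pair(
    [EndHit("chr1", 1_000, 1_500, "+")],
    [EndHit("chr1", 150_000, 150_500, "-")],
    min_insert=100_000, max_insert=200_000,
)
```

A consistent clone whose ends flank a gap is a candidate for sequencing to close that gap, while clusters of inconsistent clones hint at an assembly problem.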
Human
GRCh38, GRCh37pX (latest patch), NCBI36, CHM1_1.1, NA12878, HuREF, YH1/2.0.
Zebrafish
Zv9, WGS28, WGS29, WGS31, z.2013.12.06, z.2014.03.14.
Mouse
GRCm38, GRCm38pX (latest patch), GRCm37B/C, NCBIm37, wgs_c57bl6j, wgs_celera, MGSCv3, m.2013.03.15.
Helminth
Echinococcus multilocularis
Schistosoma mansoni
Strongyloides ratti
Pig
Sscrofa10.2

Genome Reference Consortium
The Genome Reference Consortium (GRC) is a partnership between the Sanger Institute,
NCBI, EBI and the Genome Institute at Washington University, tasked with improving and
providing accurate reference genomes. This includes releasing the reference assemblies of
human, mouse and zebrafish.
Figure annotations: ryr1b gene split on opposite strands. The red arrows highlight the
incorrect orientation of these fosmid ends.
Clone end placements reveal sequence that can be placed in the gap region. Assembly
reveals newly sequenced clone in path.

gEVAL - A Genome Evaluation Browser for Improving Genome Assemblies (SFAF 2014 Poster)
