British Columbia Cancer Agency Genome Sciences Centre Vancouver . British Columbia . Canada Complementing Computation with Visualization in Genomics March 11, 2010 EBI Interfaces Interest Forum Cydney Nielsen
Discovery path Biological Sample Genomic Data Scientific Insight
Components of Data Analysis Automation Analysis Genomic Data Scientific Insight Human Judgment
Outline: (1) Genome Assembly Visualization: ABySS-Explorer; (2) Complement to genome browsing: using clustering and interactive data exploration
Genome Sequencing cell population extracted DNA Shotgun approach sheared DNA sequencing reads AGCGGATTGCATGACAGT GTACAGCCTGACAGAAGC GCGCTACGATCAGATCAA CATGACAGTCCGAGTACA TTCAGAATGGTACAGCAG
ABySS: Assembly By Short Sequences (Simpson et al., Genome Res 2009). Sequencing read set (read length = 7 nt): GGACATC, GGACAGA. Corresponding de Bruijn graph (k = 5 nt): figure not preserved. ABySS merges unambiguously connected vertices to form contigs.
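A minimal Python sketch of the construction on this slide, under a simplified single-strand model: vertices are the k-mers found in the reads, and an edge links consecutive k-mers within a read (they overlap by k-1 nt). The names `de_bruijn` and `contigs` are illustrative and are not ABySS's API; reverse complements, sequencing errors, and isolated cycles are ignored.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            graph[kmer]  # touch the key so vertices without successors are kept
        for left, right in zip(kmers, kmers[1:]):
            graph[left].add(right)
    return dict(graph)


def contigs(graph):
    """Merge unambiguously connected vertices into contig sequences."""
    preds = defaultdict(set)
    for v, succs in graph.items():
        for w in succs:
            preds[w].add(v)

    def sole_successor(v):
        # Return w only if v -> w is the single edge out of v and into w.
        if len(graph[v]) != 1:
            return None
        (w,) = graph[v]
        return w if len(preds[w]) == 1 else None

    result = []
    for v in graph:
        # Skip vertices that an unambiguous predecessor will absorb.
        if len(preds[v]) == 1 and sole_successor(next(iter(preds[v]))) == v:
            continue
        seq = v
        nxt = sole_successor(v)
        while nxt is not None:
            seq += nxt[-1]  # each step along the path adds one base
            nxt = sole_successor(nxt)
        result.append(seq)
    return sorted(result)


reads = ["GGACATC", "GGACAGA"]
graph = de_bruijn(reads, 5)
print(contigs(graph))
```

With the two reads from the slide, the shared k-mer GGACA has two successors, so it stays a contig on its own and each branch becomes a separate contig.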
Assembly Ambiguities True genome sequence: GGATTGAAAAAAAAAAAAAAAAGTAGCACGAATATACATAGAAAAAAAAAAAAAAAAATTACG Assembled sequence; de Bruijn graph representation
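To illustrate why the repeated poly-A runs in this sequence are ambiguous, the following sketch lists the k-mers that are followed by more than one distinct base. It assumes k = 5 (carried over from the earlier ABySS example, not stated on this slide), and the helper name `successors` is hypothetical.

```python
from collections import defaultdict

# True genome sequence copied from the slide.
GENOME = "GGATTGAAAAAAAAAAAAAAAAGTAGCACGAATATACATAGAAAAAAAAAAAAAAAAATTACG"


def successors(sequence, k):
    """Map each k-mer to the set of bases that directly follow it."""
    nxt = defaultdict(set)
    for i in range(len(sequence) - k):
        nxt[sequence[i:i + k]].add(sequence[i + k])
    return nxt


# A k-mer with several possible next bases is a branch point in the graph.
branches = {
    kmer: bases
    for kmer, bases in successors(GENOME, 5).items()
    if len(bases) > 1
}
print(sorted(branches["AAAAA"]))
```

The all-A k-mer can be followed by another A (staying inside a run), by G (leaving the first run), or by T (leaving the second run), so k-mer overlaps alone cannot determine how long each run is or which exit follows which entry.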
Starting Point Shaun Jackman
Example of existing tools: Consed
Properties of DNA
Capture sequence strand AAAAAT 2+ 1+
Capture sequence strand AAAAAT 2+ 1+ TTTTTA 2- 1-
Capture sequence strand AAAAAT 1+ 2+ TTTTTA
Capture sequence strand AAAAAT 1- 2- TTTTTA
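The strand labels above can be sketched in Python, assuming that a "+" or "-" suffix marks contig orientation and that "-" denotes the reverse complement of the "+" sequence. The helper names `reverse_complement`, `flip`, and `flip_path` are illustrative only.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def flip(contig):
    """Switch a contig label between orientations, e.g. '13+' <-> '13-'."""
    return contig[:-1] + ("-" if contig[-1] == "+" else "+")


def flip_path(path):
    """A path read on the other strand visits the flipped contigs in reverse."""
    return [flip(contig) for contig in reversed(path)]


print("AAAAAT".translate(COMPLEMENT))  # complement as drawn under the top strand
print(reverse_complement("AAAAAT"))
print(flip_path(["1+", "2+"]))
```

Under these assumptions, the path 1+ then 2+ on one strand is the same stretch of DNA as 2- then 1- on the other, which is why a single drawn edge can carry both readings.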
Capture sequence length one oscillation = 100 nt
Genome Sequencing cell population extracted DNA read pair information read sheared DNA dsDNA fragment (known size) sequencing reads (typically produce millions) AGCGGATTGCATGACAGT read GTACAGCCTGACAGAAGC GCGCTACGATCAGATCAA CATGACAGTCCGAGTACA TTCAGAATGGTACAGCAG
Capture read pair information After building the initial single-end (SE) contigs from k-mer sequences, ABySS uses paired-end reads to resolve ambiguities.
Capture read pair information Paired-end read information is used to construct paired-end (PE) contigs … 13+  44-  46+  4+  79+  70+ … blue gradient = paired-end contig orange = selected single-end contig
ABySS-Explorer
 contig adjacency information
 contig strand
 contig length
 paired-end relationships
 paired-end contigs
 Implemented using the Java Universal Network/Graph Framework (JUNG)
 Applied the Kamada-Kawai layout algorithm (JUNG implementation)
 Uses ABySS files as input (version 1.1.0 and higher)
http://www.bcgsc.ca/platform/bioinfo/software/abyss-explorer
Part 1: Conclusions and Future Work
 This representation is particularly powerful for revealing high-level genome assembly structure, not readily viewable in any other interactive tool
 Future work includes:
 support for other assembly algorithm outputs
 enable flexible annotation display
 integrate with existing assembly editing tools
Genome Sequencing cell population extracted DNA sheared DNA sequencing reads (typically produce millions) AGCGGATTGCATGACAGT GTACAGCCTGACAGAAGC GCGCTACGATCAGATCAA CATGACAGTCCGAGTACA TTCAGAATGGTACAGCAG
Genome Sequencing cell population Chromatin Immunoprecipitation and Sequencing (ChIP-Seq) extracted DNA selection sheared DNA sequencing reads (typically produce millions) AGCGGATTGCATGACAGT GTACAGCCTGACAGAAGC GCGCTACGATCAGATCAA GTACAGCCTGACAGAAGC CATGACAGTCCGAGTACA TTCAGAATGGTACAGCAG TTCAGAATGGTACAGCAG
Align sequences to the genome CCGAGTACAGCCTGACAGA GCATGACAGTCCGAGTAC TTGCATGACAGTCCGAGT AGCGGATTGCATGACAGT AGCGGATTGCATGACAGT AGCGGATTGCATGACAGT Reference Genome AGCGGATTGCATGACAGTCCGAGTACAGCCTGACAGA Read coverage Genomic coordinate
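As a rough illustration of how read coverage follows from alignment, the sketch below places each read from this slide on the reference by exact string match and counts the reads covering every coordinate. Exact matching with `str.find` is a stand-in for a real aligner, which would also handle mismatches, both strands, and repeated matches; the function name `coverage` is hypothetical.

```python
REFERENCE = "AGCGGATTGCATGACAGTCCGAGTACAGCCTGACAGA"
READS = [
    "CCGAGTACAGCCTGACAGA",
    "GCATGACAGTCCGAGTAC",
    "TTGCATGACAGTCCGAGT",
    "AGCGGATTGCATGACAGT",
    "AGCGGATTGCATGACAGT",
    "AGCGGATTGCATGACAGT",
]


def coverage(reference, reads):
    """Count, for each reference coordinate, the reads that overlap it."""
    depth = [0] * len(reference)
    for read in reads:
        start = reference.find(read)  # first exact match, or -1 if absent
        if start == -1:
            continue  # unaligned reads contribute nothing
        for pos in range(start, start + len(read)):
            depth[pos] += 1
    return depth


depth = coverage(REFERENCE, READS)
print(depth)
```

Plotting `depth` against the genomic coordinate gives the kind of read-coverage track the slide shows.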
Genome browser can reveal local patterns H3K4me3 H3K36me3 H3K27me3 H3K9me3 H3K9Ac MRE
Difficult to get global overview
Focus on regions of interest 1. For example, transcriptional start sites (TSS +/- 3000 nt) H3K4me3 H3K9Ac H3K4me1 H3K36me3 MeDIP MRE 2. Extract data matrices Normalization for bin i, sample h: 3. Cluster matrices (k-means clustering with Euclidean distance)
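Step 3 can be sketched as a plain k-means (Lloyd's algorithm) with squared Euclidean distance. This is a minimal pure-Python illustration and not the tool used in the talk: the normalization formula from step 2 is not preserved in this transcript, so the rows are assumed to be already normalized, and the toy data, the `kmeans` signature, and the seeded random initialization are all assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=50, seed=0):
    """Cluster rows (one vector per region) into k groups."""
    rng = random.Random(seed)
    centroids = [list(row) for row in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each row joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(row, centroids[c]))
            for row in rows
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy matrix: four regions, three bins each (values are made up).
rows = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [5.0, 5.1, 4.9], [5.1, 4.9, 5.0]]
labels, centroids = kmeans(rows, k=2)
print(labels)
```

Each centroid is the per-bin average of its cluster members.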

Enable interactive exploration 4. Interactive cluster visualization (data from H1 cells): cluster size indicator (total n = 15,618); cluster (average values displayed); individual TSS (HOXC12 gene); scroll bar to explore all cluster members; H3K4me3 H3K9Ac H3K4me1 H3K36me3 H3K27me3 H3K9me3 MeDIP MRE mRNA 5. Link-out to UCSC genome browser
 Access to both global and detailed view is valuable
 Future work includes:
 search functionality (e.g. by region id)
 integration with other clustering tools
Complementing Computation with Visualization in Genomics March 11, 2010 Cydney Nielsen BC Cancer Agency Genome Sciences Centre Vancouver, Canada