You are free to:
Copy, share, adapt, or re-mix;
Photograph, film, or broadcast;
Blog, live-blog, or post video of;
This presentation. Provided that:
You attribute the work to its author and respect the rights and
licenses associated with its components.
Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero.
Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at;
http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
@bffo on
The OICR and The International Cancer
Genomics Consortium
October 22nd, 2012
B.F. Francis Ouellette francis@oicr.on.ca
• Associate Director, Informatics & Biocomputing,
Ontario Institute for Cancer Research, Toronto, ON
• Associate Professor, Department of Cell and Systems
Biology, University of Toronto, Toronto, ON.
Outline
• OICR’s mission
• ICGC’s goal
• OICR and ICGC: Open Access/Open Source shop
• ICGC: the DCC
• OICR: Processing Cancer Genomes
• You: getting access to the data
OICR’s mission
To build innovative research
programs that will have an impact
on the prevention, early detection,
diagnosis and treatment of
cancer.
ICGC’s Goal:
To obtain a comprehensive
description of genomic,
transcriptomic and epigenomic
changes in 50 different tumor
types and/or subtypes which are
of clinical and societal importance
across the globe.
Cancer
A Disease of the Genome
Challenge in Treating Cancer:
• Every tumor is different
• Every cancer patient is different
47 Projects 12 countries 23,408 tumor samples
planned
OICR Policies on Open Access Publication and
Data Retention
• To allow and promote access to research outputs
funded by OICR, thus increasing the diffusion and
impact of the research process.
• All papers will be freely available through the
internet within six (6) months of publication.
• OICR will not violate the Publisher’s embargo
policy on free access
• OICR encourages OA publication, but is also
developing an Institutional Repository (IR) where
research output will be found
ICGC – March 2012
Commitments for 22,179 tumor genomes!
New projects:
• Saudi Arabia: Thyroid
• South Korea: Breast
• AU/UK/US: Mesothelioma
[Figure: tumor genome commitments over time; date axis from Mar-04 to Mar-12, vertical axis 0 to 25,000; data labels 375, 4375, 4900, 5500, 10229, 10979, 12979, 19229, 19629, 20629, 21129, 22179]
Completeness of Data for Genomic Analysis Types in DCC Datasets (ICGC 10)
[Figure: # donors with data for each genomic analysis type in DCC datasets. Data types shown: Copy Number Alterations, Structural Variation, Gene Expression, miRNA Expression, Simple Somatic Mutations, Splicing Variation, DNA Methylation]
Brett Whitty
http://www.ncbi.nlm.nih.gov/bioproject
ICGC Data Categories
ICGC Open Access Datasets
• Cancer Pathology
– Histologic type or subtype
– Histologic nuclear grade
• Donor
– Gender
– Age range
• RNA expression (normalized)
• DNA methylation
• Genotype frequencies
• Somatic mutations (SNV, CNV and Structural Rearrangement)
ICGC Controlled Access Datasets
• Detailed Phenotype and Outcome Data
– Patient demography
– Risk factors
– Examination
– Surgery/Drugs/Radiation
– Sample/Slide
– Specific histological features
– Protocol
– Analyte/Aliquot
• Gene Expression (probe-level data)
• Raw genotype calls (germline)
• Gene-sample identifier links
• Genome sequence files
Most of the data in the portal is publicly available without restriction. However,
access to some data, such as germline mutations, requires authorization by the Data
Access Compliance Office (DACO).
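The open/controlled split above can be sketched as a simple lookup. This is only an illustration: the category names are copied from the ICGC Data Categories slide, while the function name `requires_daco_approval` and the set layout are hypothetical and not part of any ICGC or DACO software.

```python
# Categories as listed on the "ICGC Data Categories" slide (lower-cased).
OPEN_ACCESS = {
    "cancer pathology",
    "donor",
    "rna expression (normalized)",
    "dna methylation",
    "genotype frequencies",
    "somatic mutations",
}

CONTROLLED_ACCESS = {
    "detailed phenotype and outcome data",
    "gene expression (probe-level data)",
    "raw genotype calls (germline)",
    "gene-sample identifier links",
    "genome sequence files",
}


def requires_daco_approval(category: str) -> bool:
    """Return True if the category is listed as controlled access.

    Hypothetical helper: raises KeyError for categories not on the slide
    rather than guessing an access tier.
    """
    key = category.strip().lower()
    if key in CONTROLLED_ACCESS:
        return True
    if key in OPEN_ACCESS:
        return False
    raise KeyError(f"Unknown data category: {category!r}")
```

Raising on unknown categories, rather than defaulting to open access, is a deliberately conservative choice for this sketch.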
DACO/DCC User Data Access Process
• Users approved through DACO are now automatically granted access to
ICGC controlled access datasets available through the ICGC Data Portal
and the EBI’s EGA repository
[Diagram: an application approved by DACO (DACO Web Application) leads to user accounts being activated in the DCC User Registry, giving access to the DCC Data Portal and the EBI EGA]
[Diagram labels: DACO, ICGC, dbGaP, EGA, TCGA, ERA; file types BAM and VCF; Open; + EGA id]
“The administrative efforts to access private genetic data
exact a real cost and create a drag on research efforts
creating friction in the depositing, accessing, and analyzing
of data. With many academics risk averse and cost
conscious the time and effort often necessary to access this
data will cut down on potential research efforts.”
OICR Sequencing/Biocomputing Platform
• 5500 cores
• 185 nodes with 16 GB RAM
• 221 nodes with 24 GB RAM
• 32 nodes with 96 GB RAM
• 5 nodes with 256 GB RAM
• 2.5PB of online storage
• 1Gb, 10Gb and fibre connectivity
> 17 terabases per month
> 2,800 human genomes
capacity and growing
(70 genomes at 40X)
Sequencing instruments: Life Tech Solid 5500, GAII, Illumina HiSeq 2000, Pac Bio, Ion Torrent, MiSeq
John McPherson
OICR data analysis pipeline
• Like most genome/bioinformatics centers, we are
fully dependent on open-source (OS) NGS bioinformatics tools.
• We all depend on:
– SeqAnswers.com
– biostars.org
• Pipelines are necessary because they:
– Are more scalable
– Are more recordable
– Are more reproducible
– Are more robust
– … and can keep you sane!
http://seqware.github.com/
SeqWare: http://seqware.github.com/about/
What do we do to maximize good calls?
• Minimum coverage of tumor and germline for exome:
– 200x germline
– 150x tumor
• Minimum quality score
• Simultaneous alignment of reference, normal and
tumor
• Blacklist “bad” regions
• Remove suspiciously dense clusters of mutations
(perhaps too aggressive)
• Validate, validate, validate!
• Future ideas
– Assemble germline first, then align tumour to germline
– Build patient-specific blacklist
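Two of the checks above, minimum coverage and removal of dense mutation clusters, can be sketched as follows. The coverage thresholds come from the slide; the window size (100 bp), the cluster limit (3 mutations) and both function names are assumptions chosen only for illustration, not OICR's actual settings. Positions are assumed to lie on a single chromosome.

```python
from typing import List

# Minimum exome coverage from the slide.
MIN_COVERAGE = {"germline": 200, "tumor": 150}


def meets_min_coverage(sample_type: str, mean_coverage: float) -> bool:
    """Check a sample's mean coverage against the slide's minimum."""
    return mean_coverage >= MIN_COVERAGE[sample_type]


def drop_dense_clusters(positions: List[int],
                        window: int = 100,
                        max_in_window: int = 3) -> List[int]:
    """Drop mutations that fall in a suspiciously dense cluster.

    Any span of at most `window` bp holding more than `max_in_window`
    mutations is treated as a cluster, and all of its members are removed.
    Both defaults are hypothetical values.
    """
    pos = sorted(positions)
    flagged = set()
    start = 0
    for end in range(len(pos)):
        # Shrink the window from the left until it spans <= `window` bp.
        while pos[end] - pos[start] > window:
            start += 1
        if end - start + 1 > max_in_window:
            flagged.update(pos[start:end + 1])
    return [p for p in pos if p not in flagged]
```

As the slide notes, this kind of filter may be too aggressive: a genuine mutation hotspot would be discarded along with artefacts.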
Exome Sequencing Pipeline
Exome capture by Agilent SureSelect Human All Exon 50Mb Kits
Paired-end sequencing on Illumina HiSeq (150x coverage tumor, 200x coverage germline)
Align sequencing reads by Novoalign
Filter unmapped, non-primary and non-uniquely mapped reads by Samtools
Merge & collapse reads by Picard
Recalibrate quality score & perform local realignment by GATK
Call variants by GATK
Call somatic & germline mutations by in-house algorithm
Filter mutations
• with >5% frequency in dbSNP
• with strand biases
• in regions with segmental duplication & simple repeats
• that are false positives
Validate mutations by Ion Torrent
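The "Filter mutations" step can be sketched as a predicate over candidate calls. This is a minimal illustration, not the in-house algorithm: the `Call` record and its field names are hypothetical, and only the 5% dbSNP threshold is taken from the slide. The "false positives" criterion is not modelled because the slide does not say how those are identified.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical record for one candidate mutation."""
    gene: str
    dbsnp_freq: float          # frequency in dbSNP; 0.0 if not listed
    strand_bias: bool          # True if the call shows strand bias
    in_segdup_or_repeat: bool  # True if in a segmental duplication or simple repeat


def passes_filters(call: Call, max_dbsnp_freq: float = 0.05) -> bool:
    """Keep a call only if none of the slide's filter criteria apply."""
    return (
        call.dbsnp_freq <= max_dbsnp_freq  # filter calls with >5% frequency
        and not call.strand_bias
        and not call.in_segdup_or_repeat
    )


def filter_calls(calls):
    """Return the calls that survive all filters, in their original order."""
    return [c for c in calls if passes_filters(c)]
```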
Validation Strategy
• false positives
• false negatives
• Validation rate was an average of 87%
• No correlation between cellularity and validation rate
indicating that the pipeline calls SNVs accurately
irrespective of cellularity
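The two quantities behind these bullets, a per-specimen validation rate and its correlation with cellularity, can be computed as sketched below. The function names are hypothetical and the slide does not state which correlation statistic was used; Pearson's r is assumed here purely for illustration.

```python
from math import sqrt
from typing import Sequence


def validation_rate(validated: int, tested: int) -> float:
    """Fraction of tested calls that were confirmed on validation."""
    return validated / tested


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

Calling `pearson_r(cellularity, rates)` on per-specimen values would give a coefficient near zero if, as the slide reports, validation rate does not depend on cellularity.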
[Figure: Validation Rate plotted against Cellularity, both axes 0% to 100%]
Lincoln Stein
Mutation Frequency (after validation)
[Figure: histogram; axis labels: Number of Specimens, % Specimens]
+ 392 genes mutated in 1 specimen
Lincoln Stein
KRAS (mutated in 9 samples)
– Signal transduction for many growth factors.
– Activating G12{V,S,R,D,C,A} and
Q61{H,K,L,R} mutations common in cancers.
– Expected to be mutated in ~90% of pancreatic
cancers; we only see it in <30% of
primaries, but can find in nearly all tumors on
deep sequencing (false neg rate > 60%)
Lincoln Stein
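The false negative figure follows from the two percentages on the slide. As a worked check, treating the expected ~90% as the true mutation rate (an assumption) and taking 30% as the upper bound on what was observed:

```python
def false_negative_rate(expected_fraction: float, observed_fraction: float) -> float:
    """Share of expected mutation-positive tumors that were missed."""
    return 1.0 - observed_fraction / expected_fraction


# ~90% of pancreatic cancers expected to carry a KRAS mutation;
# fewer than 30% of primaries showed one in the initial screen.
kras_fnr = false_negative_rate(expected_fraction=0.90, observed_fraction=0.30)
print(f"KRAS false negative rate is at least {kras_fnr:.0%}")
```

Because the observed fraction is below 30%, the computed value is a lower bound, which is consistent with the slide's "false neg rate > 60%".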
Next Steps
• SNVs
– Deep sequencing of all primaries across all genes identified in
initial screen as carrying a mutant to characterize patterns of
mutation.
– Exome sequencing of remaining specimens, including
xenografts & cell lines.
– Lab is developing protocols for laser capture in order to
increase sample cellularity.
• Structural Variation
– Exhaustive benchmarking of SV calling pipelines in progress.
• Methylation
– Lab is testing protocols for bisulphite conversion sequencing
& MeDIP.
• Transcriptome
– RNA-seq of selected cell lines under way.
So, what next on analysis of our cancer samples?
• Doing better automation, and pipeline engineering
• We want to do more transcriptome, and integrate
better with other pipelines (SNV, CNV, SV and
epigenomic analyses).
• Formalize ICGC procedures and publish them.
• Need to consider genes that are not there (not
detected, or not able to be detected), and
transcriptome will help with this. Important for the
network analysis.
• Also need to build models
– That take into account low abundance and complexity of
samples with low cellularity
– That take into account the average of multiple samples (plan
for 350, but will there be tumor subtypes?)
– New project: Personal Human Proteome data
Data portal: http://dcc.icgc.org/
Acknowledgements
Project leaders at the OICR:
Tom Hudson
John McPherson
Lincoln Stein
Paul Boutros
Lakshmi Mutsawarma
Vincent Ferretti
ICGC Database Developers:
• Anthony Cros
• Jonathan Guberman
• Yong Liang
• Long Yao
• Shane Wilson
• Zhang Junjun
• Brian O’Connor
Ouellette Lab
• Emilie Chautard
• Michelle Brazas
• Nina Palikuca
WebDev group:
• Joseph Yamada
• Kamen Wu
• Miyuki Fukuma
• Salman Badr
• Stuart Lawler
Pipeline Dev. & Eval.
• Morgan Taschuk
• Peter Ruzanov
• Rob Denroche
• Zhibin Lu
ICGC DCC staff:
• Brett Whitty
• Marie Wong-Erasmus
http://oicr.on.ca http://icgc.org
Sequence Informatics
• Tim Beck
• Tony de Bat
• Zheng Zha
• Fouad Yousif
• Xuemei Luo
Pancreatic Analysis WG
• Carson Holt
• Irina Kalatskaya
• Christina Yung
• Kim Begley
• Adam Wright
SeqWare group
• Brian O’Connor
• Dennis Yean
• Yong Liang
ICGC DCC Curation is Hiring!
• We’re looking for people with a strong
genomics/bioinformatics background and
experience working with large genome projects
(with a web resource component)
Lots of data and lots of
great work to do!
francis@oicr.on.ca
Informatics and Biocomputing Program at the OICR
Pascale et Maya

Call Girls Service Jaipur Grishma WhatsApp ❤8445551418 VIP Call Girls JaipurCall Girls Service Jaipur Grishma WhatsApp ❤8445551418 VIP Call Girls Jaipur
Call Girls Service Jaipur Grishma WhatsApp ❤8445551418 VIP Call Girls Jaipur
 
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
 
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
 
Bangalore Call Girl Whatsapp Number 100% Complete Your Sexual Needs
Bangalore Call Girl Whatsapp Number 100% Complete Your Sexual NeedsBangalore Call Girl Whatsapp Number 100% Complete Your Sexual Needs
Bangalore Call Girl Whatsapp Number 100% Complete Your Sexual Needs
 
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore EscortsCall Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
 
Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...
 
Bangalore Call Girls Nelamangala Number 7001035870 Meetin With Bangalore Esc...
Bangalore Call Girls Nelamangala Number 7001035870  Meetin With Bangalore Esc...Bangalore Call Girls Nelamangala Number 7001035870  Meetin With Bangalore Esc...
Bangalore Call Girls Nelamangala Number 7001035870 Meetin With Bangalore Esc...
 
Call Girls Siliguri Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Siliguri Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Siliguri Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Siliguri Just Call 9907093804 Top Class Call Girl Service Available
 
Call Girls Kochi Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Kochi Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Kochi Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Kochi Just Call 9907093804 Top Class Call Girl Service Available
 
Lucknow Call girls - 8800925952 - 24x7 service with hotel room
Lucknow Call girls - 8800925952 - 24x7 service with hotel roomLucknow Call girls - 8800925952 - 24x7 service with hotel room
Lucknow Call girls - 8800925952 - 24x7 service with hotel room
 
Call Girls Visakhapatnam Just Call 9907093804 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 9907093804 Top Class Call Girl Service Ava...Call Girls Visakhapatnam Just Call 9907093804 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 9907093804 Top Class Call Girl Service Ava...
 
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
 
Top Rated Hyderabad Call Girls Erragadda ⟟ 6297143586 ⟟ Call Me For Genuine ...
Top Rated  Hyderabad Call Girls Erragadda ⟟ 6297143586 ⟟ Call Me For Genuine ...Top Rated  Hyderabad Call Girls Erragadda ⟟ 6297143586 ⟟ Call Me For Genuine ...
Top Rated Hyderabad Call Girls Erragadda ⟟ 6297143586 ⟟ Call Me For Genuine ...
 
Call Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service Available
 
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore EscortsVIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escorts
 

Share and Adapt Creative Works

  • 1. You are free to: Copy, share, adapt, or re-mix; Photograph, film, or broadcast; Blog, live-blog, or post video of; This presentation. Provided that: You attribute the work to its author and respect the rights and licenses associated with its components. Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero. Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at; http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
  • 3. The OICR and The International Cancer Genomics Consortium October 22, 2012 B.F. Francis Ouellette francis@oicr.on.ca • Associate Director, Informatics & Biocomputing, Ontario Institute for Cancer Research, Toronto, ON • Associate Professor, Department of Cell and Systems Biology, University of Toronto, Toronto, ON. @bffo
  • 4. Outline • OICR’s mission • ICGC’s goal • OICR and ICGC: Open Access/Open Source shop • ICGC: the DCC • OICR: Processing Cancer Genomes • You: getting access to the data
  • 5. OICR’s mission To build innovative research programs that will have an impact on the prevention, early detection, diagnosis and treatment of cancer.
  • 6. ICGC’s Goal: To obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumor types and/or subtypes which are of clinical and societal importance across the globe.
  • 7. Cancer: A Disease of the Genome. Challenge in Treating Cancer: • Every tumor is different • Every cancer patient is different
  • 8. 47 Projects 12 countries 23,408 tumor samples planned
  • 9. OICR Policies on Open Access Publication and Data Retention • To allow and promote access to research outputs funded by OICR, thus increasing the diffusion and impact of the research process. • All papers will be freely available through the internet within six (6) months of publication. • OICR will not violate the Publisher’s embargo policy on free access • OICR encourages OA publication, but is also developing an Institutional Repository (IR) where research output will be found
  • 10. ICGC – March 2012: Commitments for 22,179 tumor genomes! New: Saudi Arabia (Thyroid), South Korea (Breast), AU/UK/US (Mesothelioma). [Chart: cumulative committed tumor genomes over time, rising from 375 to 22,179.]
  • 11. Completeness of Data for Genomic Analysis Types in DCC Datasets (ICGC 10). [Figure: number of donors per data type: Copy Number Alterations, Structural Variation, Gene Expression, miRNA Expression, Simple Somatic Mutations, Splicing Variation, DNA Methylation.] (Brett Whitty)
  • 12. Completeness of Genomic Analysis Data Types in DCC Datasets. [Figure panels: miRNA Expression, Simple Somatic Mutations, Splicing Variation, DNA Methylation, Copy Number Alterations, Structural Variation, Gene Expression.] (Brett Whitty)
  • 13. Completeness of Genomic Analysis Data Types in DCC Datasets. [Figure panels: miRNA Expression, Simple Somatic Mutations, Splicing Variation, DNA Methylation, Copy Number Alterations, Structural Variation, Gene Expression.] (Brett Whitty)
  • 15. ICGC Data Categories. ICGC Open Access Datasets: • Cancer pathology (histologic type or subtype, histologic nuclear grade) • Donor (gender, age range) • RNA expression (normalized) • DNA methylation • Genotype frequencies • Somatic mutations (SNV, CNV and structural rearrangement). ICGC Controlled Access Datasets: • Detailed phenotype and outcome data (patient demography, risk factors, examination, surgery/drugs/radiation, sample/slide, specific histological features, protocol, analyte/aliquot) • Gene expression (probe-level data) • Raw genotype calls (germline) • Gene-sample identifier links • Genome sequence files. Most of the data in the portal is publicly available without restriction. However, access to some data, like the germline mutations, requires authorization by the Data Access Compliance Office (DACO).
  • 16. DACO/DCC User Data Access Process • Users approved through DACO are now automatically granted access to ICGC controlled access datasets available through the ICGC Data Portal and the EBI’s EGA repository. [Diagram components: DACO Web Application, DCC User Registry, DCC Data Portal, EBI EGA; labels: application approved by DACO, user accounts activated.]
  • 18. “The administrative efforts to access private genetic data exact a real cost and create a drag on research efforts creating friction in the depositing, accessing, and analyzing of data. With many academics risk averse and cost conscious the time and effort often necessary to access this data will cut down on potential research efforts.”
  • 19. OICR Sequencing/Biocomputing Platform. Compute: • 5500 cores • 185 nodes with 16 GB RAM • 221 nodes with 24 GB RAM • 32 nodes with 96 GB RAM • 5 nodes with 256 GB RAM • 2.5 PB of online storage • 1 Gb, 10 Gb and fibre connectivity. Sequencing: > 17 terabases per month; > 2,800 human genomes capacity and growing (70 genomes at 40X). Instruments: Life Tech SOLiD 5500, GAII, Illumina HiSeq 2000, PacBio, Ion Torrent, MiSeq. (John McPherson)
  • 20. OICR data analysis pipeline • Like most genome/bioinformatics centers, we are fully dependent on open-source (OS) NGS bioinformatics tools. • We all depend on: – SeqAnswers.com – biostars.org • Pipelines are necessary because they: – Are more scalable – Are more recordable – Are more reproducible – Are more robust – … and can keep you sane!
  • 24. What do we do to maximize good calls? • Minimal coverage of tumor and germline for exome: – 200x germline – 150x tumor • Minimum quality score • Simultaneous alignment of reference, normal and tumor • Blacklist “bad” regions • Remove suspiciously dense clusters of mutations (perhaps too aggressive) • Validate, validate, validate! • Future ideas – Assemble germline first, then align tumour to germline – Build patient-specific blacklist
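The "remove suspiciously dense clusters of mutations" step above can be sketched as a sliding-window filter. This is a minimal illustration only, not the OICR pipeline's implementation: the function name `remove_dense_clusters`, the 100 bp window and the threshold of 3 calls per window are all assumptions chosen for the example.

```python
from collections import defaultdict


def remove_dense_clusters(calls, window=100, max_in_window=3):
    """Drop every call that falls in a window holding too many calls.

    calls: iterable of (chromosome, position) tuples.
    window and max_in_window are hypothetical defaults, not values
    taken from the slides.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in calls:
        by_chrom[chrom].append(pos)

    kept = []
    for chrom, positions in by_chrom.items():
        positions.sort()
        flagged = set()
        left = 0
        for right, pos in enumerate(positions):
            # Shrink the window until it spans at most `window` bases.
            while pos - positions[left] > window:
                left += 1
            # Too many calls packed together: flag the whole window.
            if right - left + 1 > max_in_window:
                flagged.update(positions[left:right + 1])
        kept.extend((chrom, p) for p in positions if p not in flagged)
    return sorted(kept)


calls = [("chr1", 100), ("chr1", 110), ("chr1", 120), ("chr1", 130),
         ("chr1", 5000), ("chr2", 100)]
print(remove_dense_clusters(calls))
```

A stricter window or threshold removes more calls, which is the trade-off the slide hints at with "perhaps too aggressive".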
  • 25. Exome Sequencing Pipeline: Exome capture by Agilent SureSelect Human All Exon 50Mb Kits; Paired-end sequencing on Illumina HiSeq (150x coverage tumor, 200x coverage germline); Align sequencing reads by Novoalign; Filter unmapped, non-primary and non-uniquely mapped reads by Samtools; Merge & collapse reads by Picard; Recalibrate quality score & perform local realignment by GATK; Call variants by GATK; Call somatic & germline mutations by in-house algorithm; Filter mutations • with >5% frequency in dbSNP • with strand biases • in regions with segmental duplication & simple repeats • that are false positives; Validate mutations by Ion Torrent
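The "Filter mutations" step of the pipeline above can be sketched as a simple predicate over candidate calls. This is a hedged illustration: the `Call` record and its field names are hypothetical, and only the three criteria that the slide states concretely (dbSNP frequency above 5%, strand bias, segmental duplication or simple repeat regions) are modelled. How false positives are recognised is not described in the slides, so that criterion is left out.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical record for one candidate mutation."""
    gene: str
    dbsnp_freq: float          # frequency in dbSNP, 0.0 if not listed
    strand_biased: bool        # supporting reads skewed to one strand
    in_segdup_or_repeat: bool  # lies in a segmental duplication or simple repeat


def passes_filters(call, max_dbsnp_freq=0.05):
    """Return True if the call survives the three modelled filters."""
    if call.dbsnp_freq > max_dbsnp_freq:
        return False  # likely a common germline polymorphism
    if call.strand_biased:
        return False
    if call.in_segdup_or_repeat:
        return False
    return True


candidates = [
    Call("GENE_A", 0.00, False, False),
    Call("GENE_B", 0.12, False, False),  # common in dbSNP
    Call("GENE_C", 0.01, True, False),   # strand bias
    Call("GENE_D", 0.00, False, True),   # repeat region
]
surviving = [c.gene for c in candidates if passes_filters(c)]
print(surviving)
```

The gene names and values are made up for the example; calls that survive would then go on to validation.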
  • 26. Validation Strategy • false positives • false negatives • The validation rate averaged 87% • No correlation between cellularity and validation rate, indicating that the pipeline calls SNVs accurately irrespective of cellularity. [Plot: validation rate (0–100%) versus cellularity (0–100%).] (Lincoln Stein)
  • 27. [Histogram: mutation frequency (after validation); axes: number of specimens and % specimens.] + 392 genes mutated in 1 specimen (Lincoln Stein)
  • 28. KRAS (mutated in 9 samples) – Signal transduction for many growth factors. – Activating G12{V,S,R,D,C,A} and Q61{H,K,L,R} mutations common in cancers. – Expected to be mutated in ~90% of pancreatic cancers; we only see it in <30% of primaries, but can find in nearly all tumors on deep sequencing (false neg rate > 60%) Lincoln Stein
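The false-negative figure quoted for KRAS above follows from simple arithmetic. The sketch below assumes that the ~90% expected prevalence is the true rate and treats the observed 30% as an upper bound; the function name is hypothetical.

```python
def implied_false_negative_rate(expected_fraction, observed_fraction):
    """Fraction of truly mutated samples that the screen missed."""
    return 1.0 - observed_fraction / expected_fraction


# ~90% of pancreatic cancers expected to carry a KRAS mutation,
# fewer than 30% of primaries observed with one in the initial screen.
rate = implied_false_negative_rate(0.90, 0.30)
print(f"{rate:.1%}")  # 66.7%
```

Because the observed fraction is below 30%, the implied rate is at least this value, which is consistent with the "> 60%" stated on the slide.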
  • 29. Next Steps • SNVs – Deep sequencing of all primaries across all genes identified in initial screen as carrying a mutant to characterize patterns of mutation. – Exome sequencing of remaining specimens, including xenografts & cell lines. – Lab is developing protocols for laser capture in order to increase sample cellularity. • Structural Variation – Exhaustive benchmarking of SV calling pipelines in progress. • Methylation – Lab is testing protocols for bisulphite conversion sequencing & MeDIP. • Transcriptome – RNA-seq of selected cell lines under way.
  • 30. So, what next on analysis of our cancer samples? • Better automation and pipeline engineering. • We want to do more transcriptome work and integrate better with other pipelines (SNV, CNV, SV and epigenomic analyses). • Formalize ICGC procedures and publish them. • Need to consider genes that are not there (not detected, or not able to be detected); the transcriptome will help with this. Important for the network analysis. • Also need to build models – That take into account low abundance and complexity of samples with low cellularity – That take into account the average of multiple samples (plan for 350, but will there be tumor subtypes?) – New project: Personal Human Proteome data
  • 33. Acknowledgements. Project leaders at the OICR: Tom Hudson, John McPherson, Lincoln Stein, Paul Boutros, Lakshmi Mutsawarma, Vincent Ferretti. ICGC Database Developers: Anthony Cros, Jonathan Guberman, Yong Liang, Long Yao, Shane Wilson, Zhang Junjun, Brian O’Connor. Ouellette Lab: Emilie Chautard, Michelle Brazas, Nina Palikuca. WebDev group: Joseph Yamada, Kamen Wu, Miyuki Fukuma, Salman Badr, Stuart Lawler. Pipeline Dev. & Eval.: Morgan Taschuk, Peter Ruzanov, Rob Denroche, Zhibin Lu. ICGC DCC staff: Brett Whitty, Marie Wong-Erasmus. Sequence Informatics: Tim Beck, Tony de Bat, Zheng Zha, Fouad Yousif, Xuemei Luo. Pancreatic Analysis WG: Carson Holt, Irina Kalatskaya, Christina Yung, Kim Begley, Adam Wright. SeqWare group: Brian O’Connor, Dennis Yean, Yong Liang. http://oicr.on.ca http://icgc.org
  • 34. ICGC DCC Curation is Hiring! • We’re looking for people with a strong genomics/bioinformatics background and experience working with large genome projects (with a web resource component). Lots of data and lots of great work to do! francis@oicr.on.ca
  • 35. Informatics and Biocomputing Program at the OICR