1
“All of your answers are approximate,
you might as well live with it…”
2
Andrew Rau-Chaplin, 1½ hours ago
Integrated Rapid Infectious Disease Analysis
www.irida.ca
Rob Beiko
Faculty of Computer Science
Dalhousie University
June 12
Microbial genomics
for rapid investigation
of infectious disease
Image © Kenneth Todar
2009 and Influenza A
4
5
6
7
Influenza A
RNA genome (14,000 nucleotides)
Eight segments
(Image: Tao and Zheng, Science 2012)
S. Typhi CT18
DNA genome (~5,100,000 nucleotides)
One chromosome + two plasmids
Science (2001)
VIRUS BACTERIUM
8
Outbreak
investigation
Similarities: place, time, genetics
fda.gov
2014
2010-2013
Inns et al. (2015)
Outbreak investigation in Canada
9
NATIONAL MICROBIOLOGY LABORATORY
PROVINCIAL PUBLIC
HEALTH LABORATORIES
CLINICAL ISOLATES
SENTINEL SURVEILLANCE
(FoodNet Canada)
CLINICAL, FOOD,
ENVIRONMENTAL
CANADIAN FOOD
INSPECTION AGENCY
(Regulatory)
FOOD ISOLATES
LISTERIA - E. COLI O157:H7 - SALMONELLA - SHIGELLA
PFGE/MLVA
PUBLIC HEALTH ACTION
10
Pulsed Field Gel Electrophoresis
Serratia - NICU
Jang et al., J Hosp Infect (2001)
11
DNA Sequencing
1977: Sanger sequencing
< 1 kilobase per run
$2 / run, 1-3 hours (96 in parallel)
Tiny pieces (600 – 1000 bases)
2011: Illumina MiSeq
15 gigabases per run
$1000 - $1500 / run, 1 day
Tinier pieces (150 – 400 bases)
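The per-run figures on this slide can be turned into a rough cost-per-megabase and coverage comparison. The sketch below uses only the numbers quoted above plus the S. Typhi CT18 genome size from the earlier slide. Taking the 1 kilobase upper bound for Sanger and the midpoint of the MiSeq price range are assumptions made for illustration.

```python
# Figures quoted on the slides; the choice of upper bound / midpoint is an assumption.
SANGER_BASES_PER_RUN = 1_000            # "< 1 kilobase per run" (upper bound)
SANGER_COST_PER_RUN = 2.0               # "$2 / run"
MISEQ_BASES_PER_RUN = 15_000_000_000    # "15 gigabases per run"
MISEQ_COST_PER_RUN = 1_250.0            # midpoint of "$1000 - $1500 / run"
S_TYPHI_GENOME_SIZE = 5_100_000         # "~5,100,000 nucleotides"


def cost_per_megabase(cost_per_run: float, bases_per_run: int) -> float:
    """Dollars spent per million sequenced bases."""
    return cost_per_run / (bases_per_run / 1_000_000)


def fold_coverage(bases_per_run: int, genome_size: int, n_genomes: int = 1) -> float:
    """Average number of times each genome position is read in one run."""
    return bases_per_run / (genome_size * n_genomes)


sanger_cost_mb = cost_per_megabase(SANGER_COST_PER_RUN, SANGER_BASES_PER_RUN)
miseq_cost_mb = cost_per_megabase(MISEQ_COST_PER_RUN, MISEQ_BASES_PER_RUN)
coverage_one = fold_coverage(MISEQ_BASES_PER_RUN, S_TYPHI_GENOME_SIZE)
coverage_24 = fold_coverage(MISEQ_BASES_PER_RUN, S_TYPHI_GENOME_SIZE, n_genomes=24)

print(f"Sanger: ${sanger_cost_mb:,.2f} per megabase")
print(f"MiSeq:  ${miseq_cost_mb:,.2f} per megabase")
print(f"One S. Typhi genome per run: {coverage_one:,.0f}x coverage")
print(f"24 genomes per run:          {coverage_24:,.0f}x coverage each")
```

Under these assumptions Sanger comes out at roughly $2,000 per megabase and MiSeq at roughly $0.08, and a single MiSeq run covers one bacterial genome thousands of times over, which is why many isolates are usually pooled into one run.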
MiSeq projects at Dalhousie
• Bedford Basin microbial monitoring
• Pediatric Crohn’s disease samples
• Global microbial air sampling
• Mink genomes
• Sequencing Lactobacillus genomes from the poop of
old mice
• Wastewater diversity and function in the Arctic
• Verifying ingredients in dog food
• Exercise and the Microbiome
13
Integrated Rapid Infectious Disease Analysis
www.irida.ca
14
• 1.56M, 3-year Genome Canada Large-Scale Applied Platform Grant
• SFU / BCCDC / PHAC-NML / Dalhousie
• DNA sequencing and downstream applications
  - data management / federation
  - analysis workflows
  - ontologies
  - APIs
  - 3rd-party applications
• Implementation in provincial public health labs
• Training
15
Five Pillars of IRIDA
16
• Ontologies and data standards
• NCBI, MIxS, vegetables
• Metadata
• Data provenance
• Data quality
• Environmental information
Data sharing!
• BIG challenges – different jurisdictions,
“ownership” of epi data. Privacy!
• Health service providers – concerns about
privacy and data breach
• Technology outstrips policy
• What digital records could we get TODAY?
• Canada lagging in data sharing
17
18
• Calling isolates based on genetic variation
• Traditional:
  - Pulsed-field
  - Multi-locus (standards! mlst.net)
• Whole genomes:
  - Lots of information!
  - Too much information!
  - Lots of filtering and quality control required
19
• Workflow management
• REST-like API (3rd-party applications)
• Security: authentication / authorization
• Data models & implementation
Local Storage
Remote APIs
IRIDA’s Federated Design
List Samples
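The "List Samples" operation in a federated design can be pictured as one query fanned out over local storage and several remote APIs, with the results merged. The sketch below is an in-memory illustration only: the class names, the `list_samples` method and the sample records are all hypothetical and are not taken from the IRIDA code base or its REST API. A real remote source would issue an authenticated HTTP request instead of reading from a list.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol, Tuple


@dataclass(frozen=True)
class Sample:
    sample_id: str
    organism: str
    source: str  # name of the node that holds the record


class SampleSource(Protocol):
    """Anything that can list samples (hypothetical interface)."""

    def list_samples(self) -> Iterable[Sample]: ...


class LocalStorage:
    def __init__(self, name: str, records: List[Tuple[str, str]]):
        self.name = name
        self.records = records

    def list_samples(self) -> Iterable[Sample]:
        return [Sample(sid, org, self.name) for sid, org in self.records]


class RemoteApi:
    """Stand-in for another lab's instance; no network traffic happens here."""

    def __init__(self, name: str, records: List[Tuple[str, str]], reachable: bool = True):
        self.name = name
        self.records = records
        self.reachable = reachable

    def list_samples(self) -> Iterable[Sample]:
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        return [Sample(sid, org, self.name) for sid, org in self.records]


def list_all_samples(sources: Iterable[SampleSource]) -> Tuple[List[Sample], List[str]]:
    """Merge samples from every source; an unreachable node must not break the listing."""
    samples: List[Sample] = []
    failures: List[str] = []
    for src in sources:
        try:
            samples.extend(src.list_samples())
        except ConnectionError as exc:
            failures.append(str(exc))
    return samples, failures


sources = [
    LocalStorage("local", [("S-001", "Salmonella sp.")]),
    RemoteApi("lab-b", [("S-101", "Listeria sp.")]),
    RemoteApi("lab-c", [("S-201", "Shigella sp.")], reachable=False),
]
samples, failures = list_all_samples(sources)
for s in samples:
    print(s.source, s.sample_id, s.organism)
print("failed:", failures)
```

Keeping each record tagged with its source node is one way to leave data where it was generated while still offering a combined view, which fits the jurisdiction and ownership concerns raised on the data-sharing slide.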
20
21
• Each pipeline is implemented as a Galaxy workflow
• Internal analysis pipelines
• Assembly and annotation
• Phylogenetics
• “Line list” management
• 3rd-party applications
22
Single-Nucleotide Variant Phylogenetic Pipeline (SNVPhyl)
Sampled genomes → Quality control → Tree generation / visualization
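The step between quality control and tree building is, at its core, a count of single-nucleotide differences between isolates. The sketch below computes pairwise SNV distances over a toy alignment. It is not SNVPhyl's implementation: the function names and sequences are made up for illustration, and skipping positions that contain anything other than A, C, G or T is a simplifying assumption standing in for real quality filtering.

```python
from itertools import combinations
from typing import Dict, Tuple

VALID_BASES = set("ACGT")


def snv_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences differ.

    Positions with an ambiguous or missing base in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a, b)
        if x in VALID_BASES and y in VALID_BASES and x != y
    )


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Pairwise SNV distances, keyed by (isolate, isolate) in sorted order."""
    return {
        (p, q): snv_distance(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


# Toy alignment (hypothetical isolates); 'N' marks a low-quality call.
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACGAACGNAT",
}

distances = distance_matrix(alignment)
for pair, d in distances.items():
    print(pair, d)
```

A distance matrix like this can feed a tree-building method or a simple clustering threshold. Here isolate_2 sits one SNV from each of the other two isolates, which are themselves two SNVs apart.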
23
GenGIS
Data from Haiti cholera outbreak, 2010
http://kiwi.cs.dal.ca/GenGIS
IslandViewer
24
http://www.pathogenomics.sfu.ca/islandviewer/browse
25
• Interfaces / environment
• Personas
  - Researchers
  - Epidemiologists
  - Clinical microbiologists / lab technicians
• Workflow design and execution
Full Privileges
Cluster | Line List ID | Patient Name  | Prov. Health No. | Age | Sex | Location  | Sample ID | Collection Date | Culture Result
A       | 1            | John Smith    | 4513253244       | 26  | M   | Vancouver | F14231    | 14/03/21        | Salmonella sp.
A       | 2            | Sally Smith   | 4519567458       | 24  | F   | Vancouver | F14235    | 14/03/21        | Salmonella sp.
B       | 3            | Tom Jones     | 4517543216       | 35  | M   | Vancouver | M6542     | 14/03/24        | Salmonella sp.
B       | 4            | Helen Jones   | 9856321124       | 35  | F   | Vancouver | S1245     | 14/03/22        | Salmonella sp.
C       | 5            | Jennifer Lee  | 4516853122       | 29  | F   | Vancouver | S5642     | 14/03/22        | Salmonella sp.
C       | 6            | Michael Brown | 9456534561       | 45  | M   | Victoria  | T68954    | 14/03/25        | Salmonella sp.
Phylogenetic Tree
Genetic Distance
Limited Privileges
Cluster | Line List ID | Patient Name  | Prov. Health No. | Age | Sex | Location  | Sample ID | Collection Date | Culture Result
A       | 1            | John Smith    | 4513253244       | 26  | M   | Vancouver | F14231    | 14/03/21        | Salmonella sp.
A       | 2            | Sally Smith   | 4519567458       | 24  | F   | Vancouver | F14235    | 14/03/21        | Salmonella sp.
B       | 3            | Tom Jones     | 4517543216       | 35  | M   | Vancouver | M6542     | 14/03/24        | Salmonella sp.
B       | 4            | Helen Jones   | 9856321124       | 35  | F   | Vancouver | S1245     | 14/03/22        | Salmonella sp.
C       | 5            | Jennifer Lee  | 4516853122       | 29  | F   | Vancouver | S5642     | 14/03/22        | Salmonella sp.
C       | 6            | Michael Brown | 9456534561       | 45  | M   | Victoria  | T68954    | 14/03/25        | Salmonella sp.
Phylogenetic Tree
Genetic Distance
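The two line-list slides contrast what a fully privileged user and a limited user see. One way to picture that is a column filter applied before the line list is displayed. The sketch below is hypothetical: the extracted slide text does not say which columns the limited view hides, so treating patient name and provincial health number as the restricted fields is an assumption, as are the field and function names.

```python
from typing import Dict, List

# Assumption: these are the identifying columns hidden from limited users.
IDENTIFYING_FIELDS = {"patient_name", "health_no"}

LINE_LIST: List[Dict[str, str]] = [
    {"cluster": "A", "line_list_id": "1", "patient_name": "John Smith",
     "health_no": "4513253244", "age": "26", "sex": "M", "location": "Vancouver",
     "sample_id": "F14231", "collection_date": "14/03/21",
     "culture_result": "Salmonella sp."},
    {"cluster": "C", "line_list_id": "6", "patient_name": "Michael Brown",
     "health_no": "9456534561", "age": "45", "sex": "M", "location": "Victoria",
     "sample_id": "T68954", "collection_date": "14/03/25",
     "culture_result": "Salmonella sp."},
]


def view_for(rows: List[Dict[str, str]], full_privileges: bool) -> List[Dict[str, str]]:
    """Return a copy of the line list, dropping identifying columns for limited users."""
    if full_privileges:
        return [dict(row) for row in rows]
    return [
        {key: value for key, value in row.items() if key not in IDENTIFYING_FIELDS}
        for row in rows
    ]


full_view = view_for(LINE_LIST, full_privileges=True)
limited_view = view_for(LINE_LIST, full_privileges=False)
print(sorted(full_view[0]))
print(sorted(limited_view[0]))
```

Filtering on the server side, before any data leaves the system, means a limited user can still work with cluster assignments, sample IDs and the tree without the identifying fields ever reaching their client.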
Large-scale sequencing initiatives
28
en.wikipedia.org
FDA GenomeTrakr
29
http://www.fda.gov/Food/FoodScienceResearch/WholeGenomeSequencingProgramWGS/ucm363134.htm
Public Health England project
(>10,000 Salmonella so far)
• As of 2015, sequencing every sampled Salmonella
isolate collected in England
• Over 10,000 sequenced to date
• 8000 already available for download in the public
databases
30
Gary van Domselaar, NML
31
The Global Microbial Identifier
32
What’s next?
2015: Oxford Nanopore MinION
??? per run
$900 / run, 6 hours
Huge pieces (max so far – 200-300 kilobases)
Can stop / restart using same disposable flowcell
15 cm (-ish)
thehightechsociety.com
Quick et al. (2015)
“Using a novel streaming phylogenetic
placement method samples can be
assigned to a serotype in 40 minutes and
determined to be part of the outbreak in less
than 2 h.”
33
Ebola monitoring
34
blogs.biomedcentral.com
Joshua Quick, Nick Loman
Example workflow
35
Load DNA → Samples evaluated against reference in real time → Positive ID / placement
6 hrs → Change flowcell
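The idea of evaluating reads against a reference as they come off the sequencer, and stopping once an identification is secure, can be sketched with a toy classifier. This is not the streaming phylogenetic placement method of Quick et al.: it swaps in simple k-mer voting, and the reference sequences, thresholds and function names are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of references that contain it."""
    index: Dict[str, Set[str]] = {}
    for name, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(name)
    return index


def stream_assign(
    reads: Iterable[str],
    index: Dict[str, Set[str]],
    k: int,
    min_votes: int = 3,
    min_margin: int = 2,
) -> Tuple[Optional[str], int]:
    """Consume reads one at a time; stop as soon as one reference clearly leads.

    Returns (assigned reference or None, number of reads consumed).
    """
    votes: Counter = Counter()
    consumed = 0
    for read in reads:
        consumed += 1
        for km in kmers(read, k):
            owners = index.get(km, set())
            if len(owners) == 1:  # only k-mers unique to one reference vote
                votes[next(iter(owners))] += 1
        ranked = votes.most_common(2)
        if ranked:
            best, best_votes = ranked[0]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            if best_votes >= min_votes and best_votes - runner_up >= min_margin:
                return best, consumed
    return None, consumed


# Hypothetical reference sequences and reads, for illustration only.
references = {
    "serotype_A": "ACGTTGCAAGCTTAGC",
    "serotype_B": "TTGACCGGTATCGGAA",
}
index = build_index(references, k=5)

result = stream_assign(["GTTGCAAG", "CAAGCTTA"], index, k=5)
unmatched = stream_assign(["GGGGGGGG"], index, k=5)
print(result)
print(unmatched)
```

Returning the number of reads consumed makes the early-stop behaviour visible: a confident call can be reported while the run is still in progress, and an unmatched stream simply ends without an assignment.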
Challenges
• Sample extraction: getting DNA from stuff
• Clinical-grade evaluation
• Training
• Equipment reliability
• Sequencing errors
• Quality of reference data / attribution algorithms
• Database updates in real time
• Ethics / privacy (Genomes Sequenced While U Wait)
36
The Point
37
Comprehensive monitoring
Accurate typing
Rapid identification
Real-time decision making
Acknowledgements
PIs
Fiona Brinkman – SFU
Will Hsiao – PHMRL
Gary Van Domselaar – NML
Morag Graham - NML
Rob Beiko – Dalhousie
University of Lisbon
João Carriço
National Microbiology Laboratory (NML)
Franklin Bristow
Aaron Petkau
Thomas Matthews
Josh Adam
Adam Olsen
Tara Lynch
Shaun Tyler
Philip Mabon
Philip Au
Celine Nadon
Matthew Stuart-Edwards
Chrystal Berry
Lorelee Tschetter
Laboratory for Foodborne Zoonoses (LFZ)
Eduardo Taboada
Peter Kruczkiewicz
Chad Laing
Vic Gannon
Matthew Whiteside
Ross Duncan
Steven Mutschall
Simon Fraser University (SFU)
Melanie Courtot
Emma Griffiths
Geoff Winsor
Julie Shay
Matthew Laird
Bhav Dhillon
Raymond Lo
BC Public Health Microbiology &
Reference Laboratory (PHMRL) and BC
Centre for Disease Control (BCCDC)
Judy Isaac-Renton
Patrick Tang
Natalie Prystajecky
Jennifer Gardy
Damion Dooley
Linda Hoang
Kim MacDonald
Yin Chang
Eleni Galanis
Marsha Taylor
Cletus D’Souza
Ana Paccagnella
University of Maryland
Lynn Schriml
Canadian Food Inspection Agency (CFIA)
Burton Blais
Catherine Carrillo
Dominic Lambert
Dalhousie University
Alex Keddy
McMaster University
Andrew McArthur
Daim Sardar
European Nucleotide Archive
Guy Cochrane
Petra ten Hoopen
Clara Amid
European Food Safety Authority
Leibana Criado Ernesto
Vernazza Francesco
Rizzi Valentina
39
Seminar from Will Hsiao,
BC Centre for Disease Control
40
Materials to be available on
http://bioinformatics.ca/
June 24-26, 2015
The Bioinformatics Exam of the Future
41
tagc.com.au
commons.wikimedia.org/wiki/File:DNA_ahelatest_moodustunud_niit_katsuti_korgil..JPG
http://omicfrontiers.com/2014/06/11/diaryofaminion_part2/
2009 was a long time ago
42
J. Craig Venter Institute
Photo credit: Emma Allen-Vercoe
Some slides courtesy of Gary Van Domselaar, NML

2015 06-12-beiko-irida-big data

  • 1. 1
  • 2. “All of your answers are approximate, you might as well live with it…” 2 Andrew Rau-Chaplin, 1½ hours ago
  • 3. Integrated Rapid Infectious Disease Analysis www.irida.ca Rob Beiko Faculty of Computer Science Dalhousie University June 12 Microbial genomics for rapid investigation of infectious disease Image © Kenneth Todar
  • 5. 5
  • 6. 6
  • 7. 7 Influenza A RNA genome (14,000 nucleotides) Eight segments (Image: Tao and Zheng, Science 2012) S. Typhi CT18 DNA genome (~5,100,000 nucleotides) One chromosome + two plasmids Science (2001) VIRUS BACTERIUM
  • 8. 8 Outbreak investigation Similarities: place, time, genetics fda.gov 2014 2010-2013 Inns et al. (2015)
  • 9. Outbreak investigation in Canada 9 NATIONAL MICROBIOLOGY LABORATORY PROVINCIAL PUBLIC HEALTH LABORATORIES CLINICAL ISOLATES SENTINEL SURVEILLANCE (FoodNet Canada) CLINICAL, FOOD, ENVIRONMENTAL CANADIAN FOOD INSPECTION AGENCY (Regulatory) FOOD ISOLATES LISTERIA - E. COLI O157:H7 - SALMONELLA - SHIGELLA PFGE/MLVA PUBLIC HEALTH ACTION
  • 10. 10 Pulsed Field Gel Electrophoresis Serratia - NICU Jang et al., J Hosp Infect (2001)
  • 11. 11 15 gigabases per run $1000 - $1500 / run, 1 day Tinier pieces (150 – 400 bases) < 1 kilobase per run $2 / run, 1-3 hours (96 in parallel) Tiny pieces (600 – 1000 bases) 2011: Illumina MiSeq1977: Sanger sequencing ( ) DNA Sequencing
  • 13. MiSeq projects at Dalhousie • Bedford Basin microbial monitoring • Pediatric Crohn’s disease samples • Global microbial air sampling • Mink genomes • Sequencing Lactobacillus genomes from the poop of old mice • Wastewater diversity and function in the Arctic • Verifying ingredients in dog food ( ) • Exercise and the Microbiome 13
  • 14. Integrated Rapid Infectious Disease Analysis www.irida.ca 14  1.56M, 3-year Genome Canada Large-Scale Applied Platform Grant  SFU / BCCDC / PHAC-NML / Dalhousie  DNA sequencing and downstream applications • data management / federation • analysis workflows • ontologies • APIs • 3rd-party applications  Implementation in provincial public health labs  Training
  • 16. 16  Ontologies and data standards  NCBI, MiXS, vegetables  Metadata  Data provenance  Data quality  Environmental information
  • 17. Data sharing! • BIG challenges – different jurisdictions, “ownership” of epi data. Privacy! • Health service providers – concerns about privacy and data breach • Technology outstrips policy • What digital records could we get TODAY? • Canada lagging in data sharing 17
  • 18. 18  Calling isolates based on genetic variation  Traditional:  Pulsed-field  Multi-locus (standards! mlst.net)  Whole genomes:  Lots of information!  Too much information!  Lots of filtering and quality control required
  • 19. 19  Workflow management  REST-like API (3rd – party applications)  Security: authentication / authorization  Data models & implementation
  • 20. Local Storage Remote APIs IRIDA’s Federated Design List Samples 20
  • 21. 21  Each pipeline is implemented as a Galaxy workflow  Internal analysis pipelines  Assembly and annotation  Phylogenetics  “Line list” management  3rd-party applications
  • 22. 22 Sampled genomes Quality control Tree generation / visualization Single-Nucleotide Variant Phylogenetic Pipeline (SNVPhyl)
  • 23. 23 GenGIS Data from Haiti cholera outbreak, 2010 http://kiwi.cs.dal.ca/GenGIS
  • 25. 25  Interfaces / environment  Personas  Researchers  Epidemiologists  Clinical microbiologists / lab technicians  Workflow design and execution
  • 26. Full Privileges Cluster Line List ID Patient Name Prov. Health No. Age Sex Location Sample ID Collection Date Culture Result A 1 John Smith 4513253244 26 M Vancouver F14231 14/03/21 Salmonella sp. A 2 Sally Smith 4519567458 24 F Vancouver F14235 14/03/21 Salmonella sp. B 3 Tom Jones 4517543216 35 M Vancouver M6542 14/03/24 Salmonella sp. B 4 Helen Jones 9856321124 35 F Vancouver S1245 14/03/22 Salmonella sp. C 5 Jennifer Lee 4516853122 29 F Vancouver S5642 14/03/22 Salmonella sp. C 6 Michael Brown 9456534561 45 M Victoria T68954 14/03/25 Salmonella sp. Phylogenetic Tree Genetic Distance
  • 27. Limited Privileges • Panels: Line List, Phylogenetic Tree, Genetic Distance
      Cluster | ID | Patient Name | Prov. Health No. | Age | Sex | Location | Sample ID | Collection Date | Culture Result
      A | 1 | John Smith | 4513253244 | 26 | M | Vancouver | F14231 | 14/03/21 | Salmonella sp.
      A | 2 | Sally Smith | 4519567458 | 24 | F | Vancouver | F14235 | 14/03/21 | Salmonella sp.
      B | 3 | Tom Jones | 4517543216 | 35 | M | Vancouver | M6542 | 14/03/24 | Salmonella sp.
      B | 4 | Helen Jones | 9856321124 | 35 | F | Vancouver | S1245 | 14/03/22 | Salmonella sp.
      C | 5 | Jennifer Lee | 4516853122 | 29 | F | Vancouver | S5642 | 14/03/22 | Salmonella sp.
      C | 6 | Michael Brown | 9456534561 | 45 | M | Victoria | T68954 | 14/03/25 | Salmonella sp.
  • 30. Public Health England project (>10,000 Salmonella so far) • As of 2015, sequencing every sampled Salmonella isolate collected in England • Over 10,000 sequenced to date • 8,000 already available for download in the public databases
  • 31. The Global Microbial Identifier • Gary Van Domselaar, NML
  • 32. What’s next? • 2015: Oxford Nanopore MinION • ??? per run • $900 / run, 6 hours • Huge pieces (max so far: 200-300 kilobases) • Can stop / restart using same disposable flowcell • 15 cm (-ish) • Image: thehightechsociety.com
  • 33. Quick et al. (2015): “Using a novel streaming phylogenetic placement method samples can be assigned to a serotype in 40 minutes and determined to be part of the outbreak in less than 2 h.”
  • 35. Example workflow • 6 hrs • Change flowcell • Samples evaluated against reference in real time • Positive ID / placement • Load DNA
  • 36. Challenges • Sample extraction: getting DNA from stuff • Clinical-grade evaluation • Training • Equipment reliability • Sequencing errors • Quality of reference data / attribution algorithms • Database updates in real time • Ethics / privacy (Genomes Sequenced While U Wait)
  • 37. The Point • Comprehensive monitoring • Accurate typing • Rapid identification • Real-time decision making
  • 38. Acknowledgements
      PIs: Fiona Brinkman (SFU), Will Hsiao (PHMRL), Gary Van Domselaar (NML), Morag Graham (NML), Rob Beiko (Dalhousie)
      University of Lisbon: João Carriço
      National Microbiology Laboratory (NML): Franklin Bristow, Aaron Petkau, Thomas Matthews, Josh Adam, Adam Olsen, Tara Lynch, Shaun Tyler, Philip Mabon, Philip Au, Celine Nadon, Matthew Stuart-Edwards, Chrystal Berry, Lorelee Tschetter
      Laboratory for Foodborne Zoonoses (LFZ): Eduardo Taboada, Peter Kruczkiewicz, Chad Laing, Vic Gannon, Matthew Whiteside, Ross Duncan, Steven Mutschall
      Simon Fraser University (SFU): Melanie Courtot, Emma Griffiths, Geoff Winsor, Julie Shay, Matthew Laird, Bhav Dhillon, Raymond Lo
      BC Public Health Microbiology & Reference Laboratory (PHMRL) and BC Centre for Disease Control (BCCDC): Judy Isaac-Renton, Patrick Tang, Natalie Prystajecky, Jennifer Gardy, Damion Dooley, Linda Hoang, Kim MacDonald, Yin Chang, Eleni Galanis, Marsha Taylor, Cletus D’Souza, Ana Paccagnella
      University of Maryland: Lynn Schriml
      Canadian Food Inspection Agency (CFIA): Burton Blais, Catherine Carrillo, Dominic Lambert
      Dalhousie University: Alex Keddy
      McMaster University: Andrew McArthur, Daim Sardar
      European Nucleotide Archive: Guy Cochrane, Petra ten Hoopen, Clara Amid
      European Food Safety Agency: Leibana Criado Ernesto, Vernazza Francesco, Rizzi Valentina
  • 39. Seminar from Will Hsiao, BC Centre for Disease Control
  • 40. Materials to be available on http://bioinformatics.ca/ • June 24-26, 2015
  • 41. The Bioinformatics Exam of the Future • Image sources: tagc.com.au; commons.wikimedia.org/wiki/File:DNA_ahelatest_moodustunud_niit_katsuti_korgil..JPG; http://omicfrontiers.com/2014/06/11/diaryofaminion_part2/
  • 42. 2009 was a long time ago • J. Craig Venter Institute
  • 43. Photo credit: Emma Allen-Vercoe • Some slides courtesy of Gary Van Domselaar, NML
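
Slides 26 and 27 show the same line list under full and limited privileges. The transcript does not record which columns the limited view actually hides, so the following is only a minimal sketch of the idea, assuming the patient name and provincial health number are the restricted fields. The function name `view_line_list`, the field names, and the `[hidden]` placeholder are all hypothetical and are not taken from IRIDA.

```python
# Assumption: these are the identifying columns. The slides do not say
# which columns a limited-privilege user is prevented from seeing.
SENSITIVE_FIELDS = {"patient_name", "health_no"}


def view_line_list(rows, full_privileges):
    """Return a copy of the line list suitable for the caller's privilege level.

    Full privileges: every field is returned unchanged.
    Limited privileges: sensitive fields are replaced by a placeholder.
    """
    if full_privileges:
        return [dict(row) for row in rows]
    return [
        {key: ("[hidden]" if key in SENSITIVE_FIELDS else value)
         for key, value in row.items()}
        for row in rows
    ]


# One row taken from the mock line list on slide 26.
line_list = [
    {
        "cluster": "A",
        "id": 1,
        "patient_name": "John Smith",
        "health_no": "4513253244",
        "location": "Vancouver",
        "sample_id": "F14231",
        "culture_result": "Salmonella sp.",
    }
]

limited_view = view_line_list(line_list, full_privileges=False)
full_view = view_line_list(line_list, full_privileges=True)
```

Masking rather than dropping the columns keeps the table shape identical for both audiences, which is one possible design choice; a real system might instead omit the fields entirely.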

Editor's notes

  1. The central issue facing bioinformaticians today can be summed up quite nicely with this graph charting the cost of generating biological sequencing data and the associated cost of computing on this data. The white line at the top represents Moore’s law, which describes the long-term trend towards decreased computing cost over time. It’s named after Gordon Moore, a co-founder of Intel, who first described the trend over 50 years ago. It derives from the observation that the number of components that can be crammed into an integrated circuit, like a CPU, approximately doubles every year to a year and a half, which translates into the cost of computing decreasing by half over the same time period. The trend has held steady for five decades and is expected to continue at this rate for at least another 10 years. The cost of generating biological sequence data approximately followed this same trend, but the trend was upset by the introduction of next-generation sequencing technology near the end of 2005, followed by its widespread adoption in biotechnology over the following two years. From then on the cost of generating biological sequence data has fallen dramatically, to the point that today any microbiology lab can afford to routinely generate the sequences of the organisms that they study. This drastic reduction in the cost and time to generate biological sequence data stands to revolutionize public health research, and Morag’s presentation provides some nice examples of this. So biologists look at this line and rejoice, but bioinformatics scientists look at the gap between this line and this line, and, well, we panic.
  2. When a user requests the samples that are available for a project, that installation will query its local storage for what it has there, then also go out to the remote APIs and ask what they can provide. Each remote API will decide from the request what the user has permission to see, and return those samples to the caller.
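
The federated lookup described in the note above can be sketched roughly as follows. This is an illustrative model only: the `Sample` and `Installation` classes, their method names, and the in-memory permission table are assumptions made for the example and do not reflect IRIDA's actual data model or REST API. Permission checks on the local installation are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sample:
    sample_id: str
    project: str
    source: str  # name of the installation that holds the record


@dataclass
class Installation:
    """Hypothetical stand-in for one installation in a federation."""
    name: str
    samples: list
    # project name -> set of user names allowed to see that project here
    permissions: dict = field(default_factory=dict)
    # other installations this one can query
    remotes: list = field(default_factory=list)

    def local_samples(self, project):
        """Query local storage only."""
        return [s for s in self.samples if s.project == project]

    def handle_remote_request(self, user, project):
        """Runs on the remote side: it decides what the user may see."""
        if user not in self.permissions.get(project, set()):
            return []
        return self.local_samples(project)

    def list_samples(self, user, project):
        """Local results first, then whatever each remote agrees to return."""
        results = self.local_samples(project)
        for remote in self.remotes:
            results.extend(remote.handle_remote_request(user, project))
        return results


# Two made-up installations: site-b shares project "demo" with alice only.
site_b = Installation(
    "site-b",
    [Sample("B1", "demo", "site-b")],
    permissions={"demo": {"alice"}},
)
site_a = Installation(
    "site-a",
    [Sample("A1", "demo", "site-a")],
    remotes=[site_b],
)

alice_ids = [s.sample_id for s in site_a.list_samples("alice", "demo")]
bob_ids = [s.sample_id for s in site_a.list_samples("bob", "demo")]
```

The point of the design is that the authorization decision stays with the installation that owns the data: the caller only merges whatever each remote chooses to return.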