ECCMID Meet the Expert 2015

João André Carriço
jcarrico@fm.ul.pt
Twitter: @jacarrico

• Microbial typing: discriminating strains below the species/subspecies level
• Genomics: antibiotic resistance/virulence factor gene presence/absence, mobile genetic element detection

http://en.wikipedia.org/wiki/File:ChronicleOfADeathForetold.JPG
• WGS in molecular typing:
  • Gene-by-gene: wgMLST, cgMLST, rMLST, MLST, eMLST, MLST+
  • SNP comparison approaches: comparison with reference strains
• Ability to recover most of the present sequence-based typing information in a single experimental procedure

The Ideal Scenario
Microbiological Sample -> Magic Box of NGS Wonders for Microbiology -> Completely characterized strain:
• Antibiotic resistance profile
• Multilocus Sequence Typing (MLST)
• Virulence factors present
• Other SBTM information, e.g.:
  • spa (S. aureus)
  • emm (Group A Streptococcus)
Desired end result:
Risk assessment of the strain and useful application of the data to clinical practice
Comparison between groups of strains

https://pmcvariety.files.wordpress.com/2014/06/eli-wallach-dead-good-bad-ugly.jpg?w=670&h=377&crop=1

My goals / areas that I want to apply WGS to:
• Microbial population structure
• Microbial evolution
• Microbial genomics: gene structure, genome synteny, mobile genetic element detection
My toolbox is chosen based on my questions and what I want to do!
Trying to avoid:
“I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.” - Abraham H. Maslow (1966), The Psychology of Science

Sequence QA/QC
FastQC
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Adaptor and quality trimming:
Trimmomatic
http://www.usadellab.org/cms/?page=trimmomatic
Assembly
SPAdes
http://bioinf.spbau.ru/spades
Velvet
http://www.ebi.ac.uk/~zerbino/velvet/
Mapping
Bowtie2
http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Annotation:
Prokka
http://www.vicbioinformatics.com/software.prokka.shtml
Whole genome comparison
BRIG (BLAST Ring Image Generator)
http://brig.sourceforge.net/
MAUVE
http://darlinglab.org/mauve/mauve.html

http://rugbyea.com/wp-content/uploads/2013/05/blast.jpg
http://www.ecohealthypets.com/writable/pet_report_photos/photo/480x/ball_python_2.jpg

- Perform the same analysis over tens, hundreds or thousands of strains: your own and publicly available ones
- Integrate multiple analyses in a single pipeline
- Pipelines = reproducibility (if not, something is very wrong)
http://www.ebi.ac.uk/ena
http://www.ncbi.nlm.nih.gov/sra
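
A minimal sketch of what "multiple analyses in a single pipeline" can look like, chaining the toolbox programs listed earlier (FastQC, Trimmomatic, SPAdes, Prokka) for one paired-end sample. It only builds the command lines and does not run them. The function name, the output folder layout, the trimming thresholds, and the assumption that Trimmomatic is callable through a `trimmomatic` wrapper (as in some package-manager installs) are illustrative choices, not part of the talk.

```python
from pathlib import Path


def build_pipeline(sample, reads_1, reads_2, adapters, out_root="results"):
    """Return the ordered command lines for one paired-end sample.

    Hypothetical layout: everything goes under <out_root>/<sample>/.
    The output folders are assumed to exist before the commands are run.
    """
    out = Path(out_root) / sample
    trimmed = out / "trimmed"
    # Trimmomatic PE writes a paired and an unpaired file for each mate.
    r1_paired = (trimmed / f"{sample}_1P.fastq.gz").as_posix()
    r1_unpaired = (trimmed / f"{sample}_1U.fastq.gz").as_posix()
    r2_paired = (trimmed / f"{sample}_2P.fastq.gz").as_posix()
    r2_unpaired = (trimmed / f"{sample}_2U.fastq.gz").as_posix()
    assembly_dir = (out / "spades").as_posix()

    return [
        # 1. Read quality report
        ["fastqc", "-o", (out / "fastqc").as_posix(), reads_1, reads_2],
        # 2. Adaptor and quality trimming (thresholds are illustrative)
        ["trimmomatic", "PE", reads_1, reads_2,
         r1_paired, r1_unpaired, r2_paired, r2_unpaired,
         f"ILLUMINACLIP:{adapters}:2:30:10", "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # 3. De novo assembly of the trimmed paired reads
        ["spades.py", "-1", r1_paired, "-2", r2_paired, "-o", assembly_dir],
        # 4. Annotation of the assembled contigs
        ["prokka", "--outdir", (out / "prokka").as_posix(),
         "--prefix", sample, f"{assembly_dir}/contigs.fasta"],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("strain01", "strain01_R1.fastq.gz",
                              "strain01_R2.fastq.gz", "adapters.fa"):
        print(" ".join(cmd))
```

Looping this function over a list of sample names, and passing each command to `subprocess.run(cmd, check=True)`, would apply identical steps and parameters to every strain, which is the reproducibility point made above. Recording the software versions alongside the commands is advisable.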

• Gene-by-gene / extended MLST approaches are my favorite
• Why?
  • Allele-based classification “buffers” the effect of recombination in the analysis
  • Stable nomenclature for alleles facilitates data exchange by schema creation
  • Easy to expand and visualize up to thousands of genomes with MST-like approaches
  • Lower computing requirements
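
To illustrate the points above, here is a small sketch of an allele-based distance and a plain minimum spanning tree (Prim's algorithm) over allelic profiles. Because each locus contributes at most 1 to the distance, a recombination event that introduces many SNPs in one gene still counts as a single allelic difference. The profile data and function names are hypothetical, and this is a generic MST, not the goeBURST algorithm with its tie-break rules.

```python
def allelic_distance(profile_a, profile_b):
    """Number of loci at which two allelic profiles differ."""
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def minimum_spanning_tree(profiles):
    """Prim's algorithm over all pairwise allelic distances.

    profiles: dict mapping strain name -> tuple of allele numbers.
    Returns a list of (strain_in_tree, new_strain, distance) edges.
    Ties are broken by strain name so the result is deterministic.
    """
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge that connects the tree to a strain outside it.
        dist, u, v = min(
            (allelic_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


if __name__ == "__main__":
    # Hypothetical 4-locus profiles
    example = {
        "A": (1, 1, 1, 1),
        "B": (1, 1, 1, 2),
        "C": (1, 1, 3, 2),
        "D": (4, 4, 4, 4),
    }
    for edge in minimum_spanning_tree(example):
        print(edge)
```

This naive version recomputes distances on every iteration, which is fine for a handful of profiles; for thousands of genomes a precomputed distance matrix or a dedicated tool is the more realistic choice.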

• Bacterial Isolate Genome Sequence Database (BIGSdb)
  • Jolley & Maiden 2010, BMC Bioinformatics 11:595 - http://pubmlst.org/software/database/bigsdb/
  • PROs: Freely available, open-source, handles thousands of genomes, has MLST schemas implemented for several bacterial species, plus some extended MLST and core genome MLST schemas (mainly Neisseria sp., but soon to be expanded)
  • CONs: Requires Perl knowledge to install and maintain
• Ridom SeqSphere+
  • http://www.ridom.com/seqsphere/
  • Commercial software with client-server solutions from assembly to allele calling and visualization for core genome MLST (MLST+/cgMLST)
• Applied Maths - BioNumerics 7.5
  • http://www.applied-maths.com/news/bionumerics-version-75-released
  • Commercial software with client-server solutions from assembly to allele calling and visualization for whole genome MLST (wgMLST)

Schema = set of loci to be used
What is a locus?
A gene or part of a gene
How to choose the loci:
1. Start from reference genomes
2. Decide if you want core genes only or core + accessory genes
3. Use a method to compare the CDS/ORFs of the reference genomes:
   a. OrthoMCL - www.orthomcl.org
   b. CD-HIT - cd-hit.org
4. Parse the output to:
   a. Remove paralogous genes
   b. Decide which are core genes and which are accessory genes
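
Step 4 can be sketched as follows, assuming the clustering output has already been parsed into a dictionary of locus -> list of (genome, CDS id) pairs. That input structure, the function name and the `core_fraction` parameter are hypothetical; the real OrthoMCL and CD-HIT output formats need their own parsers.

```python
def classify_loci(clusters, genomes, core_fraction=1.0):
    """Split clustered loci into core, accessory and paralogous sets.

    clusters: dict mapping locus name -> list of (genome, cds_id) pairs.
    genomes: list of all reference genome names.
    core_fraction: fraction of genomes that must carry a locus for it
        to count as core (1.0 = present in every genome).
    """
    core, accessory, paralogous = [], [], []
    for locus, members in sorted(clusters.items()):
        copies = {}
        for genome, _cds_id in members:
            copies[genome] = copies.get(genome, 0) + 1
        if any(count > 1 for count in copies.values()):
            # More than one copy in a genome: treat as paralogous, remove.
            paralogous.append(locus)
        elif len(copies) >= core_fraction * len(genomes):
            core.append(locus)
        else:
            accessory.append(locus)
    return core, accessory, paralogous


if __name__ == "__main__":
    genomes = ["g1", "g2", "g3"]
    clusters = {
        "locusA": [("g1", "a1"), ("g2", "a2"), ("g3", "a3")],
        "locusB": [("g1", "b1"), ("g2", "b2")],
        "locusC": [("g1", "c1"), ("g1", "c2"), ("g2", "c3"), ("g3", "c4")],
    }
    print(classify_loci(clusters, genomes))
```

Lowering `core_fraction` slightly below 1.0 is one way to tolerate draft assemblies in which a gene is missing because of an assembly gap; whether that is appropriate depends on the question being asked.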

At this point different algorithms/software use:
- BLAST (n/p/x)
- Different criteria and parameters to call an allele as a coding sequence or part of a coding sequence

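[Flowchart: BLAST Score Ratio (BSR) based allele calling]
Gene file: translate gene file to protein -> self BLAST -> calculate BSR
Genome: run Prodigal on genome -> translate CDS to protein -> gene BLAST database
BLAST each gene against the gene BLAST database and calculate BSR:
- No BLAST match or BSR <= 0.6: LNF
- BSR = 1 and same DNA sequence: Exact Match
- BSR > 0.6 and LOT: LOT
- BSR > 0.6 otherwise: add new allele to gene file, calculate BSR of the new allele, re-do gene BLAST database: Inferred Allele
Result: allelic profile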
Prodigal: Prokaryotic Dynamic Programming Gene-finding Algorithm
BSR: BLAST Score Ratio
LOT: Locus On the Tip (of a contig)
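
The decision logic of the flowchart can be sketched as below. The BSR is taken here as the BLAST score of the hit divided by the score of the reference allele against itself, and the 0.6 threshold comes from the slide. The function names and the order of the checks are one reading of the flowchart, and "LNF" is assumed to stand for a locus that was not found; none of this is the author's actual code.

```python
def blast_score_ratio(hit_score, self_score):
    """BSR: score of the query hit relative to the reference self-hit."""
    if self_score <= 0:
        raise ValueError("self_score must be positive")
    return hit_score / self_score


def classify_locus(bsr, same_dna=False, on_contig_tip=False, threshold=0.6):
    """Classify one locus in one genome.

    bsr: BLAST Score Ratio of the best hit, or None if there was no hit.
    same_dna: True if the hit has exactly the same DNA sequence as a
        known allele.
    on_contig_tip: True if the hit runs off the end of a contig.
    """
    if bsr is None or bsr <= threshold:
        return "LNF"       # assumed meaning: locus not found
    if bsr == 1 and same_dna:
        return "EXACT"     # known allele, identical DNA sequence
    if on_contig_tip:
        return "LOT"       # locus on the tip of a contig
    return "INFERRED"      # new allele, to be added to the gene file


if __name__ == "__main__":
    ratio = blast_score_ratio(450, 500)
    print(ratio, classify_locus(ratio))
```

In a full workflow, an "INFERRED" result would also trigger the steps shown in the flowchart: adding the new allele to the gene file, computing its own BSR reference score, and rebuilding the BLAST database.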

Core Genome addressing synteny:

Core Genome Addressing synteny and paralogy:

http://www.phyloviz.net
Open source and Freely available!

Can be easily applied to:
- MLST
- MLVA
- SNP data*
- Gene Presence/absence
*Conversion of VCF to PHYLOViZ:
https://github.com/nickloman/misc-genomics-tools/blob/master/scripts/vcf2phyloviz.py
(Thanks Nick!)
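
As a rough illustration of the idea behind such a conversion (this is not a reproduction of the script linked above), the sketch below turns a small multi-sample VCF into a tab-delimited profile table with one row per sample and one column per variant position. It assumes haploid genotype calls (a single allele index in the GT field) and assumes that a tab-delimited table with a header row and an ID column is acceptable as profile input; the function name is hypothetical.

```python
def vcf_to_profiles(vcf_text):
    """Convert VCF text into a tab-delimited sample-by-position table.

    Assumes haploid GT values such as "0", "1", "2" or "." (missing).
    Missing or non-numeric calls are written as "N".
    """
    samples, positions, calls = [], [], {}
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            calls = {sample: [] for sample in samples}
            continue
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        alleles = [ref] + alt.split(",")
        gt_index = fields[8].split(":").index("GT")
        positions.append(f"{chrom}_{pos}")
        for sample, entry in zip(samples, fields[9:]):
            gt = entry.split(":")[gt_index]
            calls[sample].append(alleles[int(gt)] if gt.isdigit() else "N")
    header = "\t".join(["ID"] + positions)
    rows = ["\t".join([sample] + calls[sample]) for sample in samples]
    return "\n".join([header] + rows)


if __name__ == "__main__":
    demo = "\n".join([
        "##fileformat=VCFv4.2",
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
                   "INFO", "FORMAT", "s1", "s2"]),
        "\t".join(["chr", "10", ".", "A", "G", ".", "PASS", ".", "GT", "0", "1"]),
        "\t".join(["chr", "25", ".", "C", "T,G", ".", "PASS", ".", "GT", "2", "."]),
    ])
    print(vcf_to_profiles(demo))
```

Diploid-style calls such as "0/1" are deliberately written as "N" here; how to treat them, and whether to filter low-quality sites first, depends on the variant caller used.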

PROs:
Handles thousands of profiles
Fast calculation
Easy to annotate and explore metadata
Allows for basic statistics on profiles and metadata
Allows for advanced statistics on MSTs
(PLoS One. 2015 Mar 23;10(3):e0119315)
Exports high quality graphical formats
Allows plugin development
CONs:
goeBURST and goeBURST MST only
(Neighbour Joining and UPGMA coming soon)
Java knowledge needed to code new plugins

• MEGA (http://www.megasoftware.net/)
• SplitsTree (http://www.splitstree.org/)
• Geneious (http://www.geneious.com/)
  • Multipurpose software: very useful for sequence alignment visualization, tree building and annotation visualization (commercial software)

• No need to take sides when choosing an approach. Gene-by-gene, SNP and k-mer methods should be used depending on the problem at hand and the questions being asked
• Tools and sequencing methodologies are still evolving, which makes easy-to-use “big red button” approaches difficult to implement
• Beware of differences in software/algorithm versions, which can lead to different results
• Always be critical of the results you have, and try to understand whether you have a nail or a screw before picking up the hammer at hand

• UMMI Members:
  • Mickael Silva
  • Sergio Santos
  • Bruno Gonçalves
  • Adriana Policarpo
  • Mário Ramirez
  • José Melo-Cristino
• FP7 PathoNGenTrace (http://www.patho-ngen-trace.eu/):
  • Dag Harmsen (Univ. Muenster)
  • Stefan Niemann (Research Center Borstel)
  • Keith Jolley, James Bray and Martin Maiden (Univ. Oxford)
  • Joerg Rothganger (RIDOM)
  • Hannes Pouseele (Applied Maths)
• Genome Canada IRIDA (Integrated Rapid Infectious Disease Analysis) project (www.irida.ca):
  • Franklin Bristow, Thomas Matthews, Aaron Petkau, Morag Graham and Gary Van Domselaar (NML, PHAC)
  • Ed Taboada and Peter Kruczkiewicz (Lab Foodborne Zoonoses, PHAC)
  • Fiona Brinkman (SFU)
  • William Hsiao (BCCDC)
• INESC-ID Members:
  • Alexandre Francisco
  • Cátia Vaz
  • Pedro Tiago Monteiro
• Twitter Microbial Bioinf community:
  • Nick Loman
  • Torsten Seemann
  • Willem van Schaik
  • Mick Watson
  • Jennifer Gardy
  • Many, many others...

Draft Scientific Programme:
Plenaries:
1) Small Scale Microbial Epidemiology
2) Large Scale Microbial Epidemiology
3) Bioinformatics for Genome-based Microbial Epidemiology
4) Population Genetics: Pathogen Emergence
5) Population Dynamics: Transmission networks and surveillance
6) Molecular Epidemiology for Global Health and One Health
Parallel Sessions
1) Food and Environmental pathogens
2) Microbial Forensics
3) Virus
4) Fungi and Yeasts
5) Novel Diagnostics methodologies
6) Novel Typing approaches
7) Phylogenetic Inference
8) Interactive Illustration Platforms
