4. Introduction
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Metagenome = Parts list of the community
Photo: D. Kunkel; color, E. Latypova
5. Introduction
“...functional analysis of the collective genomes of soil
microflora, which we term the metagenome of the soil.”
- J. Handelsman et al., 1998
7. Introduction
PubMed: metagenom*[Title/Abstract]
Sequencing costs
http://www.genome.gov/sequencingcosts/
11. What has metagenomics been used for?
Exploration
Rusch et al., 2007 PLoS Biology
• 6.3 Gbp of sequence (2 x human genomes, 2000 x bacterial genomes)
• Most sequences were novel compared to the databases
Qin et al., 2010 Nature
• 127 human gut metagenomes
• 600 Gbp of sequence (200 x human genomes)
• 3.3 million genes identified
• Minimal gut metagenome defined
12. What has metagenomics been used for?
Comparative
Dinsdale et al., 2008 Nature
• A characteristic microbial fingerprint for each of the nine different ecosystem types
Specific functions
Hess et al., 2011 Science
• Identified 27,755 putative carbohydrate-active genes from a cow rumen metagenome
• Expressed 90 candidates, of which 57% had enzymatic activity against cellulosic substrates
13. What has metagenomics been used for?
Extracting genomes
Garcia Martin et al., 2006 Nat. Biotechnol.
• Genome extraction from low complexity metagenome
• Candidatus Accumulibacter phosphatis
• The first genome of a polyphosphate accumulating organism (PAO) with a major role in enhanced biological phosphorus removal
Albertsen et al., 2013 Nat. Biotechnol.
• Genome extraction of low abundant species (< 0.1%) from metagenomes
• First complete TM7 genome
• Access to genomes of the “uncultured majority”
21. Pitfalls
Is your DNA extraction OK?
... and the samples you want to compare with?
Did you sequence enough?
Do you know the GC bias of your protocol?
Did you normalize for sequencing depth?
Did you use the same sequencing platform?
Assembly = data not quantitative!
Are you comparing assembled data with reads?
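The question about normalizing for sequencing depth can be illustrated with a minimal sketch. It assumes raw read counts per gene and a known total read count per sample; the function name, the sample values, and the counts-per-million scaling are illustrative assumptions, not the protocol used in the talk.

```python
def counts_per_million(gene_counts, total_reads):
    """Scale raw read counts to counts per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {gene: count * 1_000_000 / total_reads
            for gene, count in gene_counts.items()}


# Hypothetical data: the same community, with sample B sequenced 4x deeper.
sample_a = {"geneX": 50, "geneY": 200}
sample_b = {"geneX": 200, "geneY": 800}

cpm_a = counts_per_million(sample_a, total_reads=1_000_000)
cpm_b = counts_per_million(sample_b, total_reads=4_000_000)

# Raw counts differ by a factor of four, the normalized values do not.
print(cpm_a)
print(cpm_b)
```

Comparing the raw counts would suggest a fourfold difference between the samples; after scaling, the two hypothetical samples look identical, which is the point of the pitfall.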
22. Databases
[Diagram: Contigs, Databases, Annotated metagenome]
...you only see what is in the database
23. What is in the databases?
Finished genomes in IMG vs. Greengenes 16S rRNA database
          Genomes   16S
Phyla     29        90
Class     46        249
Order     100       405
Species   1268      99322*
Note: only including 1 strain per species
*97% clustering
24. MG-RAST example
Contigs
650,000 EBPR proteins with taxonomy assigned
How similar are they to the
genomes in the database?
25. Sludge microbes vs. Database genomes
650,000 EBPR proteins
Note: not abundance weighted
26. Sludge microbes vs. Database genomes
650,000 EBPR proteins
1,260,000 Human gut
Qin et al., 2010 Nature
RAST ID: 4448044.3
Note: not abundance weighted
27. Sludge microbes vs. Database genomes
The 7 genera with most EBPR proteins assigned
28. Effect of missing genomes
What is the effect of not having closely related
genomes in the database?
1. Remove a genome from the database
2. Search the removed genome against the database
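The two steps above can be sketched as a leave-one-out search. This is a toy illustration only: `difflib.SequenceMatcher` stands in for blastp, and the genome names and protein sequences are hypothetical.

```python
from difflib import SequenceMatcher


def best_hit(query, database):
    """Return (genome, similarity) for the most similar protein in the database."""
    best = (None, 0.0)
    for genome, proteins in database.items():
        for protein in proteins:
            score = SequenceMatcher(None, query, protein).ratio()
            if score > best[1]:
                best = (genome, score)
    return best


def leave_one_out(target, database):
    """Step 1: remove the target genome. Step 2: search its proteins against the rest."""
    reduced = {g: p for g, p in database.items() if g != target}
    return [best_hit(query, reduced) for query in database[target]]


# Hypothetical mini database: one protein per genome.
database = {
    "genome_A": ["MKTAYIAKQR"],
    "genome_B": ["MKTAYIAKQL"],  # close relative of genome_A
    "genome_C": ["GGSPLWWNDE"],  # unrelated
}

hits = leave_one_out("genome_A", database)
print(hits)
```

Because the target genome is excluded, every reported hit comes from a relative, and how close that relative is determines how good the assignment can be.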
29. Effect of missing genomes
Accumulibacter phosphatis: 4326 proteins, blastp, best hit
Related genomes in the database:
Bacteria 1268
Proteobacteria 564
Betaproteobacteria 84
Rhodocyclales 5
Rhodocyclaceae 5
30. Effect of missing genomes
Accumulibacter phosphatis: 4326 proteins, blastp, best hit
[Figure: best-hit results; labelled genus: Azoarcus]
31. Effect of missing genomes
MEGAN LCA
Accumulibacter phosphatis
blastp
Lowest common ancestor (LCA) approach:
Hit 1: Beta-proteobacteria 80% ID
Hit 2: Gamma-proteobacteria 79% ID
Hit 3: Actinobacteria 59% ID
Assigned to Proteobacteria
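The LCA example above can be sketched in a few lines. This is a simplified illustration, not MEGAN's implementation: it filters hits by percent identity relative to the best hit, whereas MEGAN works on bit scores, and the `top_percent` parameter name and its default are assumptions made for this sketch.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages (root-to-leaf lists)."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared[-1] if shared else "root"


def assign_taxon(hits, top_percent=10.0):
    """Keep hits within top_percent of the best identity, then take their LCA."""
    if not hits:
        return "No hits"
    best = max(identity for _, identity in hits)
    cutoff = best * (1 - top_percent / 100)
    kept = [lineage for lineage, identity in hits if identity >= cutoff]
    return lowest_common_ancestor(kept)


# The three hits from the slide, written as (lineage, percent identity).
hits = [
    (["Bacteria", "Proteobacteria", "Betaproteobacteria"], 80),
    (["Bacteria", "Proteobacteria", "Gammaproteobacteria"], 79),
    (["Bacteria", "Actinobacteria"], 59),
]

print(assign_taxon(hits))
print(assign_taxon(hits, top_percent=100.0))
```

With a 10% window only the two proteobacterial hits are kept, so the protein is assigned to Proteobacteria. Widening the window until the Actinobacteria hit is included pushes the assignment up to Bacteria.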
32. Effect of missing genomes
MEGAN LCA
Accumulibacter phosphatis
blastp
Genus
No hits 261
Bacteria 325
Proteobacteria 860
Beta- 853
Rhodocyclaceae 1149
4326 proteins:
• 27% correctly classified on genus level
• 54% not assigned the correct class
• 101 genera identified
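The percentages on this slide can be checked against the listed counts. The sketch below rests on two assumptions that the slide does not state: that the 1149 proteins in the Rhodocyclaceae group are the ones counted as correctly classified, and that every protein outside the Beta- and Rhodocyclaceae groups counts as not assigned to the correct class.

```python
total_proteins = 4326

# Counts as listed on the slide.
assigned = {
    "No hits": 261,
    "Bacteria": 325,
    "Proteobacteria": 860,
    "Beta-": 853,
    "Rhodocyclaceae": 1149,
}

# Proteins not covered by the listed groups (assumed to be assigned elsewhere).
elsewhere = total_proteins - sum(assigned.values())

pct_genus = 100 * assigned["Rhodocyclaceae"] / total_proteins
within_class = assigned["Beta-"] + assigned["Rhodocyclaceae"]
pct_wrong_class = 100 * (total_proteins - within_class) / total_proteins

print(elsewhere)
print(round(pct_genus), round(pct_wrong_class))
```

Under these assumptions the counts reproduce the 27% and 54% reported on the slide.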
33. Effect of missing genomes
MEGAN LCA
Nitrospira defluvii, blastp
Related genomes: Bacteria 1268, Nitrospirae 3
4268 proteins:
• 1% correctly classified on phylum level
34. Effect of missing genomes
MEGAN LCA + KEGG
Nitrospira defluvii, blastp
Related genomes: Bacteria 1268, Nitrospirae 3
What about function?
35. Effect of missing genomes
[Figure: MEGAN LCA + KEGG results for Nitrospira defluvii (blastp)]
36. Effect of missing genomes
[Figure: MEGAN LCA + KEGG results for Nitrospira defluvii (blastp)]
37. Implication of missing genomes
Function A
Function B
Function C
Function D
40. Potentials
1. Hunting novel antibiotic resistance genes
2. Extracting genomes from metagenomes
41. Hunting novel antibiotic resistance genes
What if you want to find something that is not in the database?
42. Hunting novel antibiotic resistance genes
Functional metagenomics
M. Sommer, DTU, Denmark (in prep)
43. Hunting novel antibiotic resistance genes
89 different antibiotic resistance genes
19 novel
M. Sommer, DTU, Denmark (in prep)
44. Hunting novel antibiotic resistance genes
How abundant are the antibiotic resistance genes in the environment?
45. Hunting novel antibiotic resistance genes
The number of metagenome reads reflects the abundance of the bacteria.
Bacteria Reads
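The idea that read counts reflect abundance can be made concrete by converting mapped reads into mean coverage. This is a minimal sketch: the 150 bp read length is taken from the deck's own read-length figure, while the gene lengths, read counts, and the length correction itself are illustrative assumptions rather than the method used in the talk.

```python
def mean_coverage(n_reads, gene_length, read_length=150):
    """Approximate mean coverage of a gene from the number of mapped reads."""
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    return n_reads * read_length / gene_length


# Hypothetical genes: twice the length attracts twice the reads at equal abundance.
long_gene = mean_coverage(n_reads=80, gene_length=1200)
short_gene = mean_coverage(n_reads=40, gene_length=600)

print(long_gene, short_gene)
```

The two hypothetical genes have different raw read counts but the same coverage, so comparing genes of different length by raw counts alone would overstate the longer one.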
48. Hunting novel antibiotic resistance genes
[Figure axes: Metagenomes, Antibiotic genes]
89 different antibiotic resistance genes
M. Sommer, DTU, Denmark (in prep)
50. Extracting genomes
≈3,000,000 bp per genome
≈1000 bp+ contigs
150 bp reads
Why not full genomes?
51. Extracting genomes
≈3,000,000 bp per genome
≈1000 bp+ contigs
150 bp reads
Why not full genomes?
1. Micro-diversity
2. Separation of genomes (Binning)
52. Extracting genomes
Not 1 strain
Many closely related strains:
AAAAAAAAAAAAAA
AAAAAAAAATAAAA
AAAAAAAAACAAAA
Assembly
What you get:
AAAAAAAAA
AAAAA
TAAAA
CAAAA
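The toy example on this slide can be reproduced with a short sketch that breaks aligned strain sequences at the first variable position. It only illustrates why micro-diversity fragments an assembly; real assemblers work on graphs built from reads, and the function name is hypothetical.

```python
def fragment_at_first_variant(strains):
    """Split equal-length, aligned strain sequences at the first variable column."""
    length = len(strains[0])
    if any(len(s) != length for s in strains):
        raise ValueError("strains must be aligned and of equal length")
    first_variant = next(
        (i for i in range(length) if len({s[i] for s in strains}) > 1), None
    )
    if first_variant is None:
        return [strains[0]]  # identical strains assemble into a single contig
    shared = strains[0][:first_variant]
    branches = sorted({s[first_variant:] for s in strains})
    return [shared] + branches


# Three strains as on the slide: identical except for one position.
strains = ["A" * 9 + base + "A" * 4 for base in "ATC"]
fragments = fragment_at_first_variant(strains)
print(fragments)
```

A single variable position is enough to turn one 14 bp sequence into one shared piece and three short strain-specific pieces.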
53. Extracting genomes
Metagenome assembly is not quantitative!
54. Reduce micro-diversity
Low micro-diversity
High micro-diversity
Short term enrichment
55. Extracting genomes
Why not full genomes?
1. Micro-diversity
2. Separation of genomes (Binning)
57. Binning
Genomic signatures:
- GC / Codon usage
- Tetranucleotide frequency + statistical method
Complex sample
PhD student
“Binning”
Problems:
- Short pieces of sequence (1-10kbp)
- Local sequence divergence
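The genomic signatures named on this slide can be computed with a minimal sketch. It assumes plain DNA strings and counts each tetranucleotide on the given strand only; real binning tools often merge reverse complements and add a statistical model on top, which is not shown here. The function names are illustrative.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetranucleotide_frequencies(seq):
    """Relative frequency of each of the 256 tetranucleotides (one strand only)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    total = sum(counts[k] for k in kmers)  # ignores windows containing N etc.
    return {k: (counts[k] / total if total else 0.0) for k in kmers}


contig = "ACGTACGT"  # toy contig, far shorter than anything usable in practice
freqs = tetranucleotide_frequencies(contig)
print(gc_content(contig), freqs["ACGT"])
```

On a contig this short the profile is dominated by chance, which is the first problem listed above: with only 1-10 kbp per piece, most of the 256 frequencies are estimated from a handful of observations.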
60. Binning
1. Reduce micro-diversity
2. Use multiple related samples
[Plot axes: Abundance Sample 1, Abundance Sample 2]
61. Binning
1. Reduce micro-diversity
2. Use multiple related samples
[Two plots, each with axes: Abundance Sample 1, Abundance Sample 2]
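Using abundance in two related samples to separate genomes can be sketched as a simple grouping of contigs in log-coverage space. This is a toy greedy grouping under stated assumptions, not the published multi-metagenome workflow: the contig names, coverage values, and the `tolerance` parameter are all hypothetical.

```python
import math


def bin_by_coverage(contigs, tolerance=0.5):
    """Greedily group contigs with similar log2 coverage in two samples.

    contigs maps a contig name to (coverage_sample1, coverage_sample2);
    both coverages must be greater than zero.
    """
    bins = []
    for name, (cov1, cov2) in sorted(contigs.items()):
        point = (math.log2(cov1), math.log2(cov2))
        for existing in bins:
            if math.dist(point, existing["center"]) <= tolerance:
                existing["members"].append(name)
                break
        else:
            # No bin is close enough, so this contig starts a new one.
            bins.append({"center": point, "members": [name]})
    return [b["members"] for b in bins]


# Hypothetical coverages in sample 1 and sample 2.
contigs = {
    "contig_1": (100, 10),
    "contig_2": (110, 11),
    "contig_3": (20, 80),
    "contig_4": (18, 75),
    "contig_5": (100, 100),
}

genome_bins = bin_by_coverage(contigs)
print(genome_bins)
```

In this made-up data `contig_1` and `contig_5` have the same coverage in sample 1 and only separate once sample 2 is considered, which is the motivation for using multiple related samples.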
62. Simple reactors
H. Daims & C. Dorninger, DOME, University of Vienna
• Nitrospira enrichment running for years
• 3 dominant species
• No micro-diversity
63. Short term enrichment
Full-scale EBPR plant
SBR reactor
Days
1. Reduction of (micro)-diversity
Competibacter
Albertsen et al., 2013 Nat. Biotech.
64. Short term enrichment
Full-scale EBPR plant
SBR reactor
2. Two different DNA extraction methods
Albertsen et al., 2013 Nat. Biotech.
65. Colored using a set of 100 phylogenetic marker genes
Albertsen et al., 2013 Nat. Biotech.
66. Colored using a set of 100 phylogenetic marker genes
TM7-1 (1.6%)
TM7-2 (0.7%)
TM7-3 (0.2%)
TM7-4 (0.06%)
Albertsen et al., 2013 Nat. Biotech.
67. Zoom on target
TM7-2 (0.7%)
Colored using a set of 100 phylogenetic marker genes
Albertsen et al., 2013 Nat. Biotech.
68. Zoom on target
PCA on genomic signatures (axes: PC1, PC2)
TM7-2
TM7-2 (0.7%)
Colored using a set of 100 phylogenetic marker genes
Albertsen et al., 2013 Nat. Biotech.
69. Colored using a set of 100 phylogenetic marker genes
TM7-1 (1.6%)
Candidate phylum TM7
Saccharibacteria
Candidatus Saccharimonas aalborgensis
Albertsen et al., 2013 Nat. Biotech.
71. Genome assembly validation
Albertsen et al., 2013 Nat. Biotech.
Phyla
Genes (HMM model)
Essential single copy genes
Assembly inspection
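Validation with essential single copy genes can be sketched as a count of which markers are present and which appear more than once in a genome bin. This is a minimal sketch: it assumes the marker genes have already been detected (for example with HMM models, as the slide indicates), and the marker names and the marker-set size of 10 are hypothetical.

```python
from collections import Counter


def assess_bin(marker_hits, n_markers):
    """Summarize single copy marker genes found in one genome bin.

    Returns (completeness, duplicated): the fraction of markers seen at
    least once, and the markers seen more than once.
    """
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    counts = Counter(marker_hits)
    completeness = len(counts) / n_markers
    duplicated = sorted(m for m, c in counts.items() if c > 1)
    return completeness, duplicated


# Hypothetical bin: 9 of 10 markers found, one of them twice.
found = [f"marker_{i:02d}" for i in range(1, 10)] + ["marker_03"]
completeness, duplicated = assess_bin(found, n_markers=10)
print(completeness, duplicated)
```

Missing markers hint at an incomplete genome, while duplicated markers hint that contigs from more than one organism ended up in the same bin.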
72. Multi-metagenome
Albertsen et al., 2013 Nat. Biotech.
http://madsalbertsen.github.io/multi-metagenome/
Short: goo.gl/0ctA3
• Guides
• Workflow scripts
• Example data
• All the code
• Recommendations
74. Potentials
Metabolites: Metabolomics
Proteins: Metaproteomics
mRNA: Metatranscriptomics
DNA: Metagenomics
In Situ methods
Community structure Microbial functions
Extraction
P-Removal:
N-Removal:
-Removal:
Foaming:
Ethanol production:
Microbial needs
75. Recommendations
• Do you really need metagenomics?
• Are the databases useful in your environment?
• Unless human related they are not...
• Metagenomics is just the parts list
... of the DNA that could be extracted
... and the functions that could be annotated
• Validation, validation, validation!
• Bioinformatic
• In situ
• Genome extraction from simple reactors is possible
• Enables comprehensive transcriptomics