When trees can’t agree 
Robert Beiko
- The human microbiome - 
an ecosystem unlike any other 
2
Human gut microbiome: 2-3 million genes 
Typically > 160 “species” at any given time 
Qin et al., Nature (2010)
Microbial communities 
http://upload.wikimedia.org/wikipedia/commons/2/2d/Bacteria_%28251_31%29_Airborne_microbes.jpg 
4
5 
Photo courtesy of Emma Allen-Vercoe, 
University of Guelph
Lachno 
Lachnospiraceae – commonly thought of as “Good bacteria” 
Meehan and Beiko (2014) Genome Biol Evol 
6
Sizes of Assembly and Draft Genomes of Class Clostridia
[Figure; x-axis: Number of Protein-Coding Genes, 0 to 8000]
Zilla 
7
50 
33 
? 
4 
9
W. Ford Doolittle, Sci Am (1999)
10
PNAS, 2012 
Gene transfer matters 
“…pathogen-driven inflammatory responses in the gut can generate transient enterobacterial 
blooms in which conjugative transfer occurs at unprecedented rates.” 
PLoS Biol, 2007 
“…lateral gene transfer, mobile elements, and gene amplification have played important roles in 
affecting the ability of gut-dwelling Bacteroidetes to vary their cell surface, sense their 
environment, and harvest nutrient resources present in the distal intestine.” 
11
The genomics toolkit 
Gene profiles 
12 
[Figure: gene profile table with columns Gene 1, Gene 2, Gene 3, Gene 4, Gene 5, …]
The genomics toolkit 
“Species” trees 
13
The genomics toolkit 
Gene trees 
Do this for 
ALL genes 
14
Representing and understanding 
microbial relationships 
1. Matrix-based approaches 
2. Phylogenetic reconciliation 
3. Gene distributions and “microbial identity” 
15
The tyranny 
of distance
From profile to distance matrix
17
[Figure: gene profile for genomes A to F over Gene 1, Gene 2, Gene 3, Gene 4, …, Gene n]
S_1 = 0.91, S_2 = 0.82, S_3 = 0.72, S_4 = 0.89
$$d_{A,B} = 1.0 - \frac{1}{n} \sum_{g=1}^{n} S_g$$
     A      B      C
A    0      0.165  0.252
B    0.165  0      0.297
C    0.252  0.297  0
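The distance on this slide is one minus the mean per-gene similarity between two genomes. A minimal sketch of that calculation follows; the function name `profile_distance` is made up for illustration, and the four similarity values are the ones shown on the slide, read here as the A versus B comparison.

```python
def profile_distance(similarities):
    """Return 1 - mean(S_g) over the per-gene similarities of one genome pair."""
    if not similarities:
        raise ValueError("need at least one gene similarity")
    return 1.0 - sum(similarities) / len(similarities)


# Per-gene similarities from the slide (assumed to be the A-B comparison).
s_ab = [0.91, 0.82, 0.72, 0.89]
d_ab = profile_distance(s_ab)
print(round(d_ab, 3))  # 0.165, the A-B entry of the matrix on the slide
```

Repeating this for every pair of genomes fills in the symmetric distance matrix.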
Neighbor-joining 
18 
Start with a ‘star’ tree 
At each iteration, split off the pair of taxa that minimizes the total sum 
of branch lengths in the tree 
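Choose groups x and y to minimize the Q-criterion:
$$Q(x,y) = (m-2)\,d(x,y) - \sum_{k=1}^{m} d(x,k) - \sum_{k=1}^{m} d(y,k)$$
d(x,y): distance matrix entry for (x,y)
Sums over k: distance from x and from y to all m current groups, weighted against d(x,y) by the factor (m-2)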
19 
Continue until binary tree is obtained 
Saitou and Nei (1987)
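As a rough illustration of one neighbor-joining iteration, the sketch below scores every pair with the standard Q-criterion, Q(x,y) = (m-2) d(x,y) minus the row sums of x and y, and returns the minimizing pair. The function name `nj_pick_pair` and the five-taxon matrix are illustrative assumptions (a textbook-style toy example), not data from the talk.

```python
def nj_pick_pair(names, d):
    """Return the pair (x, y) that minimizes Q, together with its Q value.

    names: list of group labels.
    d: symmetric distance matrix (list of lists, zero diagonal),
       indexed in the same order as names.
    """
    m = len(names)
    row_sums = [sum(row) for row in d]  # distance from each group to all others
    best_pair, best_q = None, float("inf")
    for i in range(m):
        for j in range(i + 1, m):
            q = (m - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best_pair, best_q = (names[i], names[j]), q
    return best_pair, best_q


# Toy distances (hypothetical, for illustration only).
names = ["a", "b", "c", "d", "e"]
d = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, q = nj_pick_pair(names, d)
print(pair, q)  # ('a', 'b') -50
```

Note that (d, e) has the smallest raw distance in this toy matrix, yet the Q-criterion picks (a, b): the row-sum terms correct for groups that are far from everything else.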
20 
Neighbor-net: Building a splits graph 
Bryant and Moulton, Mol Biol Evol (2003)
21 
Neighbor-net is guaranteed to produce a circular set 
of splits 
This will produce a planar graph
Neighbor-net of 298 microbial 
genomes 
Beiko, Biol Direct (2011)
22
Limitations of neighbor-net 
• Neighbor-net still imposes a constraint on the 
relationships among genomes: “long-distance” 
connections cannot be shown 
23 
?
Explicit connections between 
genomes 
• Make each genome a vertex in a graph G 
V = {A,B,C,D,E,F,…} 
E = {{A,B},…} 
For some threshold t: 
{A,B} ∈ E iff d_{A,B} ≤ t
or if some other condition is satisfied 
24 
[Figure: vertices A and B joined by an edge with weight w_{A,B}]
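The thresholding rule above can be sketched in a few lines. This is only an illustration: the function name `threshold_graph`, the threshold value 0.26, and the choice of edge weight (similarity, 1 - d) are assumptions, since the slide does not say how w is defined. The distances are the A, B, C values from the earlier distance-matrix slide.

```python
def threshold_graph(distances, t):
    """Build (vertices, edges) keeping only pairs with distance <= t.

    distances: dict mapping (genome_x, genome_y) tuples to a distance.
    Returns a sorted vertex list and a dict of edge -> weight, where the
    weight is taken to be 1 - distance (an assumption for this sketch).
    """
    vertices = sorted({g for pair in distances for g in pair})
    edges = {pair: 1.0 - d for pair, d in distances.items() if d <= t}
    return vertices, edges


distances = {
    ("A", "B"): 0.165,
    ("A", "C"): 0.252,
    ("B", "C"): 0.297,
}
vertices, edges = threshold_graph(distances, t=0.26)  # t chosen arbitrarily
print(vertices)
print(sorted(edges))
```

With this threshold the B-C pair is dropped, so C stays connected to the graph only through A.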
Linear programming 
• Weighting networks based on straight
genome-genome similarity highlights
close relatives and redundancy
• LP introduces weighting scheme that 
constrains connections and promotes 
distinct relationships 
25
P. aeruginosa 
P. fluorescens 
P. lePewtida 
P. syringae 
P. entomophila 
P. stutzeri 
P. mendocina 
“Plume” 
Holloway and Beiko, BMC Evol Biol (2010) 
26
27 
Some like it hot 
Pyrococcus furiosus 
optimal growth temperature: 
100°C
Networks 
Kunin et al. (2005) Genome Res
28
Networks!!!! 
Dagan et al. (2008) PNAS
29
Inferring and 
comparing trees
Phylogenetic tree reconciliation 
31 
Species tree S Lateral gene transfer Gene tree G 
Subtree prune and regraft 
Whidden et al., Syst Biol (2014)
32 
For two rooted trees, d_SPR is equal to the
number of components in a maximum agreement forest (MAF), minus 1
So building a MAF is equivalent to inferring the minimum 
number of SPR events needed to reconcile a species tree 
with a gene tree 
Problem is NP-hard 
dSPR = 1 
MAF components = 2 
Bordewich and Semple, Ann Combinatorics (2005)
33 
T1 T2 
Case 1 
(separate components) 
Case 3 
(several pendant nodes) 
Case 2 
(one pendant node) 
Chris’s algorithm
Fixed-parameter tractability 
• Problem is dominated by Case 3 (3 alternatives) 
• Cut all candidate edges at each step = linear 3-approximation 
• Decision problem: $O(2.42^k n)$ to decide if SPR distance ≤ k
• Problem is exponential in SPR distance, NOT number of leaves 
therefore FPT 
Chris Whidden + Norbert Zeh
34
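To make the fixed-parameter point concrete, the short sketch below evaluates the growth term 2.42^k · n for a few values of k (SPR distance) and n (number of leaves). It ignores constant factors and is not the algorithm itself; the function name `fpt_step_bound` is made up for this illustration.

```python
def fpt_step_bound(k, n, base=2.42):
    """Growth term base**k * n from the O(2.42^k n) bound, constants ignored."""
    return (base ** k) * n


# Growing n by 100x scales the bound by 100x (linear in leaves),
# while each extra unit of k multiplies it by 2.42 (exponential in k).
for k, n in [(5, 100), (5, 10_000), (10, 100), (20, 100)]:
    print(f"k={k:>2}  n={n:>6}  bound ~ {fpt_step_bound(k, n):.3g}")
```

The cost is therefore driven by how different the two trees are, not by how many genomes they contain.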
In practice 
35
SPR Supertrees 
Supertree: a tree that satisfies some optimality 
criterion with respect to a set of input trees 
SPR supertree: given a set of gene trees, find a tree 
that minimizes the total number of SPR operations vs. all 
gene trees 
Building an SPR supertree: assemble an initial tree,
then propose SPR operations and evaluate each resulting tree's total SPR
distance from the input trees
Whidden et al., 2014
36
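The search described on this slide can be outlined as a simple hill climb. The sketch below is a generic skeleton under stated assumptions, not the published implementation: `spr_neighbors` (trees one SPR move away) and `spr_distance` (SPR distance between two trees) are hypothetical callables the caller must supply, and the demo at the bottom uses integers as stand-ins for trees purely to show the control flow.

```python
def total_spr_distance(candidate, gene_trees, spr_distance):
    """Sum of SPR distances from one candidate supertree to every gene tree."""
    return sum(spr_distance(candidate, g) for g in gene_trees)


def spr_supertree_search(initial, gene_trees, spr_neighbors, spr_distance,
                         max_rounds=100):
    """Hill-climb from `initial`, accepting rearrangements that lower the total."""
    best = initial
    best_score = total_spr_distance(best, gene_trees, spr_distance)
    for _ in range(max_rounds):
        improved = False
        # Score every tree one move away from the tree that started this round.
        for candidate in spr_neighbors(best):
            score = total_spr_distance(candidate, gene_trees, spr_distance)
            if score < best_score:
                best, best_score, improved = candidate, score, True
        if not improved:
            break  # local optimum: no single move lowers the total
    return best, best_score


# Toy stand-ins (NOT real trees): integers, +/-1 moves, absolute difference.
toy_neighbors = lambda x: [x - 1, x + 1]
toy_distance = lambda a, b: abs(a - b)
tree, score = spr_supertree_search(0, [2, 3, 10], toy_neighbors, toy_distance)
print(tree, score)  # 3 8
```

In a real setting the distance callable is the expensive part, since each evaluation solves an SPR distance problem against every gene tree.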
Why SPR supertrees? 
1. Explicit representation of LGT events 
2. Branches broken in MAF → implied 
LGT events. Can build graph of 
connections 
37
244 bacterial genomes 
40,631 gene trees 
= Bacterial SPR supertree 
LGT patterns for Clostridium 
Whidden et al., 2014
Taming Lachnozilla 
(taming in progress) http://en.wikipedia.org/wiki/File:Godzilla_%2754_design.jpg
What makes 
LachnoZilla 
LachnoZilla ?
Phylogenetic profile based 
on extremely good matches to 
other genomes 
(> 95% ID, > 95% coverage) 
= “recent” LGT events 
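A profile of this kind can be assembled by filtering sequence matches on the two cutoffs given above. The sketch below assumes a simple hit record; the `Hit` class, its field names, and the example hits are all hypothetical and only illustrate the filtering step.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query_gene: str          # gene in the genome of interest
    target_genome: str       # genome containing the matching sequence
    percent_identity: float
    percent_coverage: float


def recent_lgt_profile(hits, min_identity=95.0, min_coverage=95.0):
    """Map each gene to the set of genomes with a near-identical match."""
    profile = {}
    for h in hits:
        if h.percent_identity > min_identity and h.percent_coverage > min_coverage:
            profile.setdefault(h.query_gene, set()).add(h.target_genome)
    return profile


# Made-up hits for illustration only.
hits = [
    Hit("gene_001", "genome_A", 99.1, 98.0),  # passes both cutoffs
    Hit("gene_001", "genome_B", 96.0, 80.0),  # coverage too low
    Hit("gene_002", "genome_C", 90.0, 99.0),  # identity too low
    Hit("gene_003", "genome_B", 97.5, 96.2),  # passes both cutoffs
]
profile = recent_lgt_profile(hits)
print(profile)
```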
C. difficile 
…. 
“Virulence-associated protein” 
Mobile DNA 
41
279 genomes 
Conserved marker-gene tree 
LZ & friends 
Ben Wright 
42
LachnoZilla (and friends) 
genome graph 
! 
43
Close 
relative 
(expected) 
44
Distant relative 
(not so expected) 
(big genome though!) 
45
Selective 
sharing 
46
Gene-centric graphs 
LZ  Genome 1  Genome 2  Genome 3  Genome 4  Genome 5  Genome 6
Gene 1     × × 
Gene 2     ×  
Gene 3    × ×  
Gene 4 × × ×    
Gene 2 
Gene 3 
Gene 1 
Gene 4 
Edge weights are proportional to similarity of distribution 
Use graph clustering to divide up completely connected, weighted graph
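One way to turn gene distributions into edge weights is sketched below. The slide only says weights are proportional to similarity of distribution; using the Jaccard index is an assumption made here, and the presence sets are invented for illustration rather than read from the table.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of genomes (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_graph(profiles):
    """Completely connected gene graph: (gene, gene) -> distribution similarity."""
    return {
        (g1, g2): jaccard(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Hypothetical presence sets: the genomes in which each gene is found.
profiles = {
    "Gene 1": {"LZ", "Genome 1", "Genome 2"},
    "Gene 2": {"LZ", "Genome 1", "Genome 2", "Genome 3"},
    "Gene 3": {"Genome 4", "Genome 5"},
}
weights = gene_graph(profiles)
for edge, w in weights.items():
    print(edge, round(w, 2))
```

Genes with similar distributions (Gene 1 and Gene 2 here) get heavy edges, which is what a graph clustering step would then group together.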
Lachnozilla in graph form 
(it all makes sense now) 
Legionaminic acid 
Acetylneuraminic acid 
(pathogen associated) 
Bacteroides pectinophilus 
Butyrivibrio proteoclasticus 
Eubacterium plexicaudatum 
Roseburia 
Neighbors 
Weirdly named isolates
Mystery isolate #1 
(made-up example)
Mystery isolate #2 
(made-up example)
Representations 
Clear inference 
From pattern to understanding 
Questions
FIN 
52

Microbiome Genomics Reveals Complex Relationships Between Gut Bacteria

  • 1. When trees can’t agree Robert Beiko
  • 2. - The human microbiome - an ecosystem unlike any other
  • 3. Human gut microbiome: 2-3 million genes Typically > 160 “species” at any given time Qin et al., Nature (2010)
  • 5. Photo courtesy of Emma Allen-Vercoe, University of Guelph
  • 6. Lachno Lachnospiraceae – commonly thought of as “Good bacteria” Meehan and Beiko (2014) Genome Biol Evol
  • 7. Sizes of Assembly and Draft Genomes of Class Clostridia (figure; x-axis: Number of Protein-Coding Genes, 0 to 8000; labeled genome: Zilla)
  • 9. 50 33 ? 4
  • 10. W. Ford Doolittle, Sci Am (1999)
  • 11. Gene transfer matters. PNAS, 2012: “…pathogen-driven inflammatory responses in the gut can generate transient enterobacterial blooms in which conjugative transfer occurs at unprecedented rates.” PLoS Biol, 2007: “…lateral gene transfer, mobile elements, and gene amplification have played important roles in affecting the ability of gut-dwelling Bacteroidetes to vary their cell surface, sense their environment, and harvest nutrient resources present in the distal intestine.”
  • 12. The genomics toolkit: Gene profiles (Gene 1, Gene 2, Gene 3, Gene 4, Gene 5, …)
  • 13. The genomics toolkit: “Species” trees
  • 14. The genomics toolkit: Gene trees. Do this for ALL genes
  • 15. Representing and understanding microbial relationships: 1. Matrix-based approaches 2. Phylogenetic reconciliation 3. Gene distributions and “microbial identity”
  • 16. The tyranny of distance
  • 17. From profile to distance matrix. Genomes A to F are compared across Gene 1, Gene 2, Gene 3, Gene 4, …, Gene n. Per-gene similarity scores S_g for the pair (A, B): 0.91, 0.82, 0.72, 0.89. Distance: d_{A,B} = 1.0 - \frac{1}{n} \sum_{g=1}^{n} S_g. Resulting distance matrix:
          A      B      C
      A   0      0.165  0.252
      B   0.165  0      0.297
      C   0.252  0.297  0
  • 18. Neighbor-joining: Start with a ‘star’ tree. At each iteration, split off the pair of taxa that minimizes the total sum of branch lengths in the tree. Choose groups x and y to minimize the Q-criterion, which combines the distance matrix entry for (x, y) with the weighted distance from x and y to all leaves.
  • 19. Continue until binary tree is obtained. Saitou and Nei (1987)
  • 20. Neighbor-net: Building a splits graph. Bryant and Moulton, Mol Biol Evol (2003)
  • 21. Neighbor-net is guaranteed to produce a circular set of splits. This will produce a planar graph.
  • 22. Neighbor-net of 298 microbial genomes. Beiko, Biol Direct (2011)
  • 23. Limitations of neighbor-net • Neighbor-net still imposes a constraint on the relationships among genomes: “long-distance” connections cannot be shown
  • 24. Explicit connections between genomes • Make each genome a vertex in a graph G = (V, E), with V = {A, B, C, D, E, F, …} and E = {{A, B}, …} • For some threshold t: {A, B} ∈ E iff d_{A,B} ≤ t, or if some other condition is satisfied • Each edge {A, B} carries a weight w_{A,B}
  • 25. Linear programming • Weighting networks based on straight genome-genome similarity highlights close relatives and redundancy • LP introduces a weighting scheme that constrains connections and promotes distinct relationships
  • 26. P. aeruginosa, P. fluorescens, P. lePewtida, P. syringae, P. entomophila, P. stutzeri, P. mendocina. “Plume”. Holloway and Beiko, BMC Evol Biol (2010)
  • 27. Some like it hot. Pyrococcus furiosus optimal growth temperature: 100°C
  • 28. Networks. Kunin et al. (2005) Genome Res
  • 29. Networks!!!! Dagan et al. (2008) PNAS
  • 31. Phylogenetic tree reconciliation: Species tree S; Gene tree G; Lateral gene transfer; Subtree prune and regraft. Whidden et al., Syst Biol (2014)
  • 32. For two rooted trees, d_SPR is equal to the number of components in a maximum agreement forest (MAF), minus 1. So building a MAF is equivalent to inferring the minimum number of SPR events needed to reconcile a species tree with a gene tree. The problem is NP-hard. Example: MAF components = 2, so d_SPR = 1. Bordewich and Semple, Ann Combinatorics (2005)
  • 33. T1, T2. Case 1 (separate components); Case 2 (one pendant node); Case 3 (several pendant nodes). Chris’s algorithm
  • 34. Fixed-parameter tractability • Problem is dominated by Case 3 (3 alternatives) • Cut all candidate edges at each step = linear-time 3-approximation • Decision problem: O(2.42^k n) to decide if SPR distance ≤ k • Problem is exponential in SPR distance, NOT in the number of leaves, and is therefore FPT. Chris Whidden + Norbert Zeh
  • 36. SPR Supertrees. Supertree: a tree that satisfies some optimality criterion with respect to a set of input trees. SPR supertree: given a set of gene trees, find a tree that minimizes the total number of SPR operations vs. all gene trees. Building an SPR supertree: assemble an initial tree, then propose SPR operations and evaluate each candidate tree’s total SPR distance from the input trees. Whidden et al., 2014
  • 37. Why SPR supertrees? 1. Explicit representation of LGT events 2. Branches broken in MAF → implied LGT events. Can build graph of connections
  • 38. 244 bacterial genomes, 40,631 gene trees = Bacterial SPR supertree. LGT patterns for Clostridium. Whidden et al., 2014
  • 39. Taming Lachnozilla (taming in progress) http://en.wikipedia.org/wiki/File:Godzilla_%2754_design.jpg
  • 40. What makes LachnoZilla LachnoZilla?
  • 41. Phylogenetic profile based on extremely good matches to other genomes (> 95% ID, > 95% coverage) = “recent” LGT events. Labels: C. difficile ….; “Virulence-associated protein”; Mobile DNA
  • 42. 279 genomes. Conserved marker-gene tree. LZ & friends. Ben Wright
  • 43. LachnoZilla (and friends) genome graph!
  • 45. Distant relative (not so expected) (big genome though!)
  • 47. Gene-centric graphs. Presence/absence table (rows: Gene 1 to Gene 4; columns: LZ, Genome 1 to Genome 6). Graph vertices: Gene 1, Gene 2, Gene 3, Gene 4. Edge weights are proportional to similarity of distribution. Use graph clustering to divide up completely connected, weighted graph.
  • 48. Lachnozilla in graph form (it all makes sense now) Legionaminic acid Acetylneuraminic acid (pathogen associated) Bacteroides pectinophilus Butyrivibrio proteoclasticus Eubacterium plexicaudatum Roseburia Neighbors Weirdly named isolates
  • 49. Mystery isolate #1 (made-up example)
  • 50. Mystery isolate #2 (made-up example)
  • 51. Representations Clear inference From pattern to understanding Questions
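
The profile-to-distance step on slide 17 can be checked with a few lines of Python. This is only a sketch: the function name `profile_distance` is hypothetical, and the four similarity scores are the ones printed on the slide. Averaging them and subtracting from 1.0 reproduces the 0.165 entry for (A, B) in the slide's matrix.

```python
def profile_distance(similarities):
    """Return 1.0 minus the mean per-gene similarity (d_{A,B} on slide 17)."""
    if not similarities:
        raise ValueError("need at least one per-gene similarity score")
    return 1.0 - sum(similarities) / len(similarities)


# Per-gene similarity scores for genomes A and B, as shown on the slide.
scores_ab = [0.91, 0.82, 0.72, 0.89]
d_ab = profile_distance(scores_ab)
print(round(d_ab, 3))  # 0.165
```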
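
The Q-criterion formula on slide 18 did not survive in the transcript. Assuming the slide uses the standard neighbor-joining form, Q(x, y) = (n - 2) d(x, y) - Σ_k d(x, k) - Σ_k d(y, k), one pair-selection step might look like the sketch below. The function name `best_pair` and the five-taxon matrix are illustrative and do not come from the slides.

```python
from itertools import combinations


def best_pair(dist, taxa):
    """Return (Q, x, y) for the pair of taxa with the smallest Q value."""
    n = len(taxa)
    # Row sums play the role of the "distance to all leaves" terms.
    row_sums = {x: sum(dist[x][k] for k in taxa) for x in taxa}
    best = None
    for x, y in combinations(taxa, 2):
        q = (n - 2) * dist[x][y] - row_sums[x] - row_sums[y]
        if best is None or q < best[0]:
            best = (q, x, y)
    return best


# Illustrative symmetric distance matrix (not taken from the slides).
taxa = ["a", "b", "c", "d", "e"]
rows = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
dist = {x: dict(zip(taxa, row)) for x, row in zip(taxa, rows)}

best = best_pair(dist, taxa)
print(best)  # (-50, 'a', 'b')
```

Note that d and e are the closest pair by raw distance (3), yet a and b are joined first: the row-sum terms penalize pairs that are also close to everything else.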
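
The thresholded genome graph on slide 24 can be sketched using the three-genome distance matrix from slide 17. The threshold value t = 0.26 and the choice of edge weight (1 - distance) are assumptions made for illustration; the slide does not define how w_{A,B} is computed.

```python
from itertools import combinations


def threshold_graph(dist, t):
    """Return {frozenset({a, b}): weight} for every pair with distance <= t."""
    edges = {}
    for a, b in combinations(sorted(dist), 2):
        d = dist[a][b]
        if d <= t:
            # Assumed weighting: similarity, i.e. 1 - distance.
            edges[frozenset((a, b))] = 1.0 - d
    return edges


# Distance matrix for genomes A, B, C from slide 17.
dist = {
    "A": {"A": 0.0, "B": 0.165, "C": 0.252},
    "B": {"A": 0.165, "B": 0.0, "C": 0.297},
    "C": {"A": 0.252, "B": 0.297, "C": 0.0},
}

edges = threshold_graph(dist, t=0.26)  # hypothetical threshold
for pair, weight in edges.items():
    print(sorted(pair), round(weight, 3))
```

With this threshold the pairs {A, B} and {A, C} become edges, while {B, C} (distance 0.297) is left out.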
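
Slide 47 weights edges between genes by the similarity of their distributions across genomes. The slide does not name the similarity measure, so the sketch below assumes Jaccard similarity of presence sets; the presence/absence data and the names `jaccard` and `gene_graph` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of genomes (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_graph(presence):
    """Return a complete weighted graph as {(gene, gene): similarity}."""
    return {
        (g, h): jaccard(presence[g], presence[h])
        for g, h in combinations(sorted(presence), 2)
    }


# Hypothetical data: the set of genomes in which each gene is present.
presence = {
    "gene1": {"LZ", "G1", "G2", "G3"},
    "gene2": {"LZ", "G1", "G2", "G3", "G5"},
    "gene3": {"LZ", "G1", "G2"},
    "gene4": {"G3", "G4", "G5"},
}

weights = gene_graph(presence)
for pair, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(pair, round(w, 3))
```

The resulting graph is completely connected, so a clustering step (not shown) would be needed to split it into groups of genes with similar distributions, as the slide suggests.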