SlideShare a Scribd company logo
1 of 35
Download to read offline
EVE 161:

Microbial Phylogenomics
Era IV:
Shotgun Metagenomics
UC Davis, Winter 2016
Instructor: Jonathan Eisen
!1
!2
Environmental Genome Shotgun
Sequencing of the Sargasso Sea
J. Craig Venter,1
* Karin Remington,1
John F. Heidelberg,3
Aaron L. Halpern,2
Doug Rusch,2
Jonathan A. Eisen,3
Dongying Wu,3
Ian Paulsen,3
Karen E. Nelson,3
William Nelson,3
Derrick E. Fouts,3
Samuel Levy,2
Anthony H. Knap,6
Michael W. Lomas,6
Ken Nealson,5
Owen White,3
Jeremy Peterson,3
Jeff Hoffman,1
Rachel Parsons,6
Holly Baden-Tillson,1
Cynthia Pfannkoch,1
Yu-Hui Rogers,4
Hamilton O. Smith1
We have applied “whole-genome shotgun sequencing” to microbial populations
collected en masse on tangential flow and impact filters from seawater samples
collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs
chlo
pho
wer
from
Feb
lect
stat
are
S1;
one
was
gen
2 to
prep
both
Cra
nolo
ers
RESEARCH ARTICLE
We have applied “whole-genome shotgun sequencing” to
microbial populations collected en masse on tangential flow
and impact filters from seawater samples collected from the
Sargasso Sea near Bermuda. A total of 1.045 billion base
pairs of nonredundant sequence was generated, annotated,
and analyzed to elucidate the gene content, diversity, and
relative abundance of the organisms within these
environmental samples. These data are estimated to derive
from at least 1800 genomic species based on sequence
relatedness, including 148 previously unknown bacterial
phylotypes. We have identified over 1.2 million previously
unknown genes represented in these samples, including more
than 782 new rhodopsin-like photoreceptors. Variation in
species present and stoichiometry suggests substantial
oceanic microbial diversity.
Sargasso Sea
tinct strains closely related to the published with shading in table S3 with nine depicted in
Fig. 1. MODIS-Aqua satellite image of
ocean chlorophyll in the Sargasso Sea grid
about the BATS site from 22 February
2003. The station locations are overlain
with their respective identifications. Note
the elevated levels of chlorophyll (green
color shades) around station 3, which are
not present around stations 11 and 13.
Fig. 2. Gene conser- !4
http://www.sciencemag.org/content/304/5667/66
• Why did the authors focus on the Sargasso Sea?
The Sargasso Sea. The northwest Sargasso Sea, at the Bermuda Atlantic Time-series Study site
(BATS), is one of the best-studied and arguably most well-characterized regions of the global
ocean. The Gulf Stream represents the western and northern boundaries of this region and provides
a strong physical boundary, separating the low nutrient, oligotrophic open ocean from the more
nutrient-rich waters of the U.S. continental shelf. The Sargasso Sea has been intensively studied as
part of the 50-year time series of ocean physics and biogeochemistry (3, 4) and provides an
opportunity for interpretation of environmental genomic data in an oceanographic context. In this
region, formation of subtropical mode water occurs each winter as the passage of cold fronts across
the region erodes the seasonal thermocline and causes convective mixing, resulting in mixed layers
of 150 to 300 m depth. The introduction of nutrient-rich deep water, following the breakdown of
seasonal thermoclines into the brightly lit surface waters, leads to the blooming of single cell
phytoplankton, including two cyanobacteria species, Synechococcus and Prochlorococcus, that
numerically dominate the photosynthetic biomass in the Sargasso Sea.
Sargasso Sea (from Wikipedia)
Sargasso Sea (from Wikipedia)
http://www.nature.com/nature/journal/v345/n6270/abs/345060a0.html
© 1990 Nature
Vol. 57, No. 6APPLIED AND ENVIRONMENTAL MICROBIOLOGY, June 1991, p. 1707-1713
0099-2240/91/061707-07$02.00/0
Copyright C) 1991, American Society for Microbiology
Phylogenetic Analysis of a Natural Marine Bacterioplankton
Population by rRNA Gene Cloning and Sequencing
THERESA B. BRITSCHGI AND STEPHEN J. GIOVANNONI*
Department of Microbiology, Oregon State University, Corvallis, Oregon 97331
Received 5 February 1991/Accepted 25 March 1991
The identification of the prokaryotic species which constitute marine bacterioplankton communities has been
a long-standing problem in marine microbiology. To address this question, we used the polymerase chain
reaction to construct and analyze a library of 51 small-subunit (16S) rRNA genes cloned from Sargasso Sea
bacterioplankton genomic DNA. Oligonucleotides complementary to conserved regions in the 16S rDNAs of
eubacteria were used to direct the synthesis of polymerase chain reaction products, which were then cloned by
blunt-end ligation into the phagemid vector pBluescript. Restriction fragment length polymorphisms and
hybridizations to oligonucleotide probes for the SARll and marine Synechococcus phylogenetic groups
indicated the presence of at least seven classes of genes. The sequences of five unique rDNAs were determined
completely. In addition to 16S rRNA genes from the marine Synechococcus cluster and the previously identified
but uncultivated microbial group, the SARll cluster [S. J. Giovannoni, T. B. Britschgi, C. L. Moyer, and
K. G. Field, Nature (London) 345:60-63], two new gene classes were observed. Phylogenetic comparisons
indicated that these belonged to unknown species of a- and -y-proteobacteria. The data confirm the earlier
conclusion that a majority of planktonic bacteria are new species previously unrecognized by bacteriologists.
Within the past decade, it has become evident that bacte-
rioplankton contribute significantly to biomass and bio-
geochemical activity in planktonic systems (4, 36). Until
recently, progress in identifying the microbial species which
constitute these communities was slow, because the major-
ity of the organisms present (as measured by direct counts)
cannot be recovered in cultures (5, 14, 16). The discovery a
decade ago that unicellular cyanobacteria are widely distrib-
uted in the open ocean (34) and the more recent observation
of abundant marine prochlorophytes (3) have underscored
the uncertainty of our knowledge about bacterioplankton
community structure.
Molecular approaches are now providing genetic markers
for the dominant bacterial species in natural microbial pop-
ulations (6, 21, 33). The immediate objectives of these
studies are twofold: (i) to identify species, both known and
novel, by reference to sequence data bases; and (ii) to
construct species-specific probes which can be used in
quantitative ecological studies (7, 29).
Previously, we reported rRNA genetic markers and quan-
titative hybridization data which demonstrated that a novel
lineage belonging to the oa-proteobacteria was abundant in
the Sargasso Sea; we also identified novel lineages which
were very closely related to cultivated marine Synechococ-
cus spp. (6). In a study of a hot-spring ecosystem, Ward and
coworkers examined cloned fragments of 16S rRNA cDNAs
and found numerous novel lineages but no matches to genes
from cultivated hot-spring bacteria (33). These results sug-
gested that natural ecosystems in general may include spe-
cies which are unknown to microbiologists.
Although any gene may be used as a genetic marker,
rRNA genes offer distinct advantages. The extensive use of
16S rRNAs for studies of microbial systematics and evolu-
tion has resulted in large computer data bases, such as the
RNA Data Base Project, which encompass the phylogenetic
diversity found within culture collections. rRNA genes are
highly conserved and therefore can be used to examine
distant phylogenetic relationships with accuracy. The pres-
ence of large numbers of ribosomes within cells and the
regulation of their biosyntheses in proportion to cellular
growth rates make these molecules ideally suited for ecolog-
ical studies with nucleic acid probes.
Here we describe the partial analysis of a 16S rDNA
library prepared from photic zone Sargasso Sea bacterio-
plankton using a general approach for cloning eubacterial
16S rDNAs. The Sargasso Sea, a central oceanic gyre,
typifies the oligotrophic conditions of the open ocean, pro-
viding an example of a microbial community adapted to
extremely low-nutrient conditions. The results extend our
previous observations of high genetic diversity among
closely related lineages within this microbial community and
provide rDNA genetic markers for two novel eubacterial
lineages.
MATERIALS AND METHODS
Collection and amplification of rDNA. A surface (2-m)
bacterioplankton population was collected from hydrosta-
tion S (32°4'N, 64°23'W) in the Sargasso Sea as previously
described (8). The polymerase chain reaction (24) was used
to produce double-stranded rDNA from the mixed-microbi-
al-population genomic DNA prepared from the plankton
sample. The amplification primers were complementary to
conserved regions in the 5' and 3' regions of the 16S rRNA
genes. The amplification primers were the universal 1406R
(ACGGGCGGTGTGTRC) and eubacterial 68F (TNANAC
ATGCAAGTCGAKCG) (17). The reaction conditions were
as follows: 1 ,ug of template DNA, 10 ,ul of 10x reaction
buffer (500 mM KCI, 100 mM Tris HCI [pH 9.0 at 250C], 15
mM MgCl2, 0.1% gelatin, 1% Triton X-100), 1 U of Taq DNA
polymerase (Promega Biological Research Products, Madi-
son, Wis.), 1 ,uM (each) primer, 200 ,uM (each) dATP, dCTP,
dGTP, and dTTP in a 100-,ul total volume; 1 min at 94°C, 1
atUNIVOFCALIFDAVISonMay18,2010aem.asm.orgDownloadedfrom
MARINE BACTERIOPLANKTON 1711
Escherichia coli
Vibrio harveyi
inmm Vibrio anguillarum
- Oceanospirillum linum
- SAR 92
Caulobacter crescentus
SAR83
- Erythrobacter OCh 114
- Hyphomicrobium vulgare
Rhodopseudomonas marina
Rochalimaea quintana
Agrobacterium tumefaciens
SARI
-Es SAR95
SAR1I
CZ
* C)0._
u
10C)
00
.0
0Q
C)
0
0
O)4..
eP.
- Liverwort CP
*Anacystis nidulans
- Prochlorothrix
- SAR6
* SAR7
r SARIOO
SAR139
0
p.0
0
0 0.05 0.10 0.15
FIG. 4. Phylogenetic tree showing relationships of the rDNA clones from the Sargasso Sea to representative, cultivated species (2, 6, 18,
30, 31, 32, 35, 40). Positions of uncertain homology in regions containing insertions and deletions were omitted from the analysis.
Evolutionary distances were calculated by the method of Jukes and Cantor (15), which corrects for the effects of superimposed mutations.
The phylogenetic tree was determined by a distance matrix method (20). The tree was rooted with the sequence of Bacillus subtilis (38).
Sequence data not referenced were provided by C. R. Woese and R. Rossen.
phototroph phyla were conserved in the corresponding 880 (G -
C to C G). We note with caution that some classes
VOL. 57, 1991
I I
atUNIVaem.asm.orgDownloadedfrom
• What did they sample and how did they sample?
Sargasso Sea
tinct strains closely related to the published with shading in table S3 with nine depicted in
Fig. 1. MODIS-Aqua satellite image of
ocean chlorophyll in the Sargasso Sea grid
about the BATS site from 22 February
2003. The station locations are overlain
with their respective identifications. Note
the elevated levels of chlorophyll (green
color shades) around station 3, which are
not present around stations 11 and 13.
Fig. 2. Gene conser- !16
http://www.sciencemag.org/content/304/5667/66
Surface water samples (170 to 200 liters) were collected aboard the RV
Weatherbird II from three sites off the coast of Bermuda in February
2003. Additional samples were collected aboard the SV Sorcerer II from
“Hydrostation S” in May 2003. Sample site locations are indicated on
Fig. 1 and described in table S1; sampling protocols were fine-tuned
from one expedition to the next (5). Genomic DNA was extracted from
filters of 0.1 to 3.0 μm, and genomic libraries with insert sizes ranging
from 2 to 6 kb were made as described (5). The prepared plasmid
clones were sequenced from both ends to provide paired-end reads at
the J. Craig Venter Science Foundation Joint Technology Center on ABI
3730XL DNA sequencers (Applied Biosystems, Foster City, CA). Whole-
genome random shotgun sequencing of the Weatherbird II samples
(table S1, samples 1 to 4) produced 1.66 million reads averaging 818 bp
in length, for a total of approximately 1.36 Gbp of microbial DNA
sequence. An additional 325,561 sequences were generated from the
Sorcerer II samples (table S1, samples 5 to 7), yielding approximately
265 Mbp of DNA sequence.
• What are some questions one might want to answer
about the Sargasso Sea samples / sequences?
Who is out there?
rRNA phylotyping from metagenomics
!24
http://www.sciencemag.org/content/304/5667/66
• How can one do phylotyping with genes other than
rRNA?
Shotgun Sequencing Allows Alternative Anchors (e.g., RecA)
Who is out there?
nt can be used as an-
uish eukaryotic mate-
inverse relation was
e size of the pre-filters
d the fraction of se-
(table S5). This rela-
resence of 18S rRNA
strong evidence that
ndeed captured.
cies richness. Most
uncultured organisms
dies of rRNA genes
reaction (PCR) with
ved positions in those
small subunit rRNA
versity of prokaryotic
(17). However, PCR-
ly biased, because not
with the same “univer-
shotgun sequence data
ntified 1164 distinct
nes or fragments of
rd II assemblies and
Sorcerer II reads (5).
milarity cutoff to dis-
es, we identified 148
lotypes in our sample
the RDP II database
ity cutoff, this number
sequence similarity is
ate predictor of func-
sequence divergence
elate with the biologi-
defining species (also
y sequence similarity
the accepted standard
microbes. All sampled
d to taxonomic groups
nomic group using the phylogenetic analysis
described for rRNA. For example, our data set
marker genes, is roughly comparable to the
97% cutoff traditionally used for rRNA. Thus
Fig. 6. Phylogenetic diversity of Sargasso Sea sequences using multiple phylogenetic markers. The
relative contribution of organisms from different major phylogenetic groups (phylotypes) was
measured using multiple phylogenetic markers that have been used previously in phylogenetic
studies of prokaryotes: 16S rRNA, RecA, EF-Tu, EF-G, HSP70, and RNA polymerase B (RpoB). The
relative proportion of different phylotypes for each sequence (weighted by the depth of coverage
of the contigs from which those sequences came) is shown. The phylotype distribution was
determined as follows: (i) Sequences in the Sargasso data set corresponding to each of these genes
were identified using HMM and BLAST searches. (ii) Phylogenetic analysis was performed for each
phylogenetic marker identified in the Sargasso data separately compared with all members of that
gene family in all complete genome sequences (only complete genomes were used to control for
the differential sampling of these markers in GenBank). (iii) The phylogenetic affinity of each
sequence was assigned based on the classification of the nearest neighbor in the phylogenetic tree.
!27
method based on fitting the observed depth of
coverage to a theoretical model of assembly
progress for a sample corresponding to a mix-
that a minimum of 12-fold deeper sampling
would be required to obtain 95% of the unique
sequence. However, these are only lower
of se
ever
nity.
resen
know
scaff
cont
even
SAR
cove
fold,
21,0
popu
uted
V
key
proa
men
the r
isms
half
men
equa
colle
Table 2. Diversity of ubiquitous single copy protein coding phylogenetic markers. Protein column uses
symbols that identify six proteins encoded by exactly one gene in virtually all known bacteria. Sequence
ID specifies the GenBank identifier for corresponding E. coli sequence. Ortholog cutoff identifies BLASTx
e-value chosen to identify orthologs when querying the E. coli sequence against the complete Sargasso
Sea data set. Maximum fragment depth shows the number of reads satisfying the ortholog cutoff at the
point along the query for which this value is maximal. Observed “species” shows the number of distinct
clusters of reads from the maximum fragment depth column, after grouping reads whose containing
assemblies had an overlap of at least 40 bp with Ͼ 94% nucleotide identity (single-link clustering).
Singleton “species” shows the number of distinct clusters from the observed “species” column that
consist of a single read. Most abundant column shows the fraction of the maximum fragment depth that
consists of single largest cluster.
Protein Sequence ID
Ortholog
cutoff
Max.
fragment
depth
Observed
“species”
Singleton
“species”
Most
abundant
(%)
AtpD NTL01EC03653 1e-32 836 456 317 6
GyrB NTL01EC03620 1e-11 924 569 429 4
Hsp70 NT01EC0015 1e-31 812 515 394 4
RecA NTL01EC02639 1e-21 592 341 244 8
RpoB NTL01EC03885 1e-41 669 428 331 7
TufA NTL01EC03262 1e-41 597 397 307 3
Table 3. Diversity models based on depth of coverage. Each row corre-
sponds to an abundance class of organisms. The first column in each
model “fr(asm)” gives the fraction of the assembly consensus modeled
column) in the sample. The thi
a genome expected to be s
gives the resulting estimat
http://www.sciencemag.org/content/304/5667/66
Figure S6. Accumulation curve for rpoB. Observed (black) OTU counts for rpoB (based
on the fragment grouping summarized in Table 2), as well as the Chao1-corrected
estimate of total species (red; see (3)). Points are mean values of 1000 shufflings of the
observed data, while bars show 90% confidence intervals.
http://www.sciencemag.org/content/304/5667/66
What are they doing?
Community Function?
• Many attempts to treat community as a bag of genes
• The run pathway prediction tools on entire data set to try and
predict “community metabolism”
• Does not work very well
!31
Functional Inference from Metagenomics
• Can work well for individual genes
!32
What are they doing?
frames (5). A total of 69,901 novel genes be-
longing to 15,601 single link clusters were iden-
tified. The predicted genes were categorized
Fig. 5. Prochlorococcus-related scaffold 2223290 illustra
nity of closely related organisms, distinctly nonpunctate
global structure of Scaffold 2223290 with respect to asse
sequence alignment. Blue segments, contigs; green segme
stages of the assembly of fragments into the resulting c
fragments were initially assembled in several different
form the final contig structure. The multiple sequence
homogenous blend of haplotypes, none with sufficie
separate assembly.
Table 1. Gene count breakdown by TIGR role
category. Gene set includes those found on as-
semblies from samples 1 to 4 and fragment reads
from samples 5 to 7. A more detailed table, sep-
arating Weatherbird II samples from the Sorcerer II
samples is presented in the SOM (table S4). Note
that there are 28,023 genes which were classified
in more than one role category.
TIGR role category
Total
genes
Amino acid biosynthesis 37,118
Biosynthesis of cofactors,
prosthetic groups, and carriers
25,905
Cell envelope 27,883
Cellular processes 17,260
Central intermediary metabolism 13,639
DNA metabolism 25,346
Energy metabolism 69,718
Fatty acid and phospholipid
metabolism
18,558
Mobile and extrachromosomal
element functions
1,061
Protein fate 28,768
Protein synthesis 48,012
Purines, pyrimidines, nucleosides,
and nucleotides
19,912
Regulatory functions 8,392
Signal transduction 4,817
Transcription 12,756
Transport and binding proteins 49,185
Unknown function 38,067
Miscellaneous 1,864
Conserved hypothetical 794,061
Total number of roles assigned 1,242,230
Total number of genes 1,214,207
www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004
http://www.sciencemag.org/content/304/5667/66
Functional Diversity of Proteorhodopsins?
!34
http://www.sciencemag.org/content/304/5667/66
of the surface of a single cell3
. This
nt to produce substantial amounts of
efore, the high density of proteorho-
rane indicated by our calculations
otein has a signi®cant role in the
in amino-acid sequences were not restricted to the hydrophilic
ed membranes
e-treated membranes
stituted membranes
10–3
AU
10–1
s
at 500 nm of a Monterey Bay bacterioplankton
tion of hydroxylamine; middle, after 0.2 M
C, with 500-nm illumination for 30 min; bottom,
n in 100 mM phosphate buffer, pH 7.0, followed
incubation for 1 h.
MB 0m2
MB 0m1
MB 20m2
MB 20m5
MB 40m12
MB 100m9
HOT 75m3
HOT 75m8
0.01
MontereyBayandshallowHOT
AntarcticaanddeepHOT
HOT 75m1
PAL B1
PAL B5
PAL B6
PAL E7
PAL E1
PAL B7
PAL B2
PAL B8
MB 100m10
MB 100m5
MB 100m7
MB 20m12
MB 40m1
MB 40m5
BAC 40E8
BAC 31A8
PAL E6
BAC 64A5
HOT 0m1
HOT 75m4
Figure 3 Phylogenetic analysis of the inferred amino-acid sequence of cloned
proteorhodopsin genes. Distance analysis of 220 positions was used to calculate the tree
by neighbour-joining using the PaupSearch program of the Wisconsin Package version
10.0 (Genetics Computer Group; Madison, Wisconsin). H. salinarum bacteriorhodopsin
was used as an outgroup, and is not shown. Scale bar represents number of substitutions
per site. Bold names indicate the proteorhodopsins that were spectrally characterized in
this study. !35

More Related Content

What's hot

UC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomicsUC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomics
Jonathan Eisen
 
Macromolecule evolution
Macromolecule  evolutionMacromolecule  evolution
Macromolecule evolution
Paula Mills
 

What's hot (20)

EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9
 
UC Davis EVE161 Lecture 17 by @phylogenomics
 UC Davis EVE161 Lecture 17 by @phylogenomics UC Davis EVE161 Lecture 17 by @phylogenomics
UC Davis EVE161 Lecture 17 by @phylogenomics
 
UC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomicsUC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomics
 
EVE 161 Winter 2018 Class 17
EVE 161 Winter 2018 Class 17EVE 161 Winter 2018 Class 17
EVE 161 Winter 2018 Class 17
 
Microbial Phylogenomics (EVE161) Class 16: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 16: Shotgun MetagenomicsMicrobial Phylogenomics (EVE161) Class 16: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 16: Shotgun Metagenomics
 
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome SequencingMicrobial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
 
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
 
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeMicrobial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
 
EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15
 
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from UnculturedMicrobial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
 
EVE 161 Winter 2018 Class 13
EVE 161 Winter 2018 Class 13EVE 161 Winter 2018 Class 13
EVE 161 Winter 2018 Class 13
 
UC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomicsUC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomics
 
UC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomics
UC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomicsUC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomics
UC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomics
 
UC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomicsUC Davis EVE161 Lecture 10 by @phylogenomics
UC Davis EVE161 Lecture 10 by @phylogenomics
 
EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18EVE 161 Winter 2018 Class 18
EVE 161 Winter 2018 Class 18
 
Metagenomics analysis
Metagenomics  analysisMetagenomics  analysis
Metagenomics analysis
 
Evolution of DNA repair genes, proteins and processes
Evolution of DNA repair genes, proteins and processesEvolution of DNA repair genes, proteins and processes
Evolution of DNA repair genes, proteins and processes
 
Macromolecule evolution
Macromolecule  evolutionMacromolecule  evolution
Macromolecule evolution
 
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...
 
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
Phylogenetic and Phylogenomic Approaches to the Study of Microbes and Microbi...
 

Similar to Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics

2013.14HABposter.ERNEdit
2013.14HABposter.ERNEdit2013.14HABposter.ERNEdit
2013.14HABposter.ERNEdit
Alicia Romero
 
composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...
composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...
composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...
Mai wassel
 
U. littoralis research symposium poster
U. littoralis research symposium posterU. littoralis research symposium poster
U. littoralis research symposium poster
Lauren Stoneburner
 
Molecular and cytogenetic phylogeography of h. malabaricus
Molecular and cytogenetic phylogeography of h. malabaricusMolecular and cytogenetic phylogeography of h. malabaricus
Molecular and cytogenetic phylogeography of h. malabaricus
volcker
 
Spatiotemporal Analysis of Bacterial Diversity in Sediments
Spatiotemporal Analysis of Bacterial Diversity in SedimentsSpatiotemporal Analysis of Bacterial Diversity in Sediments
Spatiotemporal Analysis of Bacterial Diversity in Sediments
Pijush Basak
 
Grimmett et al., growth rate hypothesis
Grimmett et al., growth rate hypothesisGrimmett et al., growth rate hypothesis
Grimmett et al., growth rate hypothesis
Ivan Grimmett
 
Caribbean placozoan phylogeography
Caribbean placozoan phylogeographyCaribbean placozoan phylogeography
Caribbean placozoan phylogeography
dreicash
 
Gutell 058.pnas.1996.093.11907
Gutell 058.pnas.1996.093.11907Gutell 058.pnas.1996.093.11907
Gutell 058.pnas.1996.093.11907
Robin Gutell
 

Similar to Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics (20)

2013.14HABposter.ERNEdit
2013.14HABposter.ERNEdit2013.14HABposter.ERNEdit
2013.14HABposter.ERNEdit
 
composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...
composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...
composition-of-eukaryotic-and-prokaryotic-rrna-gene-phylotypes-in-guts-of-adu...
 
Marine microbiology
Marine microbiologyMarine microbiology
Marine microbiology
 
U. littoralis research symposium poster
U. littoralis research symposium posterU. littoralis research symposium poster
U. littoralis research symposium poster
 
Molecular and cytogenetic phylogeography of h. malabaricus
Molecular and cytogenetic phylogeography of h. malabaricusMolecular and cytogenetic phylogeography of h. malabaricus
Molecular and cytogenetic phylogeography of h. malabaricus
 
Molecular and cytogenetic phylogeography of h. malabaricus
Molecular and cytogenetic phylogeography of h. malabaricusMolecular and cytogenetic phylogeography of h. malabaricus
Molecular and cytogenetic phylogeography of h. malabaricus
 
Paper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdf
Paper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdfPaper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdf
Paper to Upload, MOLECULAR PHYLOGENY OF CATFISHES.pdf
 
Spatiotemporal Analysis of Bacterial Diversity in Sediments
Spatiotemporal Analysis of Bacterial Diversity in SedimentsSpatiotemporal Analysis of Bacterial Diversity in Sediments
Spatiotemporal Analysis of Bacterial Diversity in Sediments
 
Coastal marine ecosystem scientific paper
Coastal marine ecosystem scientific paper Coastal marine ecosystem scientific paper
Coastal marine ecosystem scientific paper
 
Grimmett et al., growth rate hypothesis
Grimmett et al., growth rate hypothesisGrimmett et al., growth rate hypothesis
Grimmett et al., growth rate hypothesis
 
Caribbean placozoan phylogeography
Caribbean placozoan phylogeographyCaribbean placozoan phylogeography
Caribbean placozoan phylogeography
 
Bacterial Community Profiling of the Arabian Sea Oxygen Minimum Zone Sediment...
Bacterial Community Profiling of the Arabian Sea Oxygen Minimum Zone Sediment...Bacterial Community Profiling of the Arabian Sea Oxygen Minimum Zone Sediment...
Bacterial Community Profiling of the Arabian Sea Oxygen Minimum Zone Sediment...
 
Metagenomics .pptx
Metagenomics .pptxMetagenomics .pptx
Metagenomics .pptx
 
Metagenomics , Applications, Techniques And Limitations .pptx
Metagenomics , Applications, Techniques And Limitations .pptxMetagenomics , Applications, Techniques And Limitations .pptx
Metagenomics , Applications, Techniques And Limitations .pptx
 
Ecomorfologia
EcomorfologiaEcomorfologia
Ecomorfologia
 
2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...
2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...
2019 - Profiling of filamentous bacteria in activated sludge by 16s RNA ampli...
 
Microbial diversity in deep sediments of Lake Baikal
Microbial diversity in deep sediments of Lake BaikalMicrobial diversity in deep sediments of Lake Baikal
Microbial diversity in deep sediments of Lake Baikal
 
Longo et al. 2014
Longo et al. 2014Longo et al. 2014
Longo et al. 2014
 
The effects of different water quality parameters on zooplankton distribution...
The effects of different water quality parameters on zooplankton distribution...The effects of different water quality parameters on zooplankton distribution...
The effects of different water quality parameters on zooplankton distribution...
 
Gutell 058.pnas.1996.093.11907
Gutell 058.pnas.1996.093.11907Gutell 058.pnas.1996.093.11907
Gutell 058.pnas.1996.093.11907
 

More from Jonathan Eisen

EVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID VaccinesEVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID Vaccines
Jonathan Eisen
 

More from Jonathan Eisen (20)

Eisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdfEisen.CentralValley2024.pdf
Eisen.CentralValley2024.pdf
 
Phylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of MicrobesPhylogenomics and the Diversity and Diversification of Microbes
Phylogenomics and the Diversity and Diversification of Microbes
 
Talk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meetingTalk by Jonathan Eisen for LAMG2022 meeting
Talk by Jonathan Eisen for LAMG2022 meeting
 
Thoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current ActionsThoughts on UC Davis' COVID Current Actions
Thoughts on UC Davis' COVID Current Actions
 
A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2A Field Guide to Sars-CoV-2
A Field Guide to Sars-CoV-2
 
EVE198 Summer Session Class 4
EVE198 Summer Session Class 4EVE198 Summer Session Class 4
EVE198 Summer Session Class 4
 
EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1 EVE198 Summer Session 2 Class 1
EVE198 Summer Session 2 Class 1
 
EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines EVE198 Summer Session 2 Class 2 Vaccines
EVE198 Summer Session 2 Class 2 Vaccines
 
EVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 IntroductionEVE198 Spring2021 Class1 Introduction
EVE198 Spring2021 Class1 Introduction
 
EVE198 Spring2021 Class2
EVE198 Spring2021 Class2EVE198 Spring2021 Class2
EVE198 Spring2021 Class2
 
EVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 VaccinesEVE198 Spring2021 Class5 Vaccines
EVE198 Spring2021 Class5 Vaccines
 
EVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA DetectionEVE198 Winter2020 Class 8 - COVID RNA Detection
EVE198 Winter2020 Class 8 - COVID RNA Detection
 
EVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 IntroductionEVE198 Winter2020 Class 1 Introduction
EVE198 Winter2020 Class 1 Introduction
 
EVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID TestingEVE198 Winter2020 Class 3 - COVID Testing
EVE198 Winter2020 Class 3 - COVID Testing
 
EVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID VaccinesEVE198 Winter2020 Class 5 - COVID Vaccines
EVE198 Winter2020 Class 5 - COVID Vaccines
 
EVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID TransmissionEVE198 Winter2020 Class 9 - COVID Transmission
EVE198 Winter2020 Class 9 - COVID Transmission
 
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 VaccinesEVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
EVE198 Fall2020 "Covid Mass Testing" Class 8 Vaccines
 
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and TestingEVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
EVE198 Fall2020 "Covid Mass Testing" Class 2: Viruses, COIVD and Testing
 
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 IntroductionEVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
EVE198 Fall2020 "Covid Mass Testing" Class 1 Introduction
 
Evolution of microbiomes and the evolution of the study and politics of micro...
Evolution of microbiomes and the evolution of the study and politics of micro...Evolution of microbiomes and the evolution of the study and politics of micro...
Evolution of microbiomes and the evolution of the study and politics of micro...
 

Recently uploaded

Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Sérgio Sacani
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
AlMamun560346
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
RohitNehra6
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Sérgio Sacani
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
Sérgio Sacani
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 

Recently uploaded (20)

VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Creating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsCreating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening Designs
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 

Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics

  • 1. EVE 161:
 Microbial Phylogenomics Era IV: Shotgun Metagenomics UC Davis, Winter 2016 Instructor: Jonathan Eisen !1
  • 2. !2 Environmental Genome Shotgun Sequencing of the Sargasso Sea J. Craig Venter,1 * Karin Remington,1 John F. Heidelberg,3 Aaron L. Halpern,2 Doug Rusch,2 Jonathan A. Eisen,3 Dongying Wu,3 Ian Paulsen,3 Karen E. Nelson,3 William Nelson,3 Derrick E. Fouts,3 Samuel Levy,2 Anthony H. Knap,6 Michael W. Lomas,6 Ken Nealson,5 Owen White,3 Jeremy Peterson,3 Jeff Hoffman,1 Rachel Parsons,6 Holly Baden-Tillson,1 Cynthia Pfannkoch,1 Yu-Hui Rogers,4 Hamilton O. Smith1 We have applied “whole-genome shotgun sequencing” to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs chlo pho wer from Feb lect stat are S1; one was gen 2 to prep both Cra nolo ers RESEARCH ARTICLE
  • 3. We have applied “whole-genome shotgun sequencing” to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These data are estimated to derive from at least 1800 genomic species based on sequence relatedness, including 148 previously unknown bacterial phylotypes. We have identified over 1.2 million previously unknown genes represented in these samples, including more than 782 new rhodopsin-like photoreceptors. Variation in species present and stoichiometry suggests substantial oceanic microbial diversity.
  • 4. Sargasso Sea tinct strains closely related to the published with shading in table S3 with nine depicted in Fig. 1. MODIS-Aqua satellite image of ocean chlorophyll in the Sargasso Sea grid about the BATS site from 22 February 2003. The station locations are overlain with their respective identifications. Note the elevated levels of chlorophyll (green color shades) around station 3, which are not present around stations 11 and 13. Fig. 2. Gene conser- !4 http://www.sciencemag.org/content/304/5667/66
  • 5. • Why did the authors focus on the Sargasso Sea?
  • 6. The Sargasso Sea. The northwest Sargasso Sea, at the Bermuda Atlantic Time-series Study site (BATS), is one of the best-studied and arguably most well-characterized regions of the global ocean. The Gulf Stream represents the western and northern boundaries of this region and provides a strong physical boundary, separating the low nutrient, oligotrophic open ocean from the more nutrient-rich waters of the U.S. continental shelf. The Sargasso Sea has been intensively studied as part of the 50-year time series of ocean physics and biogeochemistry (3, 4) and provides an opportunity for interpretation of environmental genomic data in an oceanographic context. In this region, formation of subtropical mode water occurs each winter as the passage of cold fronts across the region erodes the seasonal thermocline and causes convective mixing, resulting in mixed layers of 150 to 300 m depth. The introduction of nutrient-rich deep water, following the breakdown of seasonal thermoclines into the brightly lit surface waters, leads to the blooming of single cell phytoplankton, including two cyanobacteria species, Synechococcus and Prochlorococcus, that numerically dominate the photosynthetic biomass in the Sargasso Sea.
  • 7.
  • 8. Sargasso Sea (from Wikipedia)
  • 9. Sargasso Sea (from Wikipedia)
  • 10.
  • 13.
  • 14. Vol. 57, No. 6APPLIED AND ENVIRONMENTAL MICROBIOLOGY, June 1991, p. 1707-1713 0099-2240/91/061707-07$02.00/0 Copyright C) 1991, American Society for Microbiology Phylogenetic Analysis of a Natural Marine Bacterioplankton Population by rRNA Gene Cloning and Sequencing THERESA B. BRITSCHGI AND STEPHEN J. GIOVANNONI* Department of Microbiology, Oregon State University, Corvallis, Oregon 97331 Received 5 February 1991/Accepted 25 March 1991 The identification of the prokaryotic species which constitute marine bacterioplankton communities has been a long-standing problem in marine microbiology. To address this question, we used the polymerase chain reaction to construct and analyze a library of 51 small-subunit (16S) rRNA genes cloned from Sargasso Sea bacterioplankton genomic DNA. Oligonucleotides complementary to conserved regions in the 16S rDNAs of eubacteria were used to direct the synthesis of polymerase chain reaction products, which were then cloned by blunt-end ligation into the phagemid vector pBluescript. Restriction fragment length polymorphisms and hybridizations to oligonucleotide probes for the SARll and marine Synechococcus phylogenetic groups indicated the presence of at least seven classes of genes. The sequences of five unique rDNAs were determined completely. In addition to 16S rRNA genes from the marine Synechococcus cluster and the previously identified but uncultivated microbial group, the SARll cluster [S. J. Giovannoni, T. B. Britschgi, C. L. Moyer, and K. G. Field, Nature (London) 345:60-63], two new gene classes were observed. Phylogenetic comparisons indicated that these belonged to unknown species of a- and -y-proteobacteria. The data confirm the earlier conclusion that a majority of planktonic bacteria are new species previously unrecognized by bacteriologists. Within the past decade, it has become evident that bacte- rioplankton contribute significantly to biomass and bio- geochemical activity in planktonic systems (4, 36). Until recently, progress in identifying the microbial species which constitute these communities was slow, because the major- ity of the organisms present (as measured by direct counts) cannot be recovered in cultures (5, 14, 16). The discovery a decade ago that unicellular cyanobacteria are widely distrib- uted in the open ocean (34) and the more recent observation of abundant marine prochlorophytes (3) have underscored the uncertainty of our knowledge about bacterioplankton community structure. Molecular approaches are now providing genetic markers for the dominant bacterial species in natural microbial pop- ulations (6, 21, 33). The immediate objectives of these studies are twofold: (i) to identify species, both known and novel, by reference to sequence data bases; and (ii) to construct species-specific probes which can be used in quantitative ecological studies (7, 29). Previously, we reported rRNA genetic markers and quan- titative hybridization data which demonstrated that a novel lineage belonging to the oa-proteobacteria was abundant in the Sargasso Sea; we also identified novel lineages which were very closely related to cultivated marine Synechococ- cus spp. (6). In a study of a hot-spring ecosystem, Ward and coworkers examined cloned fragments of 16S rRNA cDNAs and found numerous novel lineages but no matches to genes from cultivated hot-spring bacteria (33). These results sug- gested that natural ecosystems in general may include spe- cies which are unknown to microbiologists. Although any gene may be used as a genetic marker, rRNA genes offer distinct advantages. The extensive use of 16S rRNAs for studies of microbial systematics and evolu- tion has resulted in large computer data bases, such as the RNA Data Base Project, which encompass the phylogenetic diversity found within culture collections. rRNA genes are highly conserved and therefore can be used to examine distant phylogenetic relationships with accuracy. The pres- ence of large numbers of ribosomes within cells and the regulation of their biosyntheses in proportion to cellular growth rates make these molecules ideally suited for ecolog- ical studies with nucleic acid probes. Here we describe the partial analysis of a 16S rDNA library prepared from photic zone Sargasso Sea bacterio- plankton using a general approach for cloning eubacterial 16S rDNAs. The Sargasso Sea, a central oceanic gyre, typifies the oligotrophic conditions of the open ocean, pro- viding an example of a microbial community adapted to extremely low-nutrient conditions. The results extend our previous observations of high genetic diversity among closely related lineages within this microbial community and provide rDNA genetic markers for two novel eubacterial lineages. MATERIALS AND METHODS Collection and amplification of rDNA. A surface (2-m) bacterioplankton population was collected from hydrosta- tion S (32°4'N, 64°23'W) in the Sargasso Sea as previously described (8). The polymerase chain reaction (24) was used to produce double-stranded rDNA from the mixed-microbi- al-population genomic DNA prepared from the plankton sample. The amplification primers were complementary to conserved regions in the 5' and 3' regions of the 16S rRNA genes. The amplification primers were the universal 1406R (ACGGGCGGTGTGTRC) and eubacterial 68F (TNANAC ATGCAAGTCGAKCG) (17). The reaction conditions were as follows: 1 ,ug of template DNA, 10 ,ul of 10x reaction buffer (500 mM KCI, 100 mM Tris HCI [pH 9.0 at 250C], 15 mM MgCl2, 0.1% gelatin, 1% Triton X-100), 1 U of Taq DNA polymerase (Promega Biological Research Products, Madi- son, Wis.), 1 ,uM (each) primer, 200 ,uM (each) dATP, dCTP, dGTP, and dTTP in a 100-,ul total volume; 1 min at 94°C, 1 atUNIVOFCALIFDAVISonMay18,2010aem.asm.orgDownloadedfrom MARINE BACTERIOPLANKTON 1711 Escherichia coli Vibrio harveyi inmm Vibrio anguillarum - Oceanospirillum linum - SAR 92 Caulobacter crescentus SAR83 - Erythrobacter OCh 114 - Hyphomicrobium vulgare Rhodopseudomonas marina Rochalimaea quintana Agrobacterium tumefaciens SARI -Es SAR95 SAR1I CZ * C)0._ u 10C) 00 .0 0Q C) 0 0 O)4.. eP. - Liverwort CP *Anacystis nidulans - Prochlorothrix - SAR6 * SAR7 r SARIOO SAR139 0 p.0 0 0 0.05 0.10 0.15 FIG. 4. Phylogenetic tree showing relationships of the rDNA clones from the Sargasso Sea to representative, cultivated species (2, 6, 18, 30, 31, 32, 35, 40). Positions of uncertain homology in regions containing insertions and deletions were omitted from the analysis. Evolutionary distances were calculated by the method of Jukes and Cantor (15), which corrects for the effects of superimposed mutations. The phylogenetic tree was determined by a distance matrix method (20). The tree was rooted with the sequence of Bacillus subtilis (38). Sequence data not referenced were provided by C. R. Woese and R. Rossen. phototroph phyla were conserved in the corresponding 880 (G - C to C G). We note with caution that some classes VOL. 57, 1991 I I atUNIVaem.asm.orgDownloadedfrom
  • 15. • What did they sample and how did they sample?
  • 16. Sargasso Sea tinct strains closely related to the published with shading in table S3 with nine depicted in Fig. 1. MODIS-Aqua satellite image of ocean chlorophyll in the Sargasso Sea grid about the BATS site from 22 February 2003. The station locations are overlain with their respective identifications. Note the elevated levels of chlorophyll (green color shades) around station 3, which are not present around stations 11 and 13. Fig. 2. Gene conser- !16 http://www.sciencemag.org/content/304/5667/66
  • 17. Surface water samples (170 to 200 liters) were collected aboard the RV Weatherbird II from three sites off the coast of Bermuda in February 2003. Additional samples were collected aboard the SV Sorcerer II from “Hydrostation S” in May 2003. Sample site locations are indicated on Fig. 1 and described in table S1; sampling protocols were fine-tuned from one expedition to the next (5). Genomic DNA was extracted from filters of 0.1 to 3.0 μm, and genomic libraries with insert sizes ranging from 2 to 6 kb were made as described (5). The prepared plasmid clones were sequenced from both ends to provide paired-end reads at the J. Craig Venter Science Foundation Joint Technology Center on ABI 3730XL DNA sequencers (Applied Biosystems, Foster City, CA). Whole- genome random shotgun sequencing of the Weatherbird II samples (table S1, samples 1 to 4) produced 1.66 million reads averaging 818 bp in length, for a total of approximately 1.36 Gbp of microbial DNA sequence. An additional 325,561 sequences were generated from the Sorcerer II samples (table S1, samples 5 to 7), yielding approximately 265 Mbp of DNA sequence.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22. • What are some questions one might want to answer about the Sargasso Sea samples / sequences?
  • 23. Who is out there?
  • 24. rRNA phylotyping from metagenomics !24 http://www.sciencemag.org/content/304/5667/66
  • 25. • How can one do phylotyping with genes other than rRNA?
  • 26. Shotgun Sequencing Allows Alternative Anchors (e.g., RecA)
  • 27. Who is out there? nt can be used as an- uish eukaryotic mate- inverse relation was e size of the pre-filters d the fraction of se- (table S5). This rela- resence of 18S rRNA strong evidence that ndeed captured. cies richness. Most uncultured organisms dies of rRNA genes reaction (PCR) with ved positions in those small subunit rRNA versity of prokaryotic (17). However, PCR- ly biased, because not with the same “univer- shotgun sequence data ntified 1164 distinct nes or fragments of rd II assemblies and Sorcerer II reads (5). milarity cutoff to dis- es, we identified 148 lotypes in our sample the RDP II database ity cutoff, this number sequence similarity is ate predictor of func- sequence divergence elate with the biologi- defining species (also y sequence similarity the accepted standard microbes. All sampled d to taxonomic groups nomic group using the phylogenetic analysis described for rRNA. For example, our data set marker genes, is roughly comparable to the 97% cutoff traditionally used for rRNA. Thus Fig. 6. Phylogenetic diversity of Sargasso Sea sequences using multiple phylogenetic markers. The relative contribution of organisms from different major phylogenetic groups (phylotypes) was measured using multiple phylogenetic markers that have been used previously in phylogenetic studies of prokaryotes: 16S rRNA, RecA, EF-Tu, EF-G, HSP70, and RNA polymerase B (RpoB). The relative proportion of different phylotypes for each sequence (weighted by the depth of coverage of the contigs from which those sequences came) is shown. The phylotype distribution was determined as follows: (i) Sequences in the Sargasso data set corresponding to each of these genes were identified using HMM and BLAST searches. (ii) Phylogenetic analysis was performed for each phylogenetic marker identified in the Sargasso data separately compared with all members of that gene family in all complete genome sequences (only complete genomes were used to control for the differential sampling of these markers in GenBank). (iii) The phylogenetic affinity of each sequence was assigned based on the classification of the nearest neighbor in the phylogenetic tree. !27
  • 28. method based on fitting the observed depth of coverage to a theoretical model of assembly progress for a sample corresponding to a mix- that a minimum of 12-fold deeper sampling would be required to obtain 95% of the unique sequence. However, these are only lower of se ever nity. resen know scaff cont even SAR cove fold, 21,0 popu uted V key proa men the r isms half men equa colle Table 2. Diversity of ubiquitous single copy protein coding phylogenetic markers. Protein column uses symbols that identify six proteins encoded by exactly one gene in virtually all known bacteria. Sequence ID specifies the GenBank identifier for corresponding E. coli sequence. Ortholog cutoff identifies BLASTx e-value chosen to identify orthologs when querying the E. coli sequence against the complete Sargasso Sea data set. Maximum fragment depth shows the number of reads satisfying the ortholog cutoff at the point along the query for which this value is maximal. Observed “species” shows the number of distinct clusters of reads from the maximum fragment depth column, after grouping reads whose containing assemblies had an overlap of at least 40 bp with Ͼ 94% nucleotide identity (single-link clustering). Singleton “species” shows the number of distinct clusters from the observed “species” column that consist of a single read. Most abundant column shows the fraction of the maximum fragment depth that consists of single largest cluster. Protein Sequence ID Ortholog cutoff Max. fragment depth Observed “species” Singleton “species” Most abundant (%) AtpD NTL01EC03653 1e-32 836 456 317 6 GyrB NTL01EC03620 1e-11 924 569 429 4 Hsp70 NT01EC0015 1e-31 812 515 394 4 RecA NTL01EC02639 1e-21 592 341 244 8 RpoB NTL01EC03885 1e-41 669 428 331 7 TufA NTL01EC03262 1e-41 597 397 307 3 Table 3. Diversity models based on depth of coverage. Each row corre- sponds to an abundance class of organisms. The first column in each model “fr(asm)” gives the fraction of the assembly consensus modeled column) in the sample. The thi a genome expected to be s gives the resulting estimat http://www.sciencemag.org/content/304/5667/66
  • 29. Figure S6. Accumulation curve for rpoB. Observed (black) OTU counts for rpoB (based on the fragment grouping summarized in Table 2), as well as the Chao1-corrected estimate of total species (red; see (3)). Points are mean values of 1000 shufflings of the observed data, while bars show 90% confidence intervals. http://www.sciencemag.org/content/304/5667/66
  • 30. What are they doing?
  • 31. Community Function? • Many attempts to treat community as a bag of genes • The run pathway prediction tools on entire data set to try and predict “community metabolism” • Does not work very well !31
  • 32. Functional Inference from Metagenomics • Can work well for individual genes !32
  • 33. What are they doing? frames (5). A total of 69,901 novel genes be- longing to 15,601 single link clusters were iden- tified. The predicted genes were categorized Fig. 5. Prochlorococcus-related scaffold 2223290 illustra nity of closely related organisms, distinctly nonpunctate global structure of Scaffold 2223290 with respect to asse sequence alignment. Blue segments, contigs; green segme stages of the assembly of fragments into the resulting c fragments were initially assembled in several different form the final contig structure. The multiple sequence homogenous blend of haplotypes, none with sufficie separate assembly. Table 1. Gene count breakdown by TIGR role category. Gene set includes those found on as- semblies from samples 1 to 4 and fragment reads from samples 5 to 7. A more detailed table, sep- arating Weatherbird II samples from the Sorcerer II samples is presented in the SOM (table S4). Note that there are 28,023 genes which were classified in more than one role category. TIGR role category Total genes Amino acid biosynthesis 37,118 Biosynthesis of cofactors, prosthetic groups, and carriers 25,905 Cell envelope 27,883 Cellular processes 17,260 Central intermediary metabolism 13,639 DNA metabolism 25,346 Energy metabolism 69,718 Fatty acid and phospholipid metabolism 18,558 Mobile and extrachromosomal element functions 1,061 Protein fate 28,768 Protein synthesis 48,012 Purines, pyrimidines, nucleosides, and nucleotides 19,912 Regulatory functions 8,392 Signal transduction 4,817 Transcription 12,756 Transport and binding proteins 49,185 Unknown function 38,067 Miscellaneous 1,864 Conserved hypothetical 794,061 Total number of roles assigned 1,242,230 Total number of genes 1,214,207 www.sciencemag.org SCIENCE VOL 304 2 APRIL 2004 http://www.sciencemag.org/content/304/5667/66
  • 34. Functional Diversity of Proteorhodopsins? !34 http://www.sciencemag.org/content/304/5667/66
  • 35. of the surface of a single cell3 . This nt to produce substantial amounts of efore, the high density of proteorho- rane indicated by our calculations otein has a signi®cant role in the in amino-acid sequences were not restricted to the hydrophilic ed membranes e-treated membranes stituted membranes 10–3 AU 10–1 s at 500 nm of a Monterey Bay bacterioplankton tion of hydroxylamine; middle, after 0.2 M C, with 500-nm illumination for 30 min; bottom, n in 100 mM phosphate buffer, pH 7.0, followed incubation for 1 h. MB 0m2 MB 0m1 MB 20m2 MB 20m5 MB 40m12 MB 100m9 HOT 75m3 HOT 75m8 0.01 MontereyBayandshallowHOT AntarcticaanddeepHOT HOT 75m1 PAL B1 PAL B5 PAL B6 PAL E7 PAL E1 PAL B7 PAL B2 PAL B8 MB 100m10 MB 100m5 MB 100m7 MB 20m12 MB 40m1 MB 40m5 BAC 40E8 BAC 31A8 PAL E6 BAC 64A5 HOT 0m1 HOT 75m4 Figure 3 Phylogenetic analysis of the inferred amino-acid sequence of cloned proteorhodopsin genes. Distance analysis of 220 positions was used to calculate the tree by neighbour-joining using the PaupSearch program of the Wisconsin Package version 10.0 (Genetics Computer Group; Madison, Wisconsin). H. salinarum bacteriorhodopsin was used as an outgroup, and is not shown. Scale bar represents number of substitutions per site. Bold names indicate the proteorhodopsins that were spectrally characterized in this study. !35