UC Davis EVE161 Lecture 11 by @phylogenomics
1. Lecture 10:
EVE 161:
Microbial Phylogenomics
Lecture #10:
Era III: Genome Sequencing
UC Davis, Winter 2014
Instructor: Jonathan Eisen
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
2. Where we are going and where we have been
• Previous lecture: Lecture 10, Genome Sequencing
• Current lecture: Lecture 11, Genome Sequencing II
• Next lecture: Lecture 12, Genome Sequencing III
5. Structural Diversity
...chromosome encodes all the housekeeping functions. Because plasmids typically have only...
TABLE 7.1. Examples of bacteria with multiple genetic elements

Species                       Form               Size (kb)   Shape
Streptomyces coelicolor       Chromosome         8667        Linear
                              Plasmid            356         Linear
                              Plasmid            31          Circular
Agrobacterium tumefaciens     Chromosome         2842        Circular
                              Chromosome         2057        Linear
                              Plasmid            543         Circular
                              Plasmid            214         Circular
Borrelia burgdorferi          Chromosome         911         Linear
                              Plasmid (n = 11)   9–54        Circular/Linear
Brucella melitensis           Chromosome         2117        Circular
                              Chromosome         1178        Circular
Clostridium acetobutylicum    Chromosome         3941        Circular
                              Plasmid            192         Circular
Deinococcus radiodurans       Chromosome         2649        Circular
                              Plasmid            412         Circular
                              Plasmid            177         Circular
                              Plasmid            46          Circular
Ralstonia solanacearum        Chromosome         3716        Circular
                              Chromosome?        2095        Circular
Salmonella typhi              Chromosome         4809        Circular
                              Plasmid            218         Circular
                              Plasmid            107         Circular
Sinorhizobium meliloti        Chromosome         3654        Circular
                              Plasmid            1683        Circular
                              Plasmid            1354        Circular
Vibrio cholerae               Chromosome         2941        Circular
                              Chromosome         1072        Circular
Yersinia pestis               Chromosome         4654        Circular
                              Plasmid (n = 3)    10–96       Circular
Based on Bentley S.D. and Parkhill J. Annu. Rev. Genet. 38: 771–792, as adapted from Ohmachi M. 2002. Curr. Biol. 12: R427–428.
6. What is a Plasmid
TABLE 7.2. Plasmid functions

Genetic function of plasmid   Gene functions                                 Examples
Resistance                    Antibiotic resistance                          Rbk plasmid of Escherichia coli and other bacteria
Fertility                     Conjugation and DNA transfer                   F plasmid of E. coli
Killer                        Synthesis of toxins that kill other bacteria   Col plasmids of E. coli, for colicin production
Degradative                   Enzymes for metabolism of unusual molecules    TOL plasmid of Pseudomonas putida, for toluene metabolism
Virulence                     Pathogenicity                                  Ti plasmid of Agrobacterium tumefaciens, conferring the ability to cause crown gall disease on dicotyledonous plants
8. Genome Size
[Figure 7.1: genome sizes and size ranges on a log scale, number of base pairs from 1 × 10^5 to 1 × 10^13. Eukaryotes: Leishmania major, Arabidopsis thaliana, Guillardia theta, Human, Fern, Schizosaccharomyces pombe, Moss, Cockroach, Paramecium tetraurelia, Amoeba dubia. Bacteria: Escherichia coli, P. marius, Myxobacteria, Bradyrhizobium japonicum. Archaea: Nanoarchaeum equitans, Methanosarcina acetivorans.]
FIGURE 7.1. Genome sizes in the three domains of life. A selection of genome sizes and size ranges from specific groups of organisms is indicated.
10. Gene Density
[Figure 7.2: two 50-kb segments, (A) Human and (B) Escherichia coli, each drawn on a 0–50 kb scale. Key: gene, human pseudogene, repetitive DNA element.]
FIGURE 7.2. Genome density. Comparison of the genome density and content of humans and Escherichia coli. Each segment is 50 kb in length and represents (A) a portion of the human β T-cell receptor locus and (B) a region of the E. coli K12 genome. Note the much greater proportion of genes (red boxes) in E. coli compared to humans.
Eukaryotic genomes are bulky in part because they contain large numbers of repetitive DNA elements (Fig. 7.2). Common eukaryotic repetitive DNA elements include simple sequence repeats (e.g., microsatellites and minisatellites), gene duplications (both tandem arrays and pseudogenes), and transposable elements. Although bacterial and ar...
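The contrast shown in Figure 7.2 can be expressed as a coding fraction: the share of a segment covered by at least one gene. The sketch below is only an illustration. The function name `coding_fraction` and all interval coordinates are hypothetical and are not read from the figure.

```python
def coding_fraction(genes, segment_length):
    """Return the fraction of a segment covered by at least one gene.

    genes: list of (start, end) half-open intervals in base pairs,
           with start >= 0 (hypothetical coordinates).
    segment_length: length of the segment in base pairs.
    """
    covered = 0
    current_end = 0  # right edge of the bases already counted
    for start, end in sorted(genes):
        start = max(start, current_end)   # skip bases already counted
        end = min(end, segment_length)    # clip to the segment
        if end > start:
            covered += end - start
            current_end = end
    return covered / segment_length


# Hypothetical toy segments of 5,000 bp each (not real annotations).
dense_segment = [(0, 1200), (1250, 2400), (2300, 4000)]   # tightly packed genes
sparse_segment = [(1000, 1500)]                            # one gene, mostly non-genic

print(coding_fraction(dense_segment, 5000))
print(coding_fraction(sparse_segment, 5000))
```

Overlapping genes are counted once, so the value stays between 0 and 1 even for compact genomes where neighboring genes overlap.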
12. Number of Genes
[Figure 7.3: scatterplot of number of genes (y-axis, 0 to 30,000) against genome size (x-axis, log scale, 10^5 to 10^10) for Bacteria, Eukaryotes, Viruses, and Archaea.]
FIGURE 7.3. Genome size vs. number of protein-coding genes. The number of genes is highly correlated to genome size for bacteria, archaea, and viruses, but less so for eukaryotes. Many archaeal points (blue triangles) are hidden under bacterial ones (yellow squares).
...DNA or selfish DNA. Junk DNA appears to provide little benefit or no function to the organism. (In some cases this designation is a misnomer resulting from a lack of information. Some stretches of “junk DNA” have been determined to be involved in gene reg...
14. Operons
[Figure 7.4: diagram of the lac operon showing the CAP site, promoter, operator, lacZ (β-galactosidase, splits lactose into galactose + glucose), lacY (lactose permease, transports lactose into the cell), and lacA (transacetylase), with chemical structures of lactose, galactose, and glucose.]
FIGURE 7.4. Lac operon from Escherichia coli. This operon consists of three genes whose transcription is regulated by a single promoter. The genes encode proteins involved in utilizing lactose, including a permease (encoded by lacY), which brings lactose into the cell from the outside, and two enzymes (encoded by lacZ and lacA), which split lactose into glucose + galactose (see pp. 52–53).
17. E. coli shared Genes
[Figure 7.7: three-circle Venn diagram of MG1655 (K-12, nonpathogenic), CFT073 (uropathogenic), and EDL933 (O157:H7, enterohemorrhagic). Region counts as extracted: 193, 585, 1623, 2996, 514, 204, 1346.]
FIGURE 7.7. Number of shared proteins between strains of Escherichia coli. Note the large number of genes found in one strain but not the others (seen in the outer portions of each circle).
...substantial variation in gene content among members of the same species have been reported in other lineages of bacteria and archaea. Thus, the diminishing number of core orthologous genes is simply an extension of something happening among close relatives.
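A diagram like Figure 7.7 can be read as set arithmetic over gene families. The following is a minimal sketch, assuming each strain is represented as a set of ortholog family identifiers. The function name `partition_gene_content` and the toy identifiers are hypothetical, and the resulting counts are not those of the figure.

```python
def partition_gene_content(strains):
    """Split gene content into a shared core and strain-specific sets.

    strains: dict mapping strain name -> set of ortholog family ids.
    Returns (core, unique), where core is the set of families present
    in every strain and unique maps each strain to the families found
    in that strain only.
    """
    core = set.intersection(*strains.values())
    unique = {}
    for name, families in strains.items():
        # Union of the families seen in every other strain.
        others = set().union(*(f for n, f in strains.items() if n != name))
        unique[name] = families - others
    return core, unique


# Toy input: family ids are made up for illustration only.
toy_strains = {
    "MG1655": {"f1", "f2", "f3", "f4"},
    "CFT073": {"f1", "f2", "f5", "f6"},
    "EDL933": {"f1", "f2", "f4", "f7"},
}

core, unique = partition_gene_content(toy_strains)
print(sorted(core))
print({name: sorted(families) for name, families in unique.items()})
```

Adding more strains to the input can only shrink or preserve the core, which is the "diminishing number of core orthologous genes" effect described above.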
23. Origin of replication
Terminus of replication
Artificially Open Circle
[Diagram labels: Origin, Terminus, Origin Again]
24. Origin of replication
Terminus of replication
Artificially Open Circle
[Diagram labels: Genome 1 and Genome 2, each marked with origin (O) and terminus (T); linearized axis labeled Origin, Terminus, Origin Again]
26. E. coli O157:H7
[Figure 7.10: dot plot of E. coli O157:H7 against E. coli K12, with three circled regions labeled Repeat, Island, and Inversion.]
FIGURE 7.10. Conserved gene order in the backbone of Escherichia coli K12 and O157:H7. The two genomes were aligned with each other and the matching regions were plotted. The conserved order of genes in the backbone of the two E. coli strains is indicated by the diagonal line. Three important genomic regions are circled. An island present in one of the two strains causes a slight shift in the position of the main diagonal.
...they also occur in virtually the same order in both strains (Fig. 7.10). The genes unique to each strain are clustered into “islands” interspersed among the stretches of common genes. Similar patterns of DNA “islands” within a conserved genome backbone have been found among other related bacteria or archaea.
How do these islands originate? There are two possibilities: insertion of DNA into...
27. Gene order in H. influenzae vs. H. pylori (Figure 7.11)
[Figure 7.11: dot plot with the H. influenzae Rd chromosome on the x-axis (0 to 1,830,137) and the H. pylori 26695 chromosome on the y-axis (0 to 1,667,867).]
FIGURE 7.11. The lack of conservation of gene order between Haemophilus influenzae and Helicobacter pylori is illustrated. Linearized chromosomes of H. influenzae and H. pylori are plotted on the horizontal and vertical axes, respectively. Each dot represents a single pair of orthologous proteins. Genes in similar operons, which do exist, are too close together to give separated points on the scale used.
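A dot plot of this kind can be assembled from two gene-position tables and a list of ortholog pairs. The sketch below assumes positions are given as one coordinate per gene. All names, coordinates, and pairs are hypothetical, and `fraction_order_preserved` is a crude summary introduced here for illustration, not a statistic used in the lecture or the textbook.

```python
def dotplot_points(pos_a, pos_b, orthologs):
    """Return (x, y) points, one per ortholog pair, sorted by x.

    pos_a, pos_b: dicts mapping gene id -> coordinate on each chromosome.
    orthologs: iterable of (gene_in_a, gene_in_b) pairs.
    Pairs with a gene missing from either table are skipped.
    """
    points = [(pos_a[a], pos_b[b])
              for a, b in orthologs
              if a in pos_a and b in pos_b]
    return sorted(points)


def fraction_order_preserved(points):
    """Fraction of neighboring points (along x) whose y also increases."""
    if len(points) < 2:
        return 0.0
    kept = sum(1 for (_, y1), (_, y2) in zip(points, points[1:]) if y2 > y1)
    return kept / (len(points) - 1)


# Hypothetical coordinates for four genes on each chromosome.
pos_a = {"a1": 100, "a2": 200, "a3": 300, "a4": 400}
pos_b = {"b1": 50, "b2": 150, "b3": 250, "b4": 350}

colinear = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
scrambled = [("a1", "b3"), ("a2", "b1"), ("a3", "b4"), ("a4", "b2")]

print(fraction_order_preserved(dotplot_points(pos_a, pos_b, colinear)))
print(fraction_order_preserved(dotplot_points(pos_a, pos_b, scrambled)))
```

Colinear genomes put every point on one diagonal, while a comparison with no conserved gene order scatters the points across the plot.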
...mon is symmetric inversion around the origin of replication (Fig. 7.14). Such inversions are seen in almost every comparison of moderately closely related strains or species. Although other rearrangements occur, the symmetric inversions serve as a useful tool for understanding some features of general evolution and we focus on them here.
Symmetric inversions around the origin are due to a combination of mutation bias and selection bias. To understand how mutation bias could cause this, it is helpful to un...
30. V. cholerae vs. E. coli All
[Dot plot: V. cholerae Coordinates (x-axis, 0 to 3,000,000) vs. E. coli Coordinates (y-axis, 0 to 5,000,000)]
Eisen et al., 2000
31. V. cholerae vs. E. coli Best
[Dot plot: V. cholerae Coordinates (x-axis, 0 to 3,000,000) vs. E. coli Coordinates (y-axis, 0 to 5,000,000)]
Eisen et al., 2000
32. V. cholerae vs. E. coli, Rotated
[Dot plot: V. cholerae ORF Coordinates (x-axis, 0 to 3,000,000) vs. E. coli ORF Coordinates (y-axis, 0 to 5,000,000)]
Eisen et al., 2000
33. Duplication and Gene Loss Model
Eisen et al., 2000
34. V. cholerae vs. E. coli
Orthologs on Both Diagonals
[Dot plot: V. cholerae ORF Coordinates (x-axis, 0 to 3,000,000) vs. E. coli ORF Coordinates (y-axis, 0 to 5,000,000)]
Eisen et al., 2000
35. C. trachomatis vs C. pneumoniae
[Diagram: C. pneumoniae AR39 compared with C. trachomatis MoPn, with Origin and Terminus marked]
37. The X-Files
[Six dot-plot panels: Streps; Pseudomonas; B. subt vs. Staph; M. tb vs. M. leprae (axes labeled Mycobacterium tuberculosis and Mycobacterium leprae); Pyrococcus; Thermoplasmas]
38. X-alignments (Figure 7.14)
[Figure 7.14: (A) schematic of circular genomes carrying hypothetical genes 1–32, starting from the common ancestor of A and B and followed through time points A1, A2, A3 and B1, B2, B3, with inversions around the terminus (*) and around the origin (*), plus the corresponding scatterplots; (B) dot plot of V. cholerae chromosome I vs. V. parahaemolyticus chromosome I; (C) dot plot of V. cholerae chromosome I vs. Escherichia coli.]
FIGURE 7.14. X-alignments. (A) Schematic model of symmetric genome inversions. The model shows an initial speciation event, followed by a series of inversions in the different lineages (A and B). Inversions occur between the asterisks (*). Numbers on the chromosome refer to hypothetical genes 1–32. At time point 1, the genomes of the two species are still colinear (as indicated in the scatterplot of A1 vs. B1). Between time point 1 and time point 2, each species (A and B) undergoes a large inversion about the terminus (as indicated in the scatterplots of A1 vs. A2 and B1 vs. B2). This results in the between-species scatterplot looking as if there have been two nested inversions (A2 vs. B2). Between time point 2 and time point 3, each species undergoes an additional inversion (as indicated in the scatterplots of B2 vs. B3 and A2 vs. A3). This results in the between-species scatterplots beginning to resemble an X-alignment. (B) X-like alignment in dotplot of the main chromosomes of Vibrio cholerae (x-axis) and Vibrio parahaemolyticus (y-axis). (C) A weak X-like pattern exists even when comparing more distantly related species, in this case V. cholerae and E. coli. An X-like pattern indicates that the distance of a gene from the origin is conserved, but the side of the origin on which it is located is not conserved.
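The effect described in Figure 7.14 can be reproduced with a toy simulation. This is a simplified sketch, not the model from the figure: the genome is a list of 32 hypothetical genes read clockwise from the origin, every inversion is perfectly symmetric, and the function names and inversion sizes are assumptions chosen for illustration.

```python
def invert_around_origin(genome, k):
    """Invert the 2k genes centered on the origin (the list's wrap-around point)."""
    n = len(genome)
    segment = (genome[n - k:] + genome[:k])[::-1]
    return segment[k:] + genome[k:n - k] + segment[:k]


def invert_around_terminus(genome, k):
    """Invert the 2k genes centered on the terminus (the middle of the list)."""
    mid = len(genome) // 2
    return genome[:mid - k] + genome[mid - k:mid + k][::-1] + genome[mid + k:]


def origin_distance(genome):
    """Map each gene to its distance (in genes) from the origin, on either side."""
    n = len(genome)
    return {gene: min(i, n - 1 - i) for i, gene in enumerate(genome)}


def dotplot(genome_a, genome_b):
    """Return (index in A, index in B) for every gene, as in a scatterplot."""
    index_b = {gene: i for i, gene in enumerate(genome_b)}
    return [(i, index_b[gene]) for i, gene in enumerate(genome_a)]


ancestor = list(range(1, 33))  # hypothetical genes 1..32
n = len(ancestor)

# Each lineage undergoes its own symmetric inversions (sizes are arbitrary).
lineage_a = invert_around_origin(invert_around_terminus(ancestor, 10), 3)
lineage_b = invert_around_terminus(invert_around_origin(ancestor, 12), 2)

points = dotplot(lineage_a, lineage_b)
# Every point lies on the main diagonal (y == x) or the anti-diagonal (y == n - 1 - x).
on_x = all(y == x or y == n - 1 - x for x, y in points)
print(on_x)
```

Because each symmetric inversion moves a gene to the mirror-image position on the other replichore, distance from the origin never changes, and the between-lineage dot plot can only populate the two diagonals of an X.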
39. Gene Loss
[Figure 7.5: a fragment of two genomes drawn to a 10-kb scale bar, top row labeled Ancestor and bottom row labeled Buchnera. Gene labels as extracted: bolA, tig, clpP, clpX, hupB, ybaU, ybaX, ybaW, ybaV, ybaE, ybaO, cof, mdlA, mdlB, amtB, tesB, glnK, RNA-ffs, ybaZ, ybaY, acrB, acrR, acrA, aefA, recR, ybaB, dnaX, apt, ybaN, priC, ybaM, htpG, adk.]
FIGURE 7.5. Genome reduction in Buchnera endosymbionts of aphids. A fragment of two genomes is shown. (Top row) The putative ancestor of all aphid endosymbionts in the Buchnera genus. (Bottom row) The genome of the symbionts today. The massive amounts of gene loss are indicated by the genes colored white in the ancestral genome that are missing from the modern genome below. Orthologous genes between the two genomes are shown in the same color. Note the conservation of gene order between the two genomes despite the gene loss. The direction of gene transcription is indicated by the gene box being shifted above or below the black line.
For example, B. aphidicola APS has undergone a massive reduction in its genome since it shared a common ancestor with E. coli (Fig. 7.5). This symbiont lives inside aphid cells where many genes required for the free-living lifestyle of E. coli are not needed.
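The two observations in Figure 7.5, gene loss and conserved gene order, can be checked with a simple comparison of gene lists. This is a minimal sketch under stated assumptions: each genome fragment is an ordered list of gene ids, the function name `gene_loss_summary` is hypothetical, and the placeholder ids `g1` to `g8` are invented rather than taken from the figure.

```python
def gene_loss_summary(ancestor, descendant):
    """Compare an ancestral gene order with a reduced descendant.

    ancestor, descendant: lists of gene ids in chromosomal order.
    Returns (lost, order_conserved):
      lost            - ancestral genes absent from the descendant
      order_conserved - True if the descendant is exactly the ancestral
                        order with the lost genes removed
    """
    retained = set(descendant)
    lost = [g for g in ancestor if g not in retained]
    expected_order = [g for g in ancestor if g in retained]
    order_conserved = expected_order == list(descendant)
    return lost, order_conserved


# Placeholder gene ids; which real genes were retained is not assumed here.
ancestral_fragment = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
reduced_fragment = ["g1", "g4", "g5", "g8"]

lost, order_conserved = gene_loss_summary(ancestral_fragment, reduced_fragment)
print(lost)
print(order_conserved)
```

A descendant that had been rearranged as well as reduced would return `False` for the second value, which is the pattern the figure notes is not seen in this fragment.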
41. Why Duplications Are Useful to Identify
• Allows division into orthologs and paralogs
• Improves functional predictions
• Helps identify mechanisms of duplication
• Can be used to study mutation processes in different parts of a genome
• Lineage-specific duplications may be indicative of species-specific adaptations
43. C. pneumoniae - All Paralogs
[Dot plot: Query Orf Position (x-axis, 0 to 1,250,000) vs. Subject Orf Position (y-axis, 0 to 1,250,000)]
44. C. pneumoniae Lineage-Specific Paralogs
[Dot plot: Query Orf Position (x-axis, 0 to 1,250,000) vs. Subject Orf Position (y-axis, 0 to 1,250,000)]
46. After the Genomes
• Better analysis and annotation
• Comparative genomics
• Functional genomics (Experimental analysis of gene
function on a genome scale)
• Genome-wide gene expression studies
• Proteomics
• Genome-wide genetic experiments