EVE 161: Microbial Phylogenomics

Lecture #10: Era III: Genome Sequencing

UC Davis, Winter 2014
Instructor: Jonathan Eisen

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Where we are going and where we have been

• Previous lecture:
– 9: rRNA Case Study - Built Environment
• Current lecture:
– 10: Genome Sequencing
• Next lecture:
– 11: Genome Sequencing II

1st Genome Sequence


Fleischma

Figure 1 Diagram depicting the steps in a whole-genome shotgun sequencing project.

1. Library construction
(i) Isolate DNA
(ii) Fragment DNA
(iii) Clone DNA
2. Random sequencing phase
(i) Sequence DNA (15,000 sequences per Mb)
3. Closure phase
(i) Assemble sequences
(ii) Close gaps
(iii) Edit
(iv) Annotation
4. Complete genome sequence
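The figure quotes 15,000 sequences per Mb for the random phase. A rough way to see why a separate closure phase is still needed is the standard Poisson estimate of coverage. The sketch below is only illustrative: the 500 bp read length is an assumption (the slides do not state one), and the function names are made up for this example.

```python
import math

def shotgun_coverage(reads_per_mb: float, read_length_bp: float) -> float:
    """Average fold coverage c = N * L / G, taking G = 1 Mb."""
    return reads_per_mb * read_length_bp / 1_000_000

def fraction_unsequenced(coverage: float) -> float:
    """Poisson estimate of the fraction of bases covered by no read: e^(-c)."""
    return math.exp(-coverage)

if __name__ == "__main__":
    # 500 bp is an assumed read length, not a value taken from the slides.
    c = shotgun_coverage(15_000, 500)
    print(f"average coverage: {c:.1f}x")
    print(f"expected fraction left unsequenced: {fraction_unsequenced(c):.2e}")
```

Even at several-fold coverage the expected uncovered fraction is small but not zero, which is consistent with the figure listing "Close gaps" as its own step.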

… analysis of the genomes of two thermophilic bacterial species, Aquifex aeolicus and Thermotoga maritima, revealed that 20–25% of the genes in these species were more similar to genes from archaea than those from bacteria [13,14]. This led to the suggestion of possible extensive gene exchanges between these species and archaeal …

… be extensive, it is somehow constrained by phylogenetic relationships. Other evidence for a ‘core’ of particular lineages comes from the finding of a conserved core of euryarchaeal genomes [21,22] and another finding that some types of gene might be more prone to gene transfer than others [23]. It therefore seems likely that horizontal gene …
Complete Genome/Chromosome Progress

From http://genomesonline.org
TIGR

Why Completeness is Important
• Improves characterization of genome features
– Gene order, replication origins
• Better comparative genomics
– Genome duplications, inversions
• Presence and absence of particular genes can be very important
• Missing sequence might be important (e.g., centromere)
• Allows researchers to focus on biology, not sequencing
• Facilitates large-scale correlation studies

General Steps in Analysis of Complete Genomes
• Identification/prediction of genes
• Characterization of gene features
• Characterization of genome features
• Prediction of gene function
• Prediction of pathways
• Integration with known biological data
• Comparative genomics

General Steps in Analysis of Complete Genomes
• Structural Annotation
– Identification/prediction of genes
– Characterization of gene features
– Characterization of genome features
• Functional Annotation
– Prediction of gene function
– Prediction of pathways
– Integration with known biological data
• Evolutionary Annotation
– Comparative genomics

Structural Annotation I: Genes in Genomes
• Protein-coding genes
– In long open reading frames (ORFs)
– ORFs interrupted by introns in eukaryotes
– Take up most of the genome in prokaryotes, but only a small portion of the eukaryotic genome
• RNA-only genes
– Transfer RNA
– Ribosomal RNA
– snoRNAs (guide ribosomal and transfer RNA maturation)
– Intron splicing
– Guiding mRNAs to the membrane for translation
– Gene regulation (this is a growing list)
Structural Annotation II: Other Features to Find
• Gene control sequences
– Promoters
– Regulatory elements
• Transposable elements, both active and defective
– DNA transposons and retrotransposons
– Many types and sizes
• Other repeated sequences
– Centromeres and telomeres
– Many with unknown (or no) function
• Unique sequences that have no obvious function

How to Find ncRNAs
• The most universal genes, such as tRNA and rRNA, are highly conserved and thus easy to detect. Finding them first removes some areas of the genome from further consideration.
• One easy approach to finding common RNA genes is to look for sequence homology with related species: a BLAST search will find most of them quite easily.
• Functional RNAs are characterized by secondary structure caused by base pairing within the molecule.
• Determining the folding pattern is a matter of testing many possibilities to find the one with the minimum free energy, which is the most stable structure.
• The free energy calculations are in turn based on experiments in which short synthetic RNA molecules are melted.
• Related to this is the concept that paired regions (stems) will be conserved across species lines even if the individual bases aren’t conserved. That is, if there is an A-U pairing in one species, the same position might be occupied by a G-C in another species.
• This is an example of compensatory evolution (covariation): a deleterious mutation at one site is cancelled by a compensating mutation at another site.
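The idea that a stem stays paired while its bases change can be checked directly on an alignment. The following is a minimal sketch, not any particular tool's method: the function name and the three toy sequences are invented for illustration, and the sequences are assumed to be RNA, already aligned, and of equal length.

```python
# Watson-Crick pairs plus the G-U wobble pair (RNA alphabet).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def covarying_columns(alignment: list[str], i: int, j: int) -> bool:
    """True if columns i and j can pair in every sequence AND the pair type varies.

    A column pair that is always G-C is conserved but shows no covariation;
    a column pair that is A-U in one species and G-C in another does.
    """
    observed = {(seq[i], seq[j]) for seq in alignment}
    return observed <= PAIRS and len(observed) > 1

if __name__ == "__main__":
    # Hypothetical aligned sequences from three species.
    toy_alignment = ["GACGAAAGUC", "GGCGAAAGCC", "GUCGAAAGAC"]
    print(covarying_columns(toy_alignment, 1, 8))  # pair changes A-U / G-C / U-A
    print(covarying_columns(toy_alignment, 0, 9))  # always G-C: paired, but no covariation
```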
RNA Structure
• RNA differs from DNA in having fairly common G-U base pairs. Also, many functional RNAs have unusual modified bases such as pseudouridine and inosine.
• The pseudoknot, pairing between a loop and a sequence outside its stem, is especially difficult to detect: it is computationally intensive and breaks the usual rule that RNA base pairing follows a nested pattern.
– But pseudoknots seem to be fairly rare.
• Essentially, RNA folding programs start with all possible short sequences, then build to larger ones, adding the contribution of each structural element.
– There is an element of dynamic programming here as well.
– And “stochastic context-free grammars”, something I really don’t want to approach right now!
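The "start short, build to larger" idea can be sketched with a Nussinov-style dynamic program. Note the simplification: this version maximizes the number of nested base pairs instead of minimizing free energy, so it is a stand-in for the real energy models, and it cannot represent pseudoknots. The function names and the minimum loop size of 3 are assumptions for the example.

```python
def can_pair(a: str, b: str) -> bool:
    """Watson-Crick pairs plus the G-U wobble pair."""
    return (a, b) in {("A", "U"), ("U", "A"), ("G", "C"),
                      ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    best[i][j] holds the answer for the subsequence seq[i..j]; short spans
    are filled first and longer spans reuse them.
    """
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j is unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0

if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))  # a small hairpin: three stacked pairs
```

Energy-based folding uses the same table-filling structure, with scores replaced by the measured contributions of stacks, loops, and bulges.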

Finding tRNAs

• tRNAs have a highly conserved structure, with 3 main stem-and-loop structures that form a cloverleaf, and several conserved bases. Finding such sequences is a matter of looking in the DNA for the proper features located the proper distance apart.
• Looking for such sequences is well-suited to a decision tree, a series of steps that the sequence must pass.
• In addition, a score is kept, rating how well the sequence passed each step. This allows a more stringent analysis later on, to eliminate false positives.

Bacterial / Archaeal Protein Coding Genes
• Bacteria use ATG as their main start codon, but GTG and TTG are also fairly common, and a few others are occasionally used.
– Remember that start codons are also used internally: the actual start codon may not be the first one in the ORF.
• The stop codons are the same as in eukaryotes: TGA, TAA, TAG.
– Stop codons are (almost) absolute: except for a few cases of programmed frameshifts and the use of TGA for selenocysteine, the stop codon at the end of an ORF is the end of protein translation.
• Genes can overlap by a small amount. Not much, but a few codons of overlap is common enough that you can’t just eliminate overlaps as impossible.
• Cross-species homology works well for many genes. It is very unlikely that non-coding sequence will be conserved.
– But a significant minority of genes (say 20%) are unique to a given species.
• Translation start signals (ribosome binding sites; Shine-Dalgarno sequences) are often found just upstream from the start codon.
– However, some aren’t recognizable.
– Genes in operons don’t always have a separate ribosome binding site for each gene.
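The start and stop codon rules above are enough for a naive ORF scan. This is a minimal sketch under stated assumptions, not a gene finder: it looks at the forward strand only, always takes the first start codon in frame (which, as noted above, may not be the real one), and ignores overlaps, ribosome binding sites, and homology. The function name and `min_codons` default are choices made for this example.

```python
STARTS = {"ATG", "GTG", "TTG"}   # main bacterial start codons from the slide
STOPS = {"TGA", "TAA", "TAG"}

def find_orfs(dna: str, min_codons: int = 30) -> list[tuple[int, int]]:
    """Return (start, end) for forward-strand ORFs.

    Coordinates are 0-based, end is exclusive and includes the stop codon.
    min_codons counts codons from the start codon up to (not including) the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon in STARTS:
                start = pos                      # first in-frame start wins
            elif start is not None and codon in STOPS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None                     # look for the next ORF
    return orfs

if __name__ == "__main__":
    toy = "CCATGAAAAAAAAATAAGG"   # hypothetical sequence with one tiny ORF
    print(find_orfs(toy, min_codons=3))
```

A real pipeline would also scan the reverse complement and then rank candidates using the composition and homology evidence discussed on the following slides.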

Composition Methods
• The frequency of various codons is different in coding regions as compared to non-coding regions.
– This extends to G-C content, dinucleotide frequencies, and other measures of composition. Dicodons (groups of 6 bases) are often used.
– Well documented experimentally.
• The composition varies between different proteins of course, and it is affected within a species by the amounts of the various tRNAs present.
– Horizontally transferred genes can also confuse things: they tend to have compositions that reflect their original species.
– A second group with unusual compositions is highly expressed genes.
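One simple way to turn composition differences into a score is a log-odds table: how much more often does each codon appear in known coding sequence than in non-coding sequence? The sketch below uses single codons for brevity (the same idea applies to dicodons with 6-base words). The function names, the pseudocount, and the toy training sequences are all assumptions for illustration.

```python
import math
from collections import Counter

def codon_counts(seqs: list[str]) -> Counter:
    """Count non-overlapping in-frame triplets across a list of sequences."""
    counts: Counter = Counter()
    for s in seqs:
        for i in range(0, len(s) - 2, 3):
            counts[s[i:i + 3]] += 1
    return counts

def log_odds_table(coding: list[str], noncoding: list[str],
                   pseudo: float = 1.0) -> dict[str, float]:
    """log( P(codon | coding) / P(codon | non-coding) ) for all 64 codons."""
    c, n = codon_counts(coding), codon_counts(noncoding)
    codons = [a + b + d for a in "ACGT" for b in "ACGT" for d in "ACGT"]
    c_total = sum(c[k] for k in codons) + pseudo * 64
    n_total = sum(n[k] for k in codons) + pseudo * 64
    return {k: math.log(((c[k] + pseudo) / c_total) / ((n[k] + pseudo) / n_total))
            for k in codons}

def coding_score(seq: str, table: dict[str, float]) -> float:
    """Sum of log-odds over the in-frame codons of seq; > 0 suggests coding."""
    return sum(table.get(seq[i:i + 3], 0.0) for i in range(0, len(seq) - 2, 3))

if __name__ == "__main__":
    # Toy training data, invented for the example.
    table = log_odds_table(coding=["GCTGCTGAA" * 2], noncoding=["ATATATTTA" * 2])
    print(coding_score("GCTGAA", table), coding_score("ATATAT", table))
```

Because the table is learned from one genome's genes, horizontally transferred or highly expressed genes can score poorly, which matches the caveats listed above.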

Eukaryotic Genes Harder to Find
• Some fundamental differences between prokaryotes and eukaryotes:
• There is lots of non-coding DNA in eukaryotes.
– First step: find repeated sequences and RNA genes.
– Note that eukaryotes have 3 main RNA polymerases. RNA polymerase 2 (pol2) transcribes all protein-coding genes, while pol1 and pol3 transcribe various RNA-only genes.
• Most eukaryotic genes are split into exons and introns.
• Only 1 gene per transcript in eukaryotes.
• No ribosome binding sites: translation starts at the first ATG in the mRNA.
– Thus, in eukaryotic genomes, searching for the transcription start site (TSS) makes sense.
• Many fewer eukaryotic genomes have been sequenced.

Exons

• Exon sequences can often be identified by sequence conservation, at least roughly.
• Dicodon statistics, as used for prokaryotes, are also useful.
– Eukaryotic genomes tend to contain many isochores, regions of different GC content, and composition statistics can vary between isochores.
• The initial and terminal exons contain untranslated regions, and thus special methods are needed to detect them.
• Predicting splice junctions is a matter of collecting information about the sequences surrounding each possible GT/AG pair (introns almost always begin with GT and end with AG), then running this information through some combination of decision trees, Markov models, discriminant analysis, or neural networks, in an attempt to massage the data into giving a reliable score.
– In general, sites are more likely to be correct if predicted by multiple methods.
– Experimental data from ESTs can be very helpful here.
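The first step of splice-site prediction, listing every candidate donor/acceptor pair, is easy to sketch. This assumes the canonical pattern in which an intron begins with GT and ends with AG; the function name and the length limits are arbitrary choices for the example, and no scoring model is included.

```python
def candidate_introns(seq: str, min_len: int = 20,
                      max_len: int = 200) -> list[tuple[int, int]]:
    """Enumerate (start, end) spans that begin with GT and end with AG.

    Coordinates are 0-based with an exclusive end, so seq[start:end] is the
    putative intron. Every span returned is only a candidate: a real
    predictor would score the sequence context around each site.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(d, a + 2) for d in donors for a in acceptors
            if min_len <= a + 2 - d <= max_len]

if __name__ == "__main__":
    toy = "AAGTCCCCAGTT"  # hypothetical sequence
    for start, end in candidate_introns(toy, min_len=4, max_len=10):
        print(start, end, toy[start:end])
```

The number of candidates grows quickly with sequence length, which is why the scoring step and EST evidence mentioned above matter so much.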

Functional Annotation

Functional Classification I: GO
• The Gene Ontology (GO) consortium (http://www.geneontology.org/) is an attempt to describe gene products with a structured controlled vocabulary, a set of invariant terms that have a known relationship to each other.
• Each GO term is given a number of the form GO:nnnnnnn (7 digits), as well as a term name. For example, GO:0005102 is “receptor binding”.
• There are 3 root terms: biological process, cellular component, and molecular function. A gene product will probably be described by GO terms from each of these “ontologies”. (Ontology is a branch of philosophy concerned with the nature of being, and the basic categories of being and their relationships.)
– For instance, cytochrome c is described with the molecular function term “oxidoreductase activity”, the biological process terms “oxidative phosphorylation” and “induction of cell death”, and the cellular component terms “mitochondrial matrix” and “mitochondrial inner membrane”.
• The terms are arranged in a hierarchy that is a “directed acyclic graph” and not a tree. This means simply that each term can have more than one parent term, but the direction of parent to child (i.e. less specific to more specific) is always maintained.
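The practical consequence of a directed acyclic graph is that collecting a term's ancestors means following several parent links, not one path to the root. The sketch below uses a made-up four-term ontology with "TOY:" identifiers; these are not real GO terms or relationships, and the function name is chosen for the example.

```python
def ancestors(term: str, parents: dict[str, list[str]]) -> set[str]:
    """All terms reachable by repeatedly following child -> parent links."""
    seen: set[str] = set()
    stack = list(parents.get(term, []))
    while stack:
        t = stack.pop()
        if t not in seen:          # a term may be reached via several paths
            seen.add(t)
            stack.extend(parents.get(t, []))
    return seen

# Hypothetical ontology: TOY:4 has two parents, so this is a DAG, not a tree.
TOY_PARENTS = {
    "TOY:1": [],                   # root (least specific)
    "TOY:2": ["TOY:1"],
    "TOY:3": ["TOY:1"],
    "TOY:4": ["TOY:2", "TOY:3"],   # most specific
}

if __name__ == "__main__":
    print(sorted(ancestors("TOY:4", TOY_PARENTS)))
```

Annotating a gene product with a specific term therefore implies all of that term's ancestors as well, which is how annotations at different levels of detail can be compared.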

Functional Classification II: Enzyme Nomenclature
• Enzyme functions: which reactants are converted to which products
• Enzyme functions are given unique numbers by the Enzyme Commission.
– Across many species, the enzymes that perform a specific function are usually evolutionarily related. However, this isn’t necessarily true. There are cases of two entirely different enzymes evolving similar functions.
– Often, two or more gene products in a genome will have the same E.C. number.
– E.C. numbers are four integers separated by dots. The left-most number is the least specific.
– For example, the tripeptide aminopeptidases have the code "EC 3.4.11.4", whose components indicate the following groups of enzymes:
  • EC 3 enzymes are hydrolases (enzymes that use water to break up some other molecule)
  • EC 3.4 are hydrolases that act on peptide bonds
  • EC 3.4.11 are those hydrolases that cleave off the amino-terminal amino acid from a polypeptide
  • EC 3.4.11.4 are those that cleave off the amino-terminal end from a tripeptide
• Top level E.C. numbers:
– E.C. 1: oxidoreductases (often dehydrogenases): electron transfer
– E.C. 2: transferases: transfer of functional groups (e.g. phosphate) between molecules.
– E.C. 3: hydrolases: splitting a molecule by adding water to a bond.
– E.C. 4: lyases: non-hydrolytic addition or removal of groups from a molecule
– E.C. 5: isomerases: rearrangements of atoms within a molecule
– E.C. 6: ligases: joining two molecules using energy from ATP
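Because each field narrows the one to its left, an E.C. number can be expanded into its chain of groups, and two enzymes can be compared by how many leading fields they share. The sketch below is illustrative only: the function names are invented, and it accepts only complete four-integer numbers such as "EC 3.4.11.4" (partial or provisional numbers are rejected).

```python
# Top-level classes as listed on the slide.
EC_CLASSES = {
    1: "oxidoreductases", 2: "transferases", 3: "hydrolases",
    4: "lyases", 5: "isomerases", 6: "ligases",
}

def ec_lineage(ec: str) -> list[str]:
    """'EC 3.4.11.4' -> ['3', '3.4', '3.4.11', '3.4.11.4'] (least to most specific)."""
    parts = ec.replace("EC", "").strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a complete EC number: {ec!r}")
    return [".".join(parts[:i]) for i in range(1, 5)]

def ec_class(ec: str) -> str:
    """Name of the top-level class, e.g. 'hydrolases' for EC 3.x.x.x."""
    return EC_CLASSES[int(ec_lineage(ec)[0])]

def shared_levels(a: str, b: str) -> int:
    """Number of leading fields two EC numbers have in common (0 to 4)."""
    n = 0
    for x, y in zip(ec_lineage(a), ec_lineage(b)):
        if x != y:
            break
        n += 1
    return n

if __name__ == "__main__":
    print(ec_lineage("EC 3.4.11.4"))
    print(ec_class("EC 3.4.11.4"))
    print(shared_levels("3.4.11.4", "3.4.21.1"))  # both act on peptide bonds
```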

Functional Prediction

• BLAST searches
• HMM models of specific genes or gene families (Pfam, TIGRfam, FIGfam).
• Sequence motifs and domains. If the gene is not a good match to previously known genes, these provide useful clues.
• Cellular location predictions, especially for transmembrane proteins.
• Genomic neighbors, especially in bacteria, where related functions are often found together in operons and divergons (genes transcribed in opposite directions that use a common control region).
• Biochemical pathway/subsystem information. If an organism has most of the genes needed to perform a function, any missing functions are probably present too.
– Also, experimental data about an organism’s capacities can be used to decide whether the relevant functions are present in the genome.

Functional Prediction II: Membrane Spanning
• Integral membrane proteins contain amino acid sequences that go through the membrane one or several times.
– There are also peripheral membrane proteins that stick to the hydrophilic head groups by ionic and polar interactions.
– There are also some that have covalently bound hydrophobic groups, such as myristate, a 14-carbon saturated fatty acid that is attached to the N-terminal amino group.
• There are 2 main protein structures that cross membranes.
– Most are alpha helices, and in proteins that span multiple times, these alpha helices are packed together in a coiled-coil. Length = 15-30 amino acids.
– Less commonly, there are proteins with membrane-spanning “beta barrels”, composed of beta sheets wrapped into a cylinder. An example: porins, which form water-filled channels that let small hydrophilic molecules cross the membrane.
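Membrane-spanning alpha helices are usually spotted as stretches of hydrophobic residues. A minimal sketch of that idea is a sliding-window average over the Kyte-Doolittle hydropathy scale. The window of 19 residues and the cutoff of 1.6 are commonly cited defaults for that scale, used here as assumptions; the function names and the toy protein are invented. This approach does not detect beta barrels.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_windows(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every window; raises KeyError on non-standard residues."""
    scores = [KD[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def candidate_tm_windows(protein: str, window: int = 19,
                         threshold: float = 1.6) -> list[int]:
    """Start positions (0-based) of windows hydrophobic enough to span a membrane."""
    return [i for i, s in enumerate(hydropathy_windows(protein, window))
            if s >= threshold]

if __name__ == "__main__":
    # Hypothetical protein: a 19-residue leucine stretch flanked by charged residues.
    toy = "D" * 10 + "L" * 19 + "K" * 10
    print(candidate_tm_windows(toy))
```

Adjacent passing windows should be merged into one predicted segment before counting how many times a protein crosses the membrane.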

Functional Prediction by Phylogeny
• Key step in genome projects
• More accurate predictions help guide experimental and computational analyses
• Many diverse approaches
• All are improved by “phylogenomic” analyses that integrate evolutionary reconstructions with an understanding of how new functions evolve

Functional Prediction
• Identification of motifs
– Short regions of sequence similarity that are indicative of general activity
– e.g., ATP binding
• Homology/similarity based methods
– Gene sequence is searched against a database of other sequences
– If significantly similar genes are found, their functional information is used
• Problem
– Genes frequently have similarity to hundreds of motifs and multiple genes, not all with the same function
Helicobacter pylori

H. pylori genome - 1997

“The ability of H. pylori to perform mismatch repair is suggested by the presence of methyl transferases, mutS and uvrD. However, orthologues of MutH and MutL were not identified.”

MutL ??

From http://asajj.roswellpark.org/huberman/dna_repair/mmr.html
Phylogenetic Tree of MutS Family
[Figure: phylogenetic tree of MutS family proteins; tips are labeled with species abbreviations (e.g., Ecoli, Bacsu, Helpy, Yeast, Human, Arath).]

Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
MutS Subfamilies
[Figure: MutS family tree with subfamilies labeled: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6.]

Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
Overlaying Functions onto Tree
[Figure: MutS family tree with subfamily labels (MutS1, MutS2, MSH1 to MSH6), onto which known functions are overlaid.]

Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
MutS Subfamilies
• MutS1: Bacterial MMR
• MSH1: Euk - mitochondrial MMR
• MSH2: Euk - all MMR in nucleus
• MSH3: Euk - loop MMR in nucleus
• MSH6: Euk - base:base MMR in nucleus

• MutS2: Bacterial - function unknown
• MSH4: Euk - meiotic crossing-over
• MSH5: Euk - meiotic crossing-over


TIGR

Functional Prediction Using Tree
[Figure: MutS family tree with each subfamily annotated by function.]
• MutS1 - Bacterial Mismatch and Loop Repair
• MutS2 - Unknown Functions
• MSH1 - Mitochondrial Repair
• MSH2 - Eukaryotic Nuclear Mismatch and Loop Repair
• MSH3 - Nuclear Repair of Loops
• MSH4 - Meiotic Crossing Over
• MSH5 - Meiotic Crossing Over
• MSH6 - Nuclear Repair of Mismatches

Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
Table 3. Presence of MutS Homologs in Complete Genome Sequences

Species | # of MutS Homologs | Which Subfamilies? | MutL Homologs

Bacteria
Escherichia coli K12 | 1 | MutS1 | 1
Haemophilus influenzae Rd KW20 | 1 | MutS1 | 1
Neisseria gonorrhoeae | 1 | MutS1 | 1
Helicobacter pylori 26695 | 1 | MutS2 | -
Mycoplasma genitalium G-37 | - | - | -
Mycoplasma pneumoniae M129 | - | - | -
Bacillus subtilis 168 | 2 | MutS1,MutS2 | 1
Streptococcus pyogenes | 2 | MutS1,MutS2 | 1
Mycobacterium tuberculosis | - | - | -
Synechocystis sp. PCC6803 | 2 | MutS1,MutS2 | 1
Treponema pallidum Nichols | 1 | MutS1 | 1
Borrelia burgdorferi B31 | 2 | MutS1,MutS2 | 1
Aquifex aeolicus | 2 | MutS1,MutS2 | 1
Deinococcus radiodurans R1 | 2 | MutS1,MutS2 | 1

Archaea
Archaeoglobus fulgidus VC-16, DSM4304 | - | - | -
Methanococcus jannaschii DSM 2661 | - | - | -
Methanobacterium thermoautotrophicum ΔH | 1 | MutS2 | -

Eukaryotes
Saccharomyces cerevisiae | 6 | MSH1-6 | 3+
Homo sapiens | 5 | MSH2-6 | 3+

TIGR

Blast Search of H. pylori “MutS”

Sequences producing significant alignments:          Score (bits)   E Value
sp|P73625|MUTS_SYNY3  DNA MISMATCH REPAIR PROTEIN    117            3e-25
sp|P74926|MUTS_THEMA  DNA MISMATCH REPAIR PROTEIN    69             1e-10
sp|P44834|MUTS_HAEIN  DNA MISMATCH REPAIR PROTEIN    64             3e-09
sp|P10339|MUTS_SALTY  DNA MISMATCH REPAIR PROTEIN    62             2e-08
sp|O66652|MUTS_AQUAE  DNA MISMATCH REPAIR PROTEIN    57             4e-07
sp|P23909|MUTS_ECOLI  DNA MISMATCH REPAIR PROTEIN    57             4e-07

• Blast search pulls up Syn. sp MutS#2 with a much more significant score (lower E value) than other MutS homologs
• Based on this TIGR predicted this species had mismatch repair

Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
High Mutation Rate in H. pylori

Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.

Phylogenomics
PHYLOGENETIC PREDICTION OF GENE FUNCTION

[Figure: flowchart of the method, illustrated with two worked examples (Example A: genes 1A-3A and 1B-3B from Species 1-3; Example B: genes 1-6).]

METHOD
1. Choose gene(s) of interest
2. Identify homologs
3. Align sequences
4. Calculate gene tree (noting possible duplications)
5. Overlay known functions onto tree
6. Infer likely function of gene(s) of interest (the result may be ambiguous)

Final panel: actual evolution (assumed to be unknown), with the duplication marked.

Based on Eisen, 1998 Genome Res 8: 163-167.
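The last two steps of the method, overlaying known functions onto the tree and inferring the function of the gene of interest, can be sketched on a toy tree. This is one simple interpretation, not the published procedure: the tree is a nested tuple of gene names, the function labels are placeholders, and the rule used here is "take the smallest clade containing the query that has labeled relatives, and predict only if those labels agree".

```python
def leaves(tree) -> list[str]:
    """Flatten a nested-tuple tree into its leaf names."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def predict_function(tree, query: str, known: dict[str, str]):
    """Function shared by the query's closest labeled relatives, else None.

    None means either no labeled relatives were found or they disagree
    (the 'ambiguous' outcome in the figure).
    """
    if isinstance(tree, str) or query not in leaves(tree):
        return None
    # Try the smaller subtree containing the query first.
    for child in tree:
        if query in leaves(child):
            deeper = predict_function(child, query, known)
            if deeper is not None:
                return deeper
    labels = {known[name] for name in leaves(tree)
              if name != query and name in known}
    return labels.pop() if len(labels) == 1 else None

if __name__ == "__main__":
    # Toy gene tree after a duplication: an "A" clade and a "B" clade.
    gene_tree = (("1A", ("2A", "3A")), ("1B", ("2B", "3B")))
    known = {"1A": "function X", "3A": "function X",
             "1B": "function Y", "3B": "function Y"}   # placeholder labels
    print(predict_function(gene_tree, "2A", known))
    print(predict_function(gene_tree, "2B", known))
```

The point of the design is that the prediction follows the gene's position in the tree instead of its single best database hit, which is what separates this approach from a plain similarity search.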
Chemosynthetic Symbionts

Eisen et al. 1992. J. Bact. 174: 3416
Carboxydothermus hydrogenoformans
• Isolated from a Russian hot spring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows very efficiently on CO (carbon monoxide)
• Produces hydrogen gas
• Low GC Gram positive (Firmicute)
• Genome determined (Wu et al. 2005 PLoS Genetics 1: e65.)
Homologs of Sporulation Genes

Wu et al. 2005 PLoS Genetics 1: e65.
Carboxydothermus sporulates

Wu et al. 2005 PLoS Genetics 1: e65.
Non-Homology Predictions:
Phylogenetic Profiling

• Step 1: Search all genes in organisms of interest against all other genomes
• Step 2: Ask, yes or no, is each gene found in each other species?
• Step 3: Cluster genes by distribution patterns (profiles)
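These three steps map onto a small data structure: a presence/absence vector per gene, then a grouping of genes whose vectors match. The sketch below assumes the search step has already been done and its results are given as a set of genomes per gene; the gene and genome names are invented, and grouping by identical profile is the simplest possible form of clustering (real analyses typically use a distance between profiles).

```python
from collections import defaultdict

def build_profiles(hits: dict[str, set[str]],
                   genomes: list[str]) -> dict[str, tuple[int, ...]]:
    """Turn 'gene -> genomes with a homolog' into 0/1 vectors in a fixed genome order."""
    return {gene: tuple(int(g in present) for g in genomes)
            for gene, present in hits.items()}

def group_by_profile(profiles: dict[str, tuple[int, ...]]) -> dict[tuple[int, ...], list[str]]:
    """Group genes that share exactly the same presence/absence pattern."""
    groups: dict[tuple[int, ...], list[str]] = defaultdict(list)
    for gene, profile in profiles.items():
        groups[profile].append(gene)
    return dict(groups)

if __name__ == "__main__":
    genomes = ["sp1", "sp2", "sp3", "sp4"]            # hypothetical genomes
    hits = {                                          # hypothetical search results
        "geneA": {"sp1", "sp3"},
        "geneB": {"sp1", "sp3"},
        "geneC": {"sp1", "sp2", "sp3", "sp4"},
    }
    for profile, genes in group_by_profile(build_profiles(hits, genomes)).items():
        print(profile, genes)
```

Genes that fall in the same group as known members of a process become candidates for that process even without any sequence similarity to them.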

Sporulation Gene Profile

Wu et al. 2005 PLoS Genetics 1: e65.

B. subtilis new sporulation genes

Functional Prediction III: Colocalization

• Operon structure is often maintained over fairly large taxonomic regions.
– Sometimes gene order is altered, and sometimes one or more enzymes are missing.
– But in general, this phenomenon allows recognition or verification that widely diverged enzymes do in fact have the same function.
• This is an operon that contains part of the glycolytic pathway.
– 1: phosphoglycerate mutase
– 2: triosephosphate isomerase
– 3: enolase
– 4: phosphoglycerate kinase
– 5: glyceraldehyde 3-phosphate dehydrogenase
– 6: central glycolytic gene regulator

Metabolic Predictions

Comparative Genomics

Using the Core

… between even related species.
Our molecular picture of evolution for the past 20 years has been dominated by the small-subunit ribosomal RNA phylogenetic tree …

… analysed. Analyses of complete genome sequences have led to many recent suggestions that the extent of horizontal gene exchange is much greater than was previously realized [10–12]. For example, an …

Table 2 Genome features from 24 microbial genome sequencing projects

Organism | Genome size (Mbp) | No. of ORFs (% coding) | Unknown function | Unique ORFs
Aeropyrum pernix K1 | 1.67 | 1,885 (89%) | |
A. aeolicus VF5 | 1.50 | 1,749 (93%) | 663 (44%) | 407 (27%)
A. fulgidus | 2.18 | 2,437 (92%) | 1,315 (54%) | 641 (26%)
B. subtilis | 4.20 | 4,779 (87%) | 1,722 (42%) | 1,053 (26%)
B. burgdorferi | 1.44 | 1,738 (88%) | 1,132 (65%) | 682 (39%)
Chlamydia pneumoniae AR39 | 1.23 | 1,134 (90%) | 543 (48%) | 262 (23%)
Chlamydia trachomatis MoPn | 1.07 | 936 (91%) | 353 (38%) | 77 (8%)
C. trachomatis serovar D | 1.04 | 928 (92%) | 290 (32%) | 255 (29%)
Deinococcus radiodurans | 3.28 | 3,187 (91%) | 1,715 (54%) | 1,001 (31%)
E. coli K-12-MG1655 | 4.60 | 5,295 (88%) | 1,632 (38%) | 1,114 (26%)
H. influenzae | 1.83 | 1,738 (88%) | 592 (35%) | 237 (14%)
H. pylori 26695 | 1.66 | 1,589 (91%) | 744 (45%) | 539 (33%)
Methanobacterium thermoautotrophicum | 1.75 | 2,008 (90%) | 1,010 (54%) | 496 (27%)
Methanococcus jannaschii | 1.66 | 1,783 (87%) | 1,076 (62%) | 525 (30%)
M. tuberculosis CSU#93 | 4.41 | 4,275 (92%) | 1,521 (39%) | 606 (15%)
M. genitalium | 0.58 | 483 (91%) | 173 (37%) | 7 (2%)
M. pneumoniae | 0.81 | 680 (89%) | 248 (37%) | 67 (10%)
N. meningitidis MC58 | 2.24 | 2,155 (83%) | 856 (40%) | 517 (24%)
Pyrococcus horikoshii OT3 | 1.74 | 1,994 (91%) | 859 (42%) | 453 (22%)
Rickettsia prowazekii Madrid E | 1.11 | 878 (75%) | 311 (37%) | 209 (25%)
Synechocystis sp. | 3.57 | 4,003 (87%) | 2,384 (75%) | 1,426 (45%)
T. maritima MSB8 | 1.86 | 1,879 (95%) | 863 (46%) | 373 (26%)
T. pallidum | 1.14 | 1,039 (93%) | 461 (44%) | 280 (27%)
Vibrio cholerae El Tor N1696 | 4.03 | 3,890 (88%) | 1,806 (46%) | 934 (24%)
Total | 50.60 | 52,462 (89%) | 22,358 (43%) | 12,161 (23%)

© 2000 Macmillan Magazines Ltd
NATURE | VOL 406 | 17 AUGUST 2000 | www.nature.com
After the Genomes
• Better analysis and annotation
• Comparative genomics
• Functional genomics (experimental analysis of gene function on a genome scale)
• Genome-wide gene expression studies
• Proteomics
• Genome-wide genetic experiments

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Weitere ähnliche Inhalte

Was ist angesagt?

Microbial Phylogenomics (EVE161) Class 6: Era II - Culture Independent rRNA
Microbial Phylogenomics (EVE161) Class 6: Era II - Culture Independent rRNAMicrobial Phylogenomics (EVE161) Class 6: Era II - Culture Independent rRNA
Microbial Phylogenomics (EVE161) Class 6: Era II - Culture Independent rRNAJonathan Eisen
 
UC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomicsUC Davis EVE161 Lecture 15 by @phylogenomics
UC Davis EVE161 Lecture 15 by @phylogenomicsJonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome SequencingMicrobial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome SequencingJonathan Eisen
 
UC Davis EVE161 Lecture 14 by @phylogenomics
UC Davis EVE161 Lecture 14 by @phylogenomicsUC Davis EVE161 Lecture 14 by @phylogenomics
UC Davis EVE161 Lecture 14 by @phylogenomicsJonathan Eisen
 
UC Davis EVE161 Lecture 13 by @phylogenomics
UC Davis EVE161 Lecture 13 by @phylogenomicsUC Davis EVE161 Lecture 13 by @phylogenomics
UC Davis EVE161 Lecture 13 by @phylogenomicsJonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsMicrobial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsJonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Microbial Phylogenomics (EVE161) Class 5
Microbial Phylogenomics (EVE161) Class 5Jonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics
Microbial Phylogenomics (EVE161) Class 15: Shotgun Metagenomics Jonathan Eisen
 
EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9EveMicrobial Phylogenomics (EVE161) Class 9
EveMicrobial Phylogenomics (EVE161) Class 9Jonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeMicrobial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeJonathan Eisen
 
UC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomicsUC Davis EVE161 Lecture 9 by @phylogenomics
UC Davis EVE161 Lecture 9 by @phylogenomicsJonathan Eisen
 
EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15EVE 161 Winter 2018 Class 15
EVE 161 Winter 2018 Class 15Jonathan Eisen
 
UC Davis EVE161 Lecture 18 by @phylogenomics
 UC Davis EVE161 Lecture 18 by @phylogenomics UC Davis EVE161 Lecture 18 by @phylogenomics
UC Davis EVE161 Lecture 18 by @phylogenomicsJonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups
Microbial Phylogenomics (EVE161) Class 7: rRNA PCR and Major Groups Jonathan Eisen
 
Microbial Phylogenomics (EVE161) Class 13 - Comparative Genomics
Microbial Phylogenomics (EVE161) Class 13 - Comparative GenomicsMicrobial Phylogenomics (EVE161) Class 13 - Comparative Genomics
Microbial Phylogenomics (EVE161) Class 13 - Comparative GenomicsJonathan Eisen
 
EVE 161 Winter 2018 Class 14
EVE 161 Winter 2018 Class 14EVE 161 Winter 2018 Class 14
EVE 161 Winter 2018 Class 14Jonathan Eisen
 
BIS2C. Biodiversity and the Tree of Life. 2014. L10. Studying Microbes
BIS2C. Biodiversity and the Tree of Life. 2014. L10. Studying MicrobesBIS2C. Biodiversity and the Tree of Life. 2014. L10. Studying Microbes
BIS2C. Biodiversity and the Tree of Life. 2014. L10. Studying MicrobesJonathan Eisen
 


UC Davis EVE161 Lecture 10 by @phylogenomics

  • 1. EVE 161: Microbial Phylogenomics. Lecture #10: Era III: Genome Sequencing. UC Davis, Winter 2014. Instructor: Jonathan Eisen.
  • 2. Where we are going and where we have been
      • Previous lecture: 9: rRNA Case Study - Built Environment
      • Current lecture: 10: Genome Sequencing
      • Next lecture: 11: Genome Sequencing II
  • 3. 1st Genome Sequence. Fleischma
  • 4. Figure 1: Diagram depicting the steps in a whole-genome shotgun sequencing project.
      1. Library construction: (i) Isolate DNA; (ii) Fragment DNA; (iii) Clone DNA
      2. Random sequencing phase: (i) Sequence DNA (15,000 sequences per Mb)
      3. Closure phase: (i) Assemble sequences; (ii) Close gaps; (iii) Edit; (iv) Annotation
      4. Complete genome sequence
      Text excerpts visible on the slide (two columns, both cut off at the slide edges):
      "... analysis of the genomes of two thermophilic bacterial species, Aquifex aeolicus and Thermotoga maritima, revealed that 20–25% of the genes in these species were more similar to genes from archaea than those from bacteria [13,14]. This led to the suggestion of possible extensive gene exchanges between these species and archaeal ..."
      "... be extensive, it is somehow constrained by phylogenetic relationships. Other evidence for a 'core' of particular lineages comes from the finding of a conserved core of euryarchaeal genomes [21,22] and another finding that some types of gene might be more prone to gene transfer than others [23]. It therefore seems likely that horizontal gene ..."
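The figure quotes 15,000 sequences per Mb for the random sequencing phase. As a rough illustration of what that number implies, the sketch below computes average coverage and a Poisson (Lander-Waterman style) estimate of the fraction of bases left unsequenced. The read length of 500 bp is an assumption chosen for illustration; it is not stated on the slide, and the function names are hypothetical.

```python
import math

def shotgun_coverage(reads_per_mb, read_length_bp):
    """Average depth of coverage c = N * L / G, with G fixed at 1 Mb."""
    return reads_per_mb * read_length_bp / 1_000_000

def expected_fraction_unsequenced(coverage):
    """Poisson estimate of the fraction of bases covered by no read: e^(-c)."""
    return math.exp(-coverage)

reads_per_mb = 15_000   # value quoted in the figure
read_length_bp = 500    # ASSUMPTION: illustrative read length, not from the slide

c = shotgun_coverage(reads_per_mb, read_length_bp)
print(f"coverage: {c:.1f}x")
print(f"expected unsequenced fraction: {expected_fraction_unsequenced(c):.6f}")
```

Under these assumptions only a small fraction of bases is expected to be missed by random reads, which is consistent with the figure treating gap closure as a separate, targeted phase.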
  • 5. Complete Genome/Chromosome Progress
  • 6. From http://genomesonline.org
  • 7. TIGR
  • 8. Why Completeness is Important
      • Improves characterization of genome features
          – Gene order, replication origins
      • Better comparative genomics
          – Genome duplications, inversions
      • Presence and absence of particular genes can be very important
      • Missing sequence might be important (e.g., centromere)
      • Allows researchers to focus on biology, not sequencing
      • Facilitates large-scale correlation studies
  • 9. General Steps in Analysis of Complete Genomes
      • Identification/prediction of genes
      • Characterization of gene features
      • Characterization of genome features
      • Prediction of gene function
      • Prediction of pathways
      • Integration with known biological data
      • Comparative genomics
  • 10. General Steps in Analysis of Complete Genomes
      • Structural Annotation
          – Identification/prediction of genes
          – Characterization of gene features
          – Characterization of genome features
      • Functional Annotation
          – Prediction of gene function
          – Prediction of pathways
          – Integration with known biological data
      • Evolutionary Annotation
          – Comparative genomics
  • 11. Structural Annotation I: Genes in Genomes
      • Protein coding genes
          – In long open reading frames
          – ORFs interrupted by introns in eukaryotes
          – Take up most of the genome in prokaryotes, but only a small portion of the eukaryotic genome
      • RNA-only genes
          – Transfer RNA
          – Ribosomal RNA
          – snoRNAs (guide ribosomal and transfer RNA maturation)
          – Intron splicing
          – Guiding mRNAs to the membrane for translation
          – Gene regulation (this is a growing list)
  • 12. Structural Annotation II: Other Features to Find
      • Gene control sequences
          – Promoters
          – Regulatory elements
      • Transposable elements, both active and defective
          – DNA transposons and retrotransposons
          – Many types and sizes
      • Other repeated sequences
          – Centromeres and telomeres
          – Many with unknown (or no) function
      • Unique sequences that have no obvious function
  • 13. How to Find ncRNAs
      • The most universal genes, such as tRNA and rRNA, are very conserved and thus easy to detect. Finding them first removes some areas of the genome from further consideration.
      • One easy approach to finding common RNA genes is to look for sequence homology with related species: a BLAST search will find most of them quite easily.
      • Functional RNAs are characterized by secondary structure caused by base pairing within the molecule.
      • Determining the folding pattern is a matter of testing many possibilities to find the one with the minimum free energy, which is the most stable structure.
      • The free energy calculations are in turn based on experiments in which short synthetic RNA molecules are melted.
      • Related to this is the concept that paired regions (stems) will be conserved across species lines even if the individual bases aren't conserved. That is, if there is an A-U pairing in one species, the same position might be occupied by a G-C in another species.
      • This is an example of compensatory evolution (covariation): a deleterious mutation at one site is cancelled by a compensating mutation at another site.
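The idea that stems stay paired across species while the individual bases change can be checked directly on a multiple alignment. The following is a minimal sketch, assuming a gap-free toy RNA alignment; the function name `is_compensatory_pair` and the example sequences are hypothetical, and real covariation tools use statistical measures rather than this all-or-nothing test.

```python
# Base pairs treated as valid: Watson-Crick pairs plus the G-U wobble pair.
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def is_compensatory_pair(alignment, i, j):
    """Return True if columns i and j can pair in every sequence
    AND the paired bases differ between at least two sequences."""
    pairs = [(seq[i], seq[j]) for seq in alignment]
    all_paired = all(pair in VALID_PAIRS for pair in pairs)
    return all_paired and len(set(pairs)) > 1

# Toy alignment (hypothetical): columns 0/7 and 1/6 form a stem.
alignment = [
    "ACGAAAGU",
    "GCGAAAGC",
    "AUGAAAAU",
]

for i, j in [(0, 7), (1, 6), (2, 5)]:
    print(i, j, is_compensatory_pair(alignment, i, j))
```

Columns 0/7 switch between A-U and G-C while staying paired, which is the pattern described above; columns 2/5 never pair and are rejected.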
  • 14. RNA Structure
      • RNA differs from DNA in having fairly common G-U base pairs. Also, many functional RNAs have unusual modified bases such as pseudouridine and inosine.
      • The pseudoknot, pairing between a loop and a sequence outside its stem, is especially difficult to detect: it is computationally intensive and breaks the usual rule that RNA base pairing follows a nested pattern.
          – But pseudoknots seem to be fairly rare.
      • Essentially, RNA folding programs start with all possible short sequences, then build to larger ones, adding the contribution of each structural element.
          – There is an element of dynamic programming here as well.
          – And "stochastic context-free grammars", something I really don't want to approach right now!
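The "start with short sequences, then build to larger ones" strategy can be illustrated with a Nussinov-style dynamic program. Note that this is a deliberately simplified stand-in: it maximizes the number of nested base pairs (including G-U) instead of minimizing free energy, and it cannot represent pseudoknots. The function name and the `min_loop` parameter are assumptions made for this sketch.

```python
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov_max_pairs(rna, min_loop=3):
    """Maximum number of nested base pairs, requiring at least
    `min_loop` unpaired bases between the two partners of a pair."""
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = max pairs within rna[i..j]; short spans stay at 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # grow from short to long subsequences
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]               # case 1: base j is unpaired
            for k in range(i, j - min_loop):     # case 2: base j pairs with base k
                if (rna[k], rna[j]) in VALID_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(nussinov_max_pairs("GGGAAACCC"))  # a simple hairpin
```

Real folding programs replace the "+1 per pair" scoring with experimentally derived energy terms for stacks and loops, but the table-filling order is the same.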
  • 15. Finding tRNAs
      • tRNAs have a highly conserved structure, with 3 main stem-and-loop structures that form a cloverleaf structure, and several conserved bases.
      • Finding such sequences is a matter of looking in the DNA for the proper features located the proper distance apart.
      • Looking for such sequences is well-suited to a decision tree, a series of steps that the sequence must pass. In addition, a score is kept, rating how well the sequence passed each step. This allows a more stringent analysis later on, to eliminate false positives.
  • 16. Bacterial / Archaeal Protein Coding Genes
      • Bacteria use ATG as their main start codon, but GTG and TTG are also fairly common, and a few others are occasionally used.
          – Remember that start codons are also used internally: the actual start codon may not be the first one in the ORF.
      • The stop codons are the same as in eukaryotes: TGA, TAA, TAG
          – Stop codons are (almost) absolute: except for a few cases of programmed frameshifts and the use of TGA for selenocysteine, the stop codon at the end of an ORF is the end of protein translation.
      • Genes can overlap by a small amount. Not much, but a few codons of overlap is common enough that you can't just eliminate overlaps as impossible.
      • Cross-species homology works well for many genes. It is very unlikely that non-coding sequence will be conserved.
          – But a significant minority of genes (say 20%) are unique to a given species.
      • Translation start signals (ribosome binding sites; Shine-Dalgarno sequences) are often found just upstream from the start codon
          – However, some aren't recognizable
          – Genes in operons don't always have a separate ribosome binding site for each gene
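The start and stop codon rules above translate into a simple ORF scan. The sketch below is a minimal illustration, not a gene finder: it reports every stop-terminated open frame together with all candidate start codons inside it, since the true start may not be the first one. The function names, the `min_codons` threshold, and the example sequence are hypothetical; overlap handling, ribosome binding sites, and selenocysteine are ignored.

```python
START_CODONS = {"ATG", "GTG", "TTG"}   # main bacterial starts named above
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(dna):
    """Reverse complement of an uppercase ACGT string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def orfs_in_strand(dna, min_codons=3):
    """Scan the three frames of one strand.
    Each hit lists ALL candidate starts since the previous stop."""
    results = []
    for frame in range(3):
        starts = []
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if codon in STOP_CODONS:
                # Keep the ORF if the longest candidate is long enough.
                if starts and (pos - starts[0]) // 3 >= min_codons:
                    results.append({"frame": frame, "starts": starts, "stop": pos})
                starts = []
            elif codon in START_CODONS:
                starts.append(pos)
    return results

def find_orfs(dna, min_codons=3):
    """ORFs on both strands; reverse-strand coordinates refer to the reverse complement."""
    forward = [("+", orf) for orf in orfs_in_strand(dna, min_codons)]
    reverse = [("-", orf) for orf in orfs_in_strand(reverse_complement(dna), min_codons)]
    return forward + reverse

print(find_orfs("CCATGAAAGTGCCCTAAGG"))
```

Keeping every candidate start leaves the choice among them to later evidence, such as an upstream Shine-Dalgarno sequence or homology to a known protein.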
  • 17. Composition Methods
    • The frequency of various codons is different in coding regions as compared to non-coding regions.
      – This extends to G-C content, dinucleotide frequencies, and other measures of composition. Dicodons (groups of 6 bases) are often used.
      – Well documented experimentally.
    • The composition varies between different proteins of course, and it is affected within a species by the amounts of the various tRNAs present.
      – Horizontally transferred genes can also confuse things: they tend to have compositions that reflect their original species.
      – A second group with unusual compositions are highly expressed genes.
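    The composition measures named above are straightforward to compute. The sketch below, with illustrative function names, shows G-C content and in-frame dicodon frequencies; how a gene finder turns these into a coding/non-coding score is not shown here.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def dicodon_frequencies(seq, frame=0):
    """Relative frequency of each dicodon (6 bases), stepping one codon at a time."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 6] for i in range(frame, len(seq) - 5, 3))
    total = sum(counts.values())
    return {dicodon: n / total for dicodon, n in counts.items()} if total else {}
```

    Comparing a candidate region's dicodon frequencies against tables built from known genes and from non-coding sequence of the same species is one common way to use them; horizontally transferred and highly expressed genes would be expected to fit such tables poorly, as the slide notes.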
  • 18. Eukaryotic Genes Harder to Find
    • Some fundamental differences between prokaryotes and eukaryotes:
    • There is lots of non-coding DNA in eukaryotes.
      – First step: find repeated sequences and RNA genes
      – Note that eukaryotes have 3 main RNA polymerases. RNA polymerase 2 (pol2) transcribes all protein-coding genes, while pol1 and pol3 transcribe various RNA-only genes.
    • Most eukaryotic genes are split into exons and introns.
    • Only 1 gene per transcript in eukaryotes.
    • No ribosome binding sites: translation usually starts at the first ATG in the mRNA
      – Thus, in eukaryotic genomes, searching for the transcription start site (TSS) makes sense.
    • Many fewer eukaryotic genomes have been sequenced
  • 19. Exons
    • Exon sequences can often be identified by sequence conservation, at least roughly.
    • Dicodon statistics, as used for prokaryotes, are also useful
      – Eukaryotic genomes tend to contain many isochores, regions of different GC content, and composition statistics can vary between isochores.
    • The initial and terminal exons contain untranslated regions, and thus special methods are needed to detect them.
    • Predicting splice junctions is a matter of collecting information about the sequences surrounding each possible GT/AG pair (introns typically begin with GT and end with AG), then running this information through some combination of decision tree, Markov models, discriminant analysis, or neural networks, in an attempt to massage the data into giving a reliable score.
      – In general, sites are more likely to be correct if predicted by multiple methods
      – Experimental data from ESTs can be very helpful here.
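    As a rough illustration of the first step in splice-junction prediction, the sketch below only enumerates candidate introns that begin with GT and end with AG on the forward strand. The function name and length limits are assumptions; the scoring step (decision trees, Markov models, and so on) that separates real sites from the many false candidates is deliberately left out.

```python
def candidate_introns(dna, min_len=20, max_len=10000):
    """List (start, end) spans that begin with GT and end with AG (end is exclusive)."""
    dna = dna.upper()
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    spans = []
    for d in donors:
        for a in acceptors:
            length = a + 2 - d  # intron length including the GT and the AG
            if min_len <= length <= max_len:
                spans.append((d, a + 2))
    return spans
```

    Even on short sequences the number of GT...AG pairs grows quickly, which is why the surrounding sequence context and experimental evidence such as ESTs matter so much.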
  • 20. Functional Annotation
  • 21. Functional Classification I: GO
    • The Gene Ontology (GO) consortium (http://www.geneontology.org/) is an attempt to describe gene products with a structured controlled vocabulary, a set of invariant terms that have a known relationship to each other.
    • Each GO term is given a number of the form GO:nnnnnnn (7 digits), as well as a term name. For example, GO:0005102 is “receptor binding”.
    • There are 3 root terms: biological process, cellular component, and molecular function. A gene product will probably be described by GO terms from each of these “ontologies”. (Ontology is a branch of philosophy concerned with the nature of being, and the basic categories of being and their relationships.)
      – For instance, cytochrome c is described with the molecular function term “oxidoreductase activity”, the biological process terms “oxidative phosphorylation” and “induction of cell death”, and the cellular component terms “mitochondrial matrix” and “mitochondrial inner membrane”
    • The terms are arranged in a hierarchy that is a “directed acyclic graph” and not a tree. This means simply that each term can have more than one parent term, but the direction of parent to child (i.e. less specific to more specific) is always maintained.
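    To make the "directed acyclic graph, not a tree" point concrete, here is a small sketch using a toy hierarchy. The term names in `TOY_PARENTS` are hypothetical placeholders, not real GO terms; only the GO:nnnnnnn identifier format and the example ID come from the slide.

```python
import re

def is_go_id(text):
    """True if text has the form GO: followed by exactly 7 digits."""
    return re.fullmatch(r"GO:\d{7}", text) is not None

def ancestors(term, parents):
    """All less-specific terms reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Hypothetical toy hierarchy: child -> list of parents.
TOY_PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "binds_and_catalyzes": ["binding", "catalysis"],  # two parents: a DAG, not a tree
}
```

    Annotating a gene product with a specific term implies all of that term's ancestors, so `ancestors("binds_and_catalyzes", TOY_PARENTS)` returns both parent branches as well as the root.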
  • 22. Functional Classification II: Enzyme Nomenclature
    • Enzyme functions: which reactants are converted to which products
    • Enzyme functions are given unique numbers by the Enzyme Commission.
      – Across many species, the enzymes that perform a specific function are usually evolutionarily related. However, this isn’t necessarily true. There are cases of two entirely different enzymes evolving similar functions.
      – Often, two or more gene products in a genome will have the same E.C. number.
      – E.C. numbers are four integers separated by dots. The left-most number is the least specific
      – For example, the tripeptide aminopeptidases have the code "EC 3.4.11.4", whose components indicate the following groups of enzymes:
        • EC 3 enzymes are hydrolases (enzymes that use water to break up some other molecule)
        • EC 3.4 are hydrolases that act on peptide bonds
        • EC 3.4.11 are those hydrolases that cleave off the amino-terminal amino acid from a polypeptide
        • EC 3.4.11.4 are those that cleave off the amino-terminal end from a tripeptide
    • Top level E.C. numbers:
      – E.C. 1: oxidoreductases (often dehydrogenases): electron transfer
      – E.C. 2: transferases: transfer of functional groups (e.g. phosphate) between molecules.
      – E.C. 3: hydrolases: splitting a molecule by adding water to a bond.
      – E.C. 4: lyases: non-hydrolytic addition or removal of groups from a molecule
      – E.C. 5: isomerases: rearrangements of atoms within a molecule
      – E.C. 6: ligases: joining two molecules using energy from ATP
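    The "left-most number is the least specific" structure can be shown by expanding an E.C. number into its nested groups. This is a small sketch with illustrative function names; the class table contains only the six top-level classes listed on the slide.

```python
import re

# Top-level classes as listed on the slide.
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}

def ec_lineage(ec):
    """Expand an E.C. number into its groups, least specific first."""
    match = re.search(r"\d+(?:\.\d+){0,3}", ec)
    if match is None:
        raise ValueError(f"no E.C. number found in {ec!r}")
    parts = match.group(0).split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

def ec_class(ec):
    """Name of the top-level class, or None if it is not in EC_CLASSES."""
    return EC_CLASSES.get(int(ec_lineage(ec)[0]))
```

    For the slide's example, `ec_lineage("EC 3.4.11.4")` walks from the hydrolases (3) down to the tripeptide aminopeptidases (3.4.11.4), so two annotations can be compared at whatever level of specificity they share.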
  • 23. Functional Prediction
    • BLAST searches
    • HMM models of specific genes or gene families (Pfam, TIGRfam, FIGfam).
    • Sequence motifs and domains. If the gene is not a good match to previously known genes, these provide useful clues.
    • Cellular location predictions, especially for transmembrane proteins.
    • Genomic neighbors, especially in bacteria, where related functions are often found together in operons and divergons (genes transcribed in opposite directions that use a common control region).
    • Biochemical pathway/subsystem information. If an organism has most of the genes needed to perform a function, any missing functions are probably present too.
      – Also, experimental data about an organism’s capacities can be used to decide whether the relevant functions are present in the genome.
  • 24. Functional Prediction II: Membrane Spanning
    • Integral membrane proteins contain amino acid sequences that go through the membrane one or several times.
      – There are also peripheral membrane proteins that stick to the hydrophilic head groups by ionic and polar interactions
      – There are also some that have covalently bound hydrophobic groups, such as myristate, a 14 carbon saturated fatty acid that is attached to the N-terminal amino group.
    • There are 2 main protein structures that cross membranes.
      – Most are alpha helices, and in proteins that span multiple times, these alpha helices are packed together in a bundle. Length = 15-30 amino acids.
      – Less commonly, there are proteins with membrane spanning “beta barrels”, composed of beta sheets wrapped into a cylinder. An example: porins, which let small hydrophilic molecules pass across the membrane.
  • 25. Functional Prediction by Phylogeny
    • Key step in genome projects
    • More accurate predictions help guide experimental and computational analyses
    • Many diverse approaches
    • All are improved by “phylogenomic” type analyses that integrate evolutionary reconstructions with an understanding of how new functions evolve
  • 26. Functional Prediction
    • Identification of motifs
      – Short regions of sequence similarity that are indicative of general activity
      – e.g., ATP binding
    • Homology/similarity based methods
      – Gene sequence is searched against a database of other sequences
      – If significantly similar genes are found, their functional information is used
    • Problem
      – Genes frequently have similarity to hundreds of motifs and multiple genes, not all with the same function
  • 27. Helicobacter pylori
  • 28. H. pylori genome - 1997 “The ability of H. pylori to perform mismatch repair is suggested by the presence of methyl transferases, mutS and uvrD. However, orthologues of MutH and MutL were not identified.”
  • 29. MutL ?? From http://asajj.roswellpark.org/huberman/dna_repair/mmr.html
  • 30. Phylogenetic Tree of MutS Family
    [Figure: phylogenetic tree of MutS family proteins from bacterial, archaeal, and eukaryotic species.]
    Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
  • 32. Overlaying Functions onto Tree
    [Figure: the MutS family tree with subfamilies labeled: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6.]
    Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
  • 33. MutS Subfamilies
    • MutS1: Bacterial MMR
    • MSH1: Euk - mitochondrial MMR
    • MSH2: Euk - all MMR in nucleus
    • MSH3: Euk - loop MMR in nucleus
    • MSH6: Euk - base:base MMR in nucleus
    • MutS2: Bacterial - function unknown
    • MSH4: Euk - meiotic crossing-over
    • MSH5: Euk - meiotic crossing-over
  • 34. Functional Prediction Using Tree
    [Figure: the MutS family tree with a function assigned to each subfamily:]
    • MutS1 - Bacterial Mismatch and Loop Repair
    • MutS2 - Unknown Functions
    • MSH1 - Mitochondrial Repair
    • MSH2 - Eukaryotic Nuclear Mismatch and Loop Repair
    • MSH3 - Nuclear Repair of Loops
    • MSH4 - Meiotic Crossing Over
    • MSH5 - Meiotic Crossing Over
    • MSH6 - Nuclear Repair of Mismatches
    Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
  • 35. Table 3. Presence of MutS Homologs in Complete Genome Sequences
    Species | # of MutS Homologs | Which Subfamilies? | MutL Homologs
    Bacteria
    Escherichia coli K12 | 1 | MutS1 | 1
    Haemophilus influenzae Rd KW20 | 1 | MutS1 | 1
    Neisseria gonorrhoeae | 1 | MutS1 | 1
    Helicobacter pylori 26695 | 1 | MutS2 |
    Mycoplasma genitalium G-37 | | |
    Mycoplasma pneumoniae M129 | | |
    Bacillus subtilis 169 | 2 | MutS1,MutS2 | 1
    Streptococcus pyogenes | 2 | MutS1,MutS2 | 1
    Mycobacterium tuberculosis | | |
    Synechocystis sp. PCC6803 | 2 | MutS1,MutS2 | 1
    Treponema pallidum Nichols | 1 | MutS1 | 1
    Borrelia burgdorferi B31 | 2 | MutS1,MutS2 | 1
    Aquifex aeolicus | 2 | MutS1,MutS2 | 1
    Deinococcus radiodurans R1 | 2 | MutS1,MutS2 | 1
    Archaea
    Archaeoglobus fulgidus VC-16, DSM4304 | | |
    Methanococcus jannaschii DSM 2661 | | |
    Methanobacterium thermoautotrophicum ΔH | 1 | MutS2 | -
    Eukaryotes
    Saccharomyces cerevisiae | 6 | MSH1-6 | 3+
    Homo sapiens | 5 | MSH2-6 | 3+
  • 36. Blast Search of H. pylori “MutS”
    Sequences producing significant alignments:                    Score (bits)  E Value
    sp|P73625|MUTS_SYNY3  DNA MISMATCH REPAIR PROTEIN              117           3e-25
    sp|P74926|MUTS_THEMA  DNA MISMATCH REPAIR PROTEIN              69            1e-10
    sp|P44834|MUTS_HAEIN  DNA MISMATCH REPAIR PROTEIN              64            3e-09
    sp|P10339|MUTS_SALTY  DNA MISMATCH REPAIR PROTEIN              62            2e-08
    sp|O66652|MUTS_AQUAE  DNA MISMATCH REPAIR PROTEIN              57            4e-07
    sp|P23909|MUTS_ECOLI  DNA MISMATCH REPAIR PROTEIN              57            4e-07
    • Blast search pulls up Syn. sp MutS#2 with a much better score (much lower E value) than other MutS homologs
    • Based on this, TIGR predicted this species had mismatch repair
    Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
  • 37. High Mutation Rate in H. pylori Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
  • 38. Phylogenomics: Phylogenetic Prediction of Gene Function
    [Figure: flowchart of the method, illustrated with Example A and Example B. Steps: choose gene(s) of interest; identify homologs; align sequences; calculate gene tree; overlay known functions onto tree; infer likely function of gene(s) of interest. A final panel shows the actual evolution (assumed to be unknown), including gene duplications.]
    Based on Eisen, 1998 Genome Res 8: 163-167.
  • 39. [Figure: diagram with elements numbered 1 to 6.]
  • 40. Chemosynthetic Symbionts Eisen et al. 1992. J. Bact. 174: 3416
  • 41. Carboxydothermus hydrogenoformans
    • Isolated from a Russian hot spring
    • Thermophile (grows at 80°C)
    • Anaerobic
    • Grows very efficiently on CO (carbon monoxide)
    • Produces hydrogen gas
    • Low GC Gram positive (Firmicute)
    • Genome determined (Wu et al. 2005 PLoS Genetics 1: e65.)
  • 42. Homologs of Sporulation Genes Wu et al. 2005 PLoS Genetics 1: e65.
  • 43. Carboxydothermus sporulates Wu et al. 2005 PLoS Genetics 1: e65.
  • 44. Non-Homology Predictions: Phylogenetic Profiling
    • Step 1: Search all genes in organisms of interest against all other genomes
    • Ask: Yes or No, is each gene found in each other species
    • Cluster genes by distribution patterns (profiles)
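    The three steps above can be sketched as follows, assuming the homology searches have already been run and summarized as a mapping from each gene to the set of genomes in which a homolog was found. The function name and the grouping by identical profile are simplifying assumptions; real analyses typically cluster similar, not just identical, profiles.

```python
from collections import defaultdict

def cluster_by_profile(presence, genomes):
    """Group genes that share an identical yes/no pattern across `genomes`."""
    clusters = defaultdict(list)
    for gene, found_in in presence.items():
        # The profile is one yes/no answer per genome, in a fixed genome order.
        profile = tuple(g in found_in for g in genomes)
        clusters[profile].append(gene)
    return dict(clusters)
```

    Genes that end up in the same cluster are candidates for working together in one process, even when none of them has a homolog of known function; the gene and genome labels passed in are whatever identifiers the upstream search produced.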
  • 45. Sporulation Gene Profile Wu et al. 2005 PLoS Genetics 1: e65.
  • 46. B. subtilis new sporulation genes
  • 47. Functional Prediction III: Colocalization
    • Operon structure is often maintained over fairly large taxonomic ranges.
      – Sometimes gene order is altered, and sometimes one or more enzymes are missing.
      – But in general, this phenomenon allows recognition or verification that widely diverged enzymes do in fact have the same function.
    • This is an operon that contains part of the glycolytic pathway.
      – 1: phosphoglycerate mutase
      – 2: triosephosphate isomerase
      – 3: enolase
      – 4: phosphoglycerate kinase
      – 5: glyceraldehyde 3-phosphate dehydrogenase
      – 6: central glycolytic gene regulator
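    One simple way to look for conserved colocalization is to ask which pairs of gene families sit next to each other in every genome examined. The sketch below assumes each genome is given as an ordered list of gene-family labels along one chromosome (the input format and function name are assumptions for illustration), and it ignores strand and circularity.

```python
def conserved_neighbor_pairs(gene_orders):
    """Pairs of gene families that are adjacent in every genome (order-insensitive)."""
    shared = None
    for order in gene_orders.values():
        # Adjacent pairs in this genome; frozenset ignores which gene comes first.
        pairs = {frozenset(p) for p in zip(order, order[1:]) if p[0] != p[1]}
        shared = pairs if shared is None else shared & pairs
    return shared or set()
```

    Using unordered pairs tolerates the local rearrangements mentioned on the slide; requiring adjacency in every genome is strict, so a looser variant might accept pairs found in most genomes or within a few genes of each other.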
  • 48. Metabolic Predictions
  • 49. Comparative Genomics
  • 51. Using the Core
  • 52. between even related species. Our molecular picture of evolution for the past 20 years has been dominated by the small-subunit ribosomal RNA phylogenetic tree analysed. Analyses of complete genome sequences have led to many recent suggestions that the extent of horizontal gene exchange is much greater than was previously realized (refs 10–12). For example, an
    Table 2 Genome features from 24 microbial genome sequencing projects
    Organism | Genome size (Mbp) | No. of ORFs (% coding) | Unknown function | Unique ORFs
    Aeropyrum pernix K1 | 1.67 | 1,885 (89%) | |
    A. aeolicus VF5 | 1.50 | 1,749 (93%) | 663 (44%) | 407 (27%)
    A. fulgidus | 2.18 | 2,437 (92%) | 1,315 (54%) | 641 (26%)
    B. subtilis | 4.20 | 4,779 (87%) | 1,722 (42%) | 1,053 (26%)
    B. burgdorferi | 1.44 | 1,738 (88%) | 1,132 (65%) | 682 (39%)
    Chlamydia pneumoniae AR39 | 1.23 | 1,134 (90%) | 543 (48%) | 262 (23%)
    Chlamydia trachomatis MoPn | 1.07 | 936 (91%) | 353 (38%) | 77 (8%)
    C. trachomatis serovar D | 1.04 | 928 (92%) | 290 (32%) | 255 (29%)
    Deinococcus radiodurans | 3.28 | 3,187 (91%) | 1,715 (54%) | 1,001 (31%)
    E. coli K-12-MG1655 | 4.60 | 5,295 (88%) | 1,632 (38%) | 1,114 (26%)
    H. influenzae | 1.83 | 1,738 (88%) | 592 (35%) | 237 (14%)
    H. pylori 26695 | 1.66 | 1,589 (91%) | 744 (45%) | 539 (33%)
    Methanobacterium thermoautotrophicum | 1.75 | 2,008 (90%) | 1,010 (54%) | 496 (27%)
    Methanococcus jannaschii | 1.66 | 1,783 (87%) | 1,076 (62%) | 525 (30%)
    M. tuberculosis CSU#93 | 4.41 | 4,275 (92%) | 1,521 (39%) | 606 (15%)
    M. genitalium | 0.58 | 483 (91%) | 173 (37%) | 7 (2%)
    M. pneumoniae | 0.81 | 680 (89%) | 248 (37%) | 67 (10%)
    N. meningitidis MC58 | 2.24 | 2,155 (83%) | 856 (40%) | 517 (24%)
    Pyrococcus horikoshii OT3 | 1.74 | 1,994 (91%) | 859 (42%) | 453 (22%)
    Rickettsia prowazekii Madrid E | 1.11 | 878 (75%) | 311 (37%) | 209 (25%)
    Synechocystis sp. | 3.57 | 4,003 (87%) | 2,384 (75%) | 1,426 (45%)
    T. maritima MSB8 | 1.86 | 1,879 (95%) | 863 (46%) | 373 (26%)
    T. pallidum | 1.14 | 1,039 (93%) | 461 (44%) | 280 (27%)
    Vibrio cholerae El Tor N16961 | 4.03 | 3,890 (88%) | 1,806 (46%) | 934 (24%)
    Total | 50.60 | 52,462 (89%) | 22,358 (43%) | 12,161 (23%)
    © 2000 Macmillan Magazines Ltd NATURE | VOL 406 | 17 AUGUST 2000 | www.nature.com
  • 53. After the Genomes
    • Better analysis and annotation
    • Comparative genomics
    • Functional genomics (experimental analysis of gene function on a genome scale)
    • Genome-wide gene expression studies
    • Proteomics
    • Genome-wide genetic experiments