Comparative genomics of the fungal kingdom
1. Comparative genomics of the fungal kingdom: a view from the
chytrids
Jason Stajich
University of California, Berkeley
2. Comparative Genomics
• Tools for studying evolution at level of genomic blueprints.
• Identifying shared and unique genes, and gene losses and gains.
• Signatures of adaptation
• Identify genes under positive directional selection: those changing faster at the
amino acid level than expected under the neutral rate
• Identification of gene families that expand or contract by unexpected amounts
• Contrasting genome organization and evolution of genomic clusters of genes
3. Fantastic Fungi
• Evolution of modern fungal forms and lifestyles
• Evolution of Multicellularity - independent transitions in Metazoa and Plants.
• Reversions to unicellularity
• Evolution of development; early genes involved in fruiting body development
• Plants and fungi have cell walls; animals lack them. What were the fungal ancestor’s
cell walls like? What about the fungal-animal ancestor?
• What genes were in the ancestral fungus? Which genes have newly evolved and
contribute to new morphologies or life stages?
4. Fantastic opportunities in fungal comparative genomics
• More than 65 available genomes - dozens more in pipeline at sequencing centers
• http://fungalgenomes.org/wiki/Fungal_Genome_Links
• 1(2) Chytrid, 2 Zygomycetes, 8 (12) Basidiomycetes, 3-4 Taphrinomycotina,
• ~30 (+15 strains of Coccidioides, 3 strains of Histoplasma) Pezizomycotina
• ~22(+20-100 strains S. cerevisiae & S. paradoxus) Saccharomycotina
• Broad Institute & Fungal Genome Initiative, Joint Genome Institute, Stanford Genome
Technology Center, Sanger Centre, Génolevures project & CNRS, BC Genome
Sequencing Center, others.
• US genome sequencing funding: NSF, DOE, NIH
5. Genome annotation
• Train ab initio gene predictors
• Build good models from protein-to-genome alignments, or take a set of curated genes.
Build full-length models from cDNA or assembled ESTs
• Train on exon-intron boundaries, intron length, exon length, and codon/nucleotide biases
• Refine parameters iteratively, with some gene models held out to assess
improvements
• Generate and combine Annotations
• Take ab initio, homology based, and EST tracks
• Combine into consensus gene models
• GLEAN or Jigsaw (GAZE also)
• Assess performance of different datasets, leave out some models if necessary
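The held-out assessment step above can be sketched as an exon-level comparison between predicted and curated models. This is a minimal illustration only, assuming exons are compared by exact coordinates; the function name `exon_accuracy` and all coordinates are hypothetical toy values, not output from GLEAN, Jigsaw, or any predictor named on the slide.

```python
from typing import Set, Tuple

# (scaffold, start, end, strand); coordinates are 1-based and inclusive.
Exon = Tuple[str, int, int, str]


def exon_accuracy(predicted: Set[Exon], held_out: Set[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for exact exon matches.

    Sensitivity: fraction of held-out exons that were predicted exactly.
    Specificity: fraction of predicted exons that match a held-out exon
    (the usage common in gene-prediction papers, i.e. precision).
    Predictions should be restricted to the loci of the held-out genes,
    otherwise correct exons of other genes count as false positives.
    """
    exact = predicted & held_out
    sensitivity = len(exact) / len(held_out) if held_out else 0.0
    specificity = len(exact) / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Toy data (hypothetical): three curated exons held out from training.
held_out = {
    ("scaffold_5", 100, 250, "+"),
    ("scaffold_5", 400, 620, "+"),
    ("scaffold_5", 900, 1100, "+"),
}
# One track misplaces a splice site; another recovers two exons exactly.
ab_initio = {("scaffold_5", 100, 250, "+"), ("scaffold_5", 410, 620, "+")}
combined = {("scaffold_5", 100, 250, "+"), ("scaffold_5", 400, 620, "+")}

for name, track in (("ab initio", ab_initio), ("combined", combined)):
    sens, spec = exon_accuracy(track, held_out)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

Scoring each track this way against the same held-out set gives a simple basis for deciding which datasets to keep or leave out when building consensus models.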
6. Combined predictions perform better
[Figure: genome browser view of scaffold_5, 1219k to 1221k, with a % GC track (17% to 58%) and gene model tracks: GLEAN (BDEN_JAM81_00470, probability 0.765437; BDEN_JAM81_00471, probability 0.981985), SNAP (lenx_scaffold_5-snap.460, lenx_scaffold_5-snap.461), Twinscan (TS.scaffold_5.413), Genewise (alignments prefixed dhan, egos, klac, ctro, lelo), AUGUSTUS (scaffold_5-augustus-g372.t1), and PASA EST (Model.asmbl_4025, Model.asmbl_4026).]
11. • Consensus tree of 42 fungal
genomes based on many
thousands of orthologous genes
• Not perfect, but automated
reconstruction can be a powerful tool
• Conflicts in topology can identify
genes with interesting histories
Fitzpatrick DA, Logue ME, Stajich JE, Butler G.
BMC Genomics 2006
12. Complex fungal genes
• Modern fungi have complex gene structures. How complex were
gene structures in the fungal ancestor?
• Many introns are present in fungal genes
• Intron-poor Saccharomyces, U. maydis, and S. pombe are derived
• Evolution of introns in fungi has seen many losses, few gains
13. Fungal intron size and frequency evolution
[Figure: scatter plot of median intron length (bp; y-axis, 0 to 500) against mean number of introns per kb of coding sequence (x-axis, 0 to 7). Points are labelled by species (C. glabrata, K. lactis, Y. lipolytica, S. cerevisiae, S. pombe, U. maydis, C. cinerea, P. chrysosporium, C. neoformans, R. oryzae, B. dendrobatidis) and grouped as Hemiascomycota, Euascomycota, Basidiomycota, and Zygomycota.]
Stajich JE, Dietrich FS, and Roy SW. Genome Biology, in revision
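The two axes of this plot can be computed directly from gene models. The sketch below is illustrative only: it assumes each gene is given as a list of coding exon coordinates on one scaffold, and it takes "introns per kb of coding sequence" as total introns divided by total coding kilobases, which may differ from the exact definition used in the study. The gene names and coordinates are hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple

# Coding exons of one gene as 1-based inclusive (start, end) pairs.
Exons = List[Tuple[int, int]]


def intron_stats(genes: Dict[str, Exons]) -> Tuple[float, float]:
    """Return (median intron length in bp, introns per kb of coding sequence)."""
    intron_lengths: List[int] = []
    coding_bp = 0
    for exons in genes.values():
        ordered = sorted(exons)
        coding_bp += sum(end - start + 1 for start, end in ordered)
        # An intron is the gap between two consecutive coding exons.
        intron_lengths.extend(
            nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])
        )
    median_length = float(median(intron_lengths)) if intron_lengths else 0.0
    per_kb = len(intron_lengths) / (coding_bp / 1000) if coding_bp else 0.0
    return median_length, per_kb


# Toy gene models (hypothetical): one three-exon gene and one single-exon gene.
toy_genes = {
    "geneA": [(1, 300), (401, 700), (761, 1000)],
    "geneB": [(1, 1160)],
}
median_length, per_kb = intron_stats(toy_genes)
print(f"median intron length: {median_length} bp; introns per kb: {per_kb}")
```

Running the same function over each genome's annotation would yield one (x, y) point per species, so differences in annotation quality between genomes feed directly into the comparison.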
15. Intron loss predominates in fungal lineages
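[Figure: phylogenetic tree with numeric labels at tips and internal nodes. Taxa shown: Saccharomycetes, Y. lipolytica, Sordariomycetes, Eurotiomycetes, S. nodorum, S. pombe, P. chrysosporium, C. cinerea, C. neoformans, U. maydis, R. oryzae, Vertebrates, A. thaliana. The mapping of values to taxa and nodes is not recoverable from the extracted text. Values as extracted: 5.51, 6.62, 2.28, 0.21, 3.80, 3.89, 3.90, 0.52, 0.88, 1.16, 0.97, 0.07, 0.02; 4.03, 1.20, 0.07, 3.59, 2.36, 2.77, 3.59, 3.59, 3.87, 4.98.]
Stajich JE, Dietrich FS, and Roy SW.
Genome Biology, in revision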
16. Intron loss in C. neoformans through mRNA intermediate
[Figure, panels A to C: a tree (scale bar 1.0) of Cryptococcus strains JEC21, BT-100, BT-157, WM276, BT-63, R265, H99, 35-23, and 2462, with C. gattii strain WM276, C. gattii strain R265, C. neoformans var. neoformans strain JEC21, and C. neoformans var. grubii strain H99 highlighted. Panel B compares two gene structures on a 1 kb to 6 kb scale: one with exons 1 to 22, and one in which exons 9 to 19 appear as a single exon (1 to 8, 9-19, 20 to 22).]
Stajich JE, Dietrich FS. Euk Cell 2006
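In the gene structure comparison on this slide, exons 9 to 19 of one copy appear as a single exon in the other, meaning a run of adjacent introns is missing. A minimal sketch of how such runs could be flagged is given below. It assumes intron positions from both orthologs have already been mapped onto a shared coding-sequence coordinate system; the function name and positions are hypothetical, and this is not the method used in the cited paper.

```python
from typing import List, Sequence


def missing_intron_runs(
    reference_introns: Sequence[int], query_introns: Sequence[int]
) -> List[List[int]]:
    """Group reference introns absent from the query into runs of neighbours.

    Positions are intron insertion points in shared coding-sequence
    coordinates. Each returned run is a list of consecutive reference
    introns that have no counterpart in the query.
    """
    present = set(query_introns)
    runs: List[List[int]] = []
    current: List[int] = []
    for position in sorted(reference_introns):
        if position in present:
            if current:
                runs.append(current)
                current = []
        else:
            current.append(position)
    if current:
        runs.append(current)
    return runs


# Toy example (hypothetical positions): 22 exons means 21 introns.
reference = [i * 100 for i in range(1, 22)]
# The query lacks the 10 introns separating exons 9 to 19.
query = [p for p in reference if not 900 <= p <= 1800]

for run in missing_intron_runs(reference, query):
    print(f"{len(run)} adjacent introns missing, positions {run[0]} to {run[-1]}")
```

A long run of adjacent missing introns is the pattern one would expect if a stretch of the gene were replaced by sequence derived from a spliced mRNA, whereas independent losses would more often appear as scattered single-intron runs.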
17. Intron gain is rare
• Two studies looked at intron loss and gain in 4 closely related C. neoformans genomes (Sharpton et
al, submitted; Stajich and Dietrich 2006) and found little or no intron gain.
• Nielsen et al, PLoS Biology 2004 found a moderate amount of intron gain among
Pezizomycotina
• Intron gain IS happening in lineages, but among sampled closely related genomes there are
few examples of intron gains...
• ... and little convincing evidence for the molecular mechanism of this gain (duplication,
self-splicing, de novo intron creation)
• More work is needed to understand the dynamics and mechanisms of gene structure change
18. B. dendrobatidis genomics
• Amphibian pathogen killing frogs worldwide
• Chytrid fungus with motile zoospore and zoosporangia stages
• Genome sequencing of 2 strains
• JEL423 (Joyce Longcore; Panama) and JAM81 (Jess Morgan; Sierras, California)
• 24 Mb genome; ~8,000 genes
• Tiling genomic microarray and exon array in development (Eisen lab)
[Images: motile zoospore; zoosporangia]
19. B. dendrobatidis genomics
• 24 Mb genome; ~8,000 genes
• Approximate gene counts in other fungi:
• C. neoformans ~7,000
• C. cinereus ~10,000
• U. maydis ~7,000
• S. cerevisiae ~6,000
• A. fumigatus ~10,000
20. Gene structure evolution: B. dendrobatidis genes are intron rich
[Figure: exon-intron structures of one gene in six fungi]
• B. dendrobatidis: BDEN_JAM81_01417
• U. maydis: UM03290.1
• P. chrysosporium: GLEAN_01130
• S. pombe: SPAC644.14c
• N. crassa: NCU02741.1
• S. cerevisiae: YER095W
YER095W: Strand exchange protein, forms a helical filament with DNA that searches for homology; involved in the recombinational repair of double-strand
breaks in DNA during vegetative growth and meiosis; homolog of Dmc1p and bacterial RecA protein
21. Phylogenetic profiling
• Classify each gene by the phylogenetic clades in which it has homologs.
• Can be as simple as a similarity search (BLAST) against representative genomes.
• Summarize the number of shared genes by different patterns
• Use Chytrid genes to identify genes present in the ancestor and shared with the animal
outgroup.
• Find genes lost in different parts of the tree
• By comparing all genes in each lineage back to the Chytrid, potential gene gains can be identified
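The profiling steps above can be sketched in a few lines. This is an illustration under stated assumptions: similarity search results are taken to be already parsed into (query gene, clade of the hit genome, E-value) tuples, the E-value cutoff of 1e-5 is an arbitrary example threshold, and the gene names and clade labels are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Tuple

# (query gene, clade of the genome that was hit, E-value of the best hit)
Hit = Tuple[str, str, float]


def phylogenetic_profiles(
    hits: Iterable[Hit], max_evalue: float = 1e-5
) -> Dict[str, FrozenSet[str]]:
    """Map each query gene to the set of clades where it has a homolog."""
    profiles: Dict[str, set] = {}
    for gene, clade, evalue in hits:
        if evalue <= max_evalue:
            profiles.setdefault(gene, set()).add(clade)
    return {gene: frozenset(clades) for gene, clades in profiles.items()}


def summarize(
    profiles: Dict[str, FrozenSet[str]], all_genes: Iterable[str]
) -> Counter:
    """Count genes per presence pattern; genes without hits get the empty set."""
    return Counter(profiles.get(gene, frozenset()) for gene in all_genes)


# Toy search results for three hypothetical chytrid genes.
toy_hits = [
    ("chytrid_gene_1", "Metazoa", 1e-40),
    ("chytrid_gene_1", "Dikarya", 1e-35),
    ("chytrid_gene_2", "Metazoa", 1e-20),
    ("chytrid_gene_2", "Dikarya", 0.5),  # too weak, ignored
]
toy_genes = ["chytrid_gene_1", "chytrid_gene_2", "chytrid_gene_3"]

profiles = phylogenetic_profiles(toy_hits)
pattern_counts = summarize(profiles, toy_genes)
for pattern, count in pattern_counts.items():
    print(sorted(pattern) or ["no homologs"], count)
```

In this toy case, a gene found only in the animal outgroup is a candidate for an ancestral gene lost in other fungi, while a gene with no hits is a candidate lineage-specific gain. Absence of a hit can also reflect rapid divergence or a missed gene model, so such calls remain provisional.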
25. Evolution of cell walls
• Fungal cell walls are made of
• Chitin, beta-glucans, mannan,
other sugars
• Animals lack cell walls
• Plants have rigid cell walls
• Can learn about opisthokont ancestor
from learning about the ancestral
fungus
Baldauf SL. Science 2003
28. Flagella in fungi
• Loss of flagella occurred in one or a few
events
• Find genes shared by animal and
Chytrid genomes but missing from other fungi
• Many of these genes are also
shared with cilia and flagellar genes
of Chlamydomonas.
• Microarray expression data show
differences between zoospores and
sporangia
• Flagellar dynein is 64x up-regulated in
zoospores.
29. Hypothesis for new cell wall genes and transition to
terrestrial life
• Cell wall of the ancestral fungus was adapted to an aquatic lifestyle; the ancestor had flagella.
• Loss of flagella as part of adaptation to terrestrial life.
• Additional gene family duplication and specialization.
• Chitin synthase expansions
• FKS1 1,3-Beta-glucan pathway evolution
• Substrate for complex multicellular evolution and morphological elaboration.
30. Collaboration
• Erica Rosenblum, Michael Eisen, John Taylor; University of California, Berkeley
• Igor Grigoriev, Alan Kuo; DOE Joint Genome Institute
• Christina Cuomo, Antonis Rokas; Broad Institute of MIT and Harvard
• Tim James; Uppsala University
• http://fungal.genome.duke.edu - genome browser and annotations
• http://fungalgenomes.org
• Blog & Wiki for Genome data
• Coming soon: Genome Browser and comparative resources