Genomics of cold-adapted microorganisms
1. Some like it cold
What can microbial genomes tell us about life's extremes?
Neil Saunders
School of Molecular and Microbial Sciences, UQ
2. We live on a cold planet
80% of the biosphere is permanently < 5 °C
3. Microbial diversity in cold environments
● Estimates: 1.3 x 10^28 archaeal cells, 3.1 x 10^28 bacterial cells in the oceans
● Other cold environments: polar and alpine regions, permafrost, subsurface
● Non-terrestrial environments?
● Ecological significance: carbon cycle, methane
● Biotechnological potential: enzymes
Karl et al. (2001). Nature 409: 507-510
4. Archaea and extremophiles
● Archaea are the third domain of life, distinct from Bacteria and Eukarya
● Many (but not all) Archaea are extremophiles
Temperature extremes
psychrophile < 20 °C
mesophile 20 - 45 °C
thermophile > 45 °C
hyperthermophile > 80 °C
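The thresholds above can be written as a small classifier. This is a minimal sketch: the function name `thermal_class` is hypothetical, and it assumes the cut-offs apply to a single growth temperature value, since the slide does not say which temperature (optimal, minimum or maximum) is meant.

```python
def thermal_class(temp_c: float) -> str:
    """Assign a thermal class from a growth temperature in degrees Celsius.

    Cut-offs follow the slide: psychrophile < 20, mesophile 20-45,
    thermophile > 45, hyperthermophile > 80. Which growth temperature
    the cut-offs refer to is an assumption of this sketch.
    """
    if temp_c > 80:
        return "hyperthermophile"
    if temp_c > 45:
        return "thermophile"
    if temp_c >= 20:
        return "mesophile"
    return "psychrophile"


for temp in (15, 37, 60, 95):
    print(temp, thermal_class(temp))
```

The hyperthermophile test comes first because every hyperthermophile also satisfies the thermophile cut-off.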
5. Archaeal isolates from Antarctica
Ace Lake, Vestfold Hills, Antarctica
● Moderately saline
● Perennially 1-2 °C
● Methane-saturated
Isolate                        Topt     Tmin
● Methanogenium frigidum       15 °C    < 0 °C
● Methanococcoides burtonii    23 °C    < 0 °C
● Halorubrum lacusprofundi
6. Shotgun sequencing of archaeal genomes
Sequencing centres
● Joint Genome Institute
● Molecular Dynamics/Genome Applications
● AGRF
Statistics
M. frigidum ~ 2x coverage, 10 000 reads
M. burtonii ~ 12x coverage, 50 000 reads
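Coverage and read count are linked by a simple relation: mean coverage = reads x mean read length / genome size. The sketch below illustrates that arithmetic and adds the Lander-Waterman estimate of the fraction of the genome covered at least once. The read length and genome size are hypothetical values chosen for illustration; neither appears on the slide.

```python
import math


def mean_coverage(n_reads: int, mean_read_length: float, genome_size: float) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_length / genome_size


def expected_fraction_covered(coverage: float) -> float:
    """Lander-Waterman idealisation: reads start uniformly at random,
    so the chance that a base is missed by every read is exp(-coverage)."""
    return 1.0 - math.exp(-coverage)


# Hypothetical inputs (assumptions, not from the slide).
READ_LENGTH = 600          # bases per read
GENOME_SIZE = 2_500_000    # bases

for n_reads in (10_000, 50_000):
    depth = mean_coverage(n_reads, READ_LENGTH, GENOME_SIZE)
    print(n_reads, round(depth, 2), round(expected_fraction_covered(depth), 4))
```

At low depth a substantial fraction of the genome is expected to remain unsequenced, which is why a ~2x draft and a ~12x draft support different kinds of analysis.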
7. Computational infrastructure for genomics
"So what new skills will postdocs need to ensure that
they don't become science relics? The answer is math,
statistics, and knowledge of a scripting language for
computers."
-The Scientist, "Bioinformatics Knowledge Vital to Careers"
Volume 16 | Issue 17 | 53 | Sep. 2, 2002
www.the-scientist.com
8. Computational infrastructure for genomics
Biological objects
● Genome
Computational objects
● Gene sequence
● Protein sequence
● Protein structure
● Pathway
Analysis (limitless)
● Assembly
● Sequence analysis
● Regulatory motifs
● Structural modeling
● Phylogeny
● Comparative genomics
● Pathway reconstruction
Hardware
● Workstation?
● Cluster?
Software
● Linux
● Databases
● Web servers
● Toolkits/libraries
● Scripts/compiled
● Open source
● Many tools + “glue”
Key points
● Linux!
● Perl/BioPerl
● Free, open-source
● Never-ending...
9. “Global” genomic features of cold-adapted prokaryotes
Is there anything “obviously different” about genes and
proteins from psychrophilic prokaryotes?
● Amino acid composition and protein structure
● Novel gene products
● Structural RNA features
10. Amino acid composition of the proteome
             Archaea    Bacteria
Organisms    27         52
ORFs         62 338     165 192
Amino acid frequency (BioPerl) -> data matrix: organisms (rows) x composition (columns) -> PCA (R stats package) -> principal components
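The workflow on this slide (per-organism amino acid frequencies, an organisms x composition matrix, then PCA) can be sketched in a few lines. The original analysis used BioPerl and R; the code below is a Python stand-in, not the author's scripts. The toy proteomes and function names are hypothetical, and only the first principal component is extracted, by power iteration.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(proteins):
    """Fractional amino acid frequencies pooled over one organism's proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(aa for aa in seq.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def first_principal_component(matrix, iterations=200):
    """Return (loadings, scores) for PC1 of a rows x columns data matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centred = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]
    # Non-uniform start vector: composition rows sum to 1, so a uniform
    # vector would be orthogonal to every centred row.
    vec = [float(j + 1) for j in range(n_cols)]
    for _ in range(iterations):
        scores = [sum(r[j] * vec[j] for j in range(n_cols)) for r in centred]
        new = [sum(centred[i][j] * scores[i] for i in range(n_rows))
               for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:
            raise ValueError("no variance along the start vector")
        vec = [x / norm for x in new]
    scores = [sum(r[j] * vec[j] for j in range(n_cols)) for r in centred]
    return vec, scores


# Hypothetical toy proteomes: one list of protein sequences per organism.
proteomes = {
    "organism_A": ["MSTQSTD", "MQTSSD"],
    "organism_B": ["MLLWEEL", "MLEELW"],
    "organism_C": ["MSLQETD", "MLTESW"],
}
data_matrix = [composition(seqs) for seqs in proteomes.values()]
loadings, scores = first_principal_component(data_matrix)
for name, score in zip(proteomes, scores):
    print(name, round(score, 3))
```

With real data, each row would hold the pooled composition of all ORFs from one genome, and the scores on each component would then be compared with properties such as GC content and OGT.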
11. Statistical analysis of amino acid composition
Archaea (27 organisms)
PC1 v. GC     -0.95
PC2 v. OGT    -0.94
Amino acid composition v. PC2
Asp     0.66
His     0.53
Leu    -0.91
Gln     0.61
Ser     0.57
Thr     0.72
Trp    -0.68
12. Statistical analysis of amino acid composition
Bacteria (52 organisms)
PC1 v. GC      0.96
PC2 v. OGT    -0.81
Amino acid composition v. PC2
Asp     0.71
Glu    -0.74
His     0.56
Leu    -0.41
Met     0.55
Gln     0.55
Ser     0.67
Thr     0.74
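The values tabulated on these two slides relate a principal component to GC content, OGT or the frequency of one amino acid. The slides do not name the statistic; assuming it is the Pearson correlation coefficient, a minimal sketch looks like this. The input numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-organism values: PC2 scores and optimal growth temperatures.
pc2_scores = [0.8, 0.3, -0.1, -0.6, -0.9]
ogt_celsius = [15, 30, 37, 65, 85]
print(round(pearson(pc2_scores, ogt_celsius), 2))
```

A strongly negative coefficient here would mean that organisms with high PC2 scores tend to have low OGT, which is the pattern both tables report.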
13. Protein structure homology modeling
                                       Archaea    Bacteria
Organisms                              27         52
ORFs                                   62 338     165 192
Raw models (MODELLER)                  5 513      20 785
Models with ProCheck g-factor > -0.5   3 383      13 966
Models after DSSP                      3 207      13 035
ORFs -> BLAST v. PDB (select templates) -> PROSPECT (modeller script) -> MODELLER -> ProCheck (g-factor > -0.5) -> DSSP
For the set of models from each organism, calculate
fraction of each residue that is solvent-accessible
Analyse using LDA
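The last two steps before LDA (keeping models that pass the quality filter, then computing the fraction of each residue type that is solvent-accessible) can be sketched as follows. The g-factor cut-off of -0.5 comes from the slide. The relative accessibility cut-off of 0.25 and the record layout are assumptions, because the slide does not state how "solvent-accessible" was defined.

```python
from collections import defaultdict

G_FACTOR_CUTOFF = -0.5     # from the slide
ACCESSIBLE_CUTOFF = 0.25   # hypothetical relative accessibility threshold


def filter_models(models):
    """Keep models whose overall g-factor is better than the cut-off."""
    return [m for m in models if m["g_factor"] > G_FACTOR_CUTOFF]


def accessible_fraction(models):
    """For each residue type, the fraction of occurrences that are exposed."""
    exposed = defaultdict(int)
    total = defaultdict(int)
    for model in models:
        for residue, rel_accessibility in model["residues"]:
            total[residue] += 1
            if rel_accessibility >= ACCESSIBLE_CUTOFF:
                exposed[residue] += 1
    return {res: exposed[res] / total[res] for res in total}


# Hypothetical models for one organism: (residue, relative accessibility) pairs.
models = [
    {"g_factor": -0.2, "residues": [("D", 0.60), ("D", 0.10), ("L", 0.05)]},
    {"g_factor": -0.9, "residues": [("D", 0.90)]},  # fails the quality filter
]
kept = filter_models(models)
fractions = accessible_fraction(kept)
print(len(kept), fractions)
```

Running this per organism gives one vector of residue-wise exposed fractions per genome, which is the kind of input a discriminant analysis across thermal classes would take.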
14. Analysis of homology models
              Archaea    Bacteria
LD1 v. OGT     0.89       0.84
Ala           -0.78      -0.41
Asp           -0.63      -0.73
His                      -0.41
Ser           -0.62      -0.38
Thr           -0.85      -0.46
Trp                       0.40
Tyr                       0.39
15. Proteins: summary
● Psychrophiles, mesophiles and thermophiles can be distinguished by the amino acid composition of the proteome
Composition
In the direction thermophile -> psychrophile we see:
● increase in non-charged polar (Gln, Ser, Thr), His and Asp
● decrease in hydrophobic (Leu, Trp) and Glu
Accessible surface
● The 3 thermal classes of organism can also be distinguished by the
degree to which certain residues are solvent-accessible
● In general, Asp, Ala, Ser and Thr are more exposed in proteins from
psychrophiles versus thermophiles
Biological rationales
● Thermal denaturation: Gln (deamidation), Thr (peptide cleavage)
● Thermostability: Glu (surface salt bridges), hydrophobic core
● Low temperature function:
  increased global/local flexibility?
  surface destabilisation (hydrophobic)?
  avoid aggregation (polar non-charged)?
16. Analysis of structural RNA
Is tRNA GC content related to OGT?
● Use tRNAscan-SE to find tRNA in archaeal genomes
● Calculate mean GC content for each organism
Figure: % GC v. OGT (°C), plotted for stems and for all bases
GC content becomes significant only above ~ 60 °C
Flexibility and nucleoside modification
M. burtonii tRNA contains > 1 dihydrouridine/molecule
(Noon et al. 2003, J. Bact. 185: 5483)
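The "mean GC content for each organism" step is a short calculation once the tRNA sequences are in hand. The sketch below covers the all-bases case only; the stem-only figure would need the predicted secondary structure as well. The sequences are hypothetical fragments, not real tRNAs.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of one sequence."""
    bases = [b for b in seq.upper().replace("U", "T") if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, T or U")
    return sum(b in "GC" for b in bases) / len(bases)


def mean_gc(sequences) -> float:
    """Mean per-sequence GC fraction over all tRNAs from one organism."""
    values = [gc_fraction(s) for s in sequences]
    return sum(values) / len(values)


# Hypothetical tRNA fragments grouped by organism.
trnas = {
    "organism_cold": ["GCAUUAGCUA", "GGAUUCAAUA"],
    "organism_hot": ["GCGGCCGUAG", "GGGCCCGUGC"],
}
for name, seqs in trnas.items():
    print(name, round(mean_gc(seqs), 2))
```

The per-organism means can then be plotted against OGT to look for the trend the slide describes.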
17. Cold shock protein in M. frigidum
● First CSP identified in a psychrophilic archaeon
● Contains all conserved residues for RNA binding
● Is being functionally and structurally characterised
18. CSD-like proteins in M. burtonii
● No CSP homologue identified in M. burtonii
● csp mutants of E. coli can be complemented by proteins with a CSD-fold
● Does M. burtonii express novel CSD-like proteins?
Protein sequences -> PROSPECT (thread v. CSD folds) -> MODELLER -> structural model
Structures shown: d1sro__, M. burtonii YP_564958
19. Proteomic studies of M. burtonii
● What's expressed at 4 °C?
● What's different at 4 °C versus 23 °C?
● Protein identification is easy with a genome sequence!
● Work performed by Amber Goodchild at the BMSF, UNSW
● 2D-PAGE and LC MS/MS both employed
20. What's different between 4 °C and 23 °C?
● 237 spots analysed
● 21 spots more intense at 4 °C
● 33 spots more intense at 23 °C
● 19/21 and 24/33 identified
Upregulated 4 °C
● RNAP subunit E
● Methanogenesis
● Acetate -> amino acid biosynthesis
● CheY-like response regulator
● Peptidyl prolyl cis/trans isomerase
Upregulated 23 °C
● DnaK/HSP70
Goodchild et al. (2004b). Mol. Microbiol. 53: 309
21. Protein modifications and new amino acids
● Several spot patterns indicate PTM
● Trimethylamine methyltransferase (TMA-MT) maps to 2 ORFs
● This results from read-through of an in-frame amber UAG codon
● The amino acid incorporated at the UAG is pyrrolysine - the 22nd genetically-encoded amino acid.
Hao et al. (2002). Science 296: 1459.
22. What's expressed at 4 °C? LC MS/MS
● 528 proteins identified
● ~ 23% of the proteome
● Proteins annotated and classified by (1) biological process, (2) genome organisation
● 135 hypothetical/conserved hypothetical proteins analysed separately
Functional categories (chart labels)
DNA replication/processing
Energy production/conversion
Transposases
Carbon fixation/carbohydrate metabolism
Cell division/chromosome partitioning
Nucleotide metabolism
Defense mechanisms
Amino acid metabolism
RNA synthesis/processing
Coenzyme metabolism
Signal transduction
Unassigned
Motility
Protein synthesis/processing
Protein PTM/degradation/folding
Cell envelope
Transport
Methanogenesis
Goodchild et al. (2004a). J. Prot. Res. 3: 1164
Some key processes
● Expression of 2 transposases
● Protein folding (chaperones, chaperonins, isomerases)
● RNA and protein processing (exosome/proteasome superoperon)
● Our predicted CSD-like proteins are part of the putative exosome
Figure: putative exosome/proteasome components. Koonin et al. (2001). Genome Res. 11: 240
23. Conclusions: the biology
● Cold physiology is a complex process; no “gene for cold adaptation”
● Features of psychrophilic archaea include:
➢ Higher proportion of polar non-charged amino acids
➢ More hydrophobic, less charged solvent-accessible surface
➢ Modified structural RNAs for increased flexibility
➢ Membrane lipid unsaturation
➢ Complex transcriptional and translational regulatory networks
➢ Metabolic regulation: energy production v. biosynthesis
➢ Mechanisms to promote proper protein folding
➢ Coupled regulation of RNA/protein synthesis and turnover
24. Conclusions: the computers
Biological system -> Biological objects -> Computational objects -> Analyses -> Biological inferences
Generic approach to biological problems
25. Future directions
● M. burtonii: genome closed, released April 2006
● M. frigidum: high coverage draft planned (JCVI)
● H. lacusprofundi: scheduled for sequencing (JGI)
Other UNSW projects
● Sphingopyxis alaskensis: genome closed, due for release
● Marine and environmental microbiology
  Pseudoalteromonas tunicata (JCVI)
  Vibrio angustum (JCVI)
  Roseobacter gallaeciensis (JCVI)
  LAS-degrading consortium, 3 organisms (JGI)
26. Acknowledgements
UNSW BABS
Rick Cavicchioli
Sohail Siddiqui
Torsten Thomas
Amber Goodchild
Laura Giaquinto
Dominic Burg
Lily Ting
Davide de Francisci
Charmaine Ng
Marilyn Katrib
UNSW BMSF
Mark Raftery
Mike Guilhaus
UNSW Physics
Paul Curmi
CSIRO
Peter Franzmann
Sequencing Centres
Joint Genome Institute, CA, USA
Genomics Applications, CA, USA
Venter Institute/Moore Foundation, MD, USA
AGRF, Brisbane