Group presentation covering:
- the basics of translation
- experimental evidence that proteins from synonymous mRNA sequences differ
- a hypothesis for how synonymous codons affect the resulting protein structure
- the methodology I use to test for the conservation of codon choice within related proteins
4. Renaturation
“The original structure of some proteins can be regenerated upon removal of
the denaturing agent and restoration of conditions favouring the native state.
Proteins subject to this process, called renaturation, include serum albumin
from blood, hemoglobin (the oxygen-carrying pigment of red blood cells), and
the enzyme ribonuclease”
- Encyclopedia Britannica
All the information is contained
in the protein sequence!
Who cares about degeneracy?!
5. Question - Experimental “Oddities”
Synonymous switches have an effect:
● Can cause exons to be skipped
● Can cause a reduction in activity
● Can cause misfolding
7. Prior Work
“N-terminal regions are generally translated slower than C-terminal regions”
- Saunders & Deane (2010)
“the first 5-10 codons of protein-coding genes are often codons that are less
frequently used in the rest of the genome”
- Bentele et al. (2013)
“cell cycle-regulated genes expressed in different phases display different
codon preferences”
- Morgenstern et al. (2012)
9. Starting point - CSandS (2010)
Mapping of mRNA seq to
protein seq
● 4000+ matches
● High quality
● Human curated
● Structural Information
● Taxa Information
● Bad documentation
Saunders R, Deane CM, Nucleic Acids Res., 2010, 38(19), 6719-28.
10. Modifying the database
Added
● SCOP Families
(SCOP 1.75B)
● tRNA gene copy #
(GtRNAdb)
● SCOP family structural
alignment
(MAMMOTH-Mult)
Removed
● Enforce 40% seq id
● NMR experiments
● Minimum of 7 in
SCOP family
● Organisms without
tRNA data
● Misaligned families
SCOP families: 43
Structural Domains: 454
12. Scoring a SCOP family (1)
Protein Sequence
pdb-1 (HUMAN) V F T V E V K N Y G
pdb-2 (ECALL) V Y N V Y V R - N G
pdb-3 (HUMAN) K Y K A E W R A V G
pdb-4 (YEAST) - - - - D V P G D R
mRNA Sequence
pdb-1 (HUMAN) ACU GUU GAA GUC AAA AAC UAC GGA
pdb-2 (ECALL) AAU GUA UAU GUU CGA --- AAC GGA
pdb-3 (HUMAN) AAG GCC GAG UGG CGU GCU GUG GGC
pdb-4 (YEAST) --- --- GAU GUG CCA UGU GAC AGG
Structural alignment produced
by MAMMOTH-mult on SCOP
family domain fragments
Known mRNA sequence
mapped onto alignment
Mapping mRNA
One-to-one matching of codons to amino acids.
100% coverage by mRNA sequence.
Where the codon and the amino acid disagree, the codon takes precedence.
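The mapping step can be sketched as below. This is only an illustration under assumptions: the function name and the gap symbol are hypothetical, and the sketch does not handle codon/amino-acid disagreements. It threads an ungapped mRNA sequence onto one gapped row of the structural alignment and insists on full coverage.

```python
def map_codons_to_alignment(aligned_protein, mrna, gap="-"):
    """Thread an ungapped mRNA sequence onto a gapped protein alignment row.

    Returns one entry per alignment column: a codon, or '---' for a gap.
    """
    if len(mrna) % 3 != 0:
        raise ValueError("mRNA length must be a multiple of 3")
    # Split the mRNA into consecutive codons.
    codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]
    # Enforce one codon per residue (100% coverage).
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if n_residues != len(codons):
        raise ValueError("mRNA must cover every residue exactly once")
    remaining = iter(codons)
    return [gap * 3 if aa == gap else next(remaining) for aa in aligned_protein]


print(map_codons_to_alignment("VR-NG", "GUUCGAAACGGA"))
# ['GUU', 'CGA', '---', 'AAC', 'GGA']
```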
13. Scoring a SCOP family (2)
mRNA Sequence
pdb-1 (HUMAN) ACU GUU GAA GUC AAA AAC UAC GGA
pdb-2 (ECALL) AAU GUA UAU GUU CGA --- AAC GGA
pdb-3 (HUMAN) AAG GCC GAG UGG CGU GCU GUG GGC
pdb-4 (YEAST) --- --- GAU GUG CCA UGU GAC AGG
Translation Scores
pdb-1 (HUMAN) 0.3 0.9 0.1 0.6 0.4 0.1 0.8 0.6
pdb-2 (ECALL) 0.5 0.8 0.4 0.9 0.5 --- 0.6 0.5
pdb-3 (HUMAN) 0.6 0.6 0.1 0.6 0.9 0.2 0.1 0.1
pdb-4 (YEAST) --- --- 0.2 0.7 0.4 0.1 0.7 0.5
Organism specific
translation speed scores
given to each codon.
Profile is then smoothed.
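The slides do not say how the profile is smoothed. As one possible reading, here is a centred moving average that skips alignment gaps. The window size of 3 and the function name are assumptions made for illustration.

```python
def smooth_profile(scores, window=3):
    """Centred moving average over a per-codon score profile.

    Gaps are represented as None; they stay None and are ignored
    when averaging their neighbours.
    """
    half = window // 2
    smoothed = []
    for i, score in enumerate(scores):
        if score is None:
            smoothed.append(None)
            continue
        neighbourhood = scores[max(0, i - half): i + half + 1]
        present = [x for x in neighbourhood if x is not None]
        smoothed.append(sum(present) / len(present))
    return smoothed


# pdb-2 row from the slide, with the gap as None
profile = smooth_profile([0.5, 0.8, 0.4, 0.9, 0.5, None, 0.6, 0.5])
```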
Translation Speed Scores
Using the tRNA Adaptation Index (tAI).
This is determined by:
- tRNA gene copy number
- Simple Crick’s wobble pairing
Other scoring systems exist.
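A minimal sketch of how tAI-style weights can be computed, following the usual formulation as I understand it: each codon's raw weight sums the gene copy numbers of the tRNAs that can read it, each discounted by a wobble penalty s, and weights are then divided by the largest one. The copy numbers and penalties below are made-up illustrative values, not GtRNAdb data, and the function names are hypothetical. The sketch assumes no codon ends up with a zero weight.

```python
import math


def relative_adaptiveness(readers):
    """readers maps codon -> list of (tRNA gene copy number, wobble penalty s).

    Raw weight W = sum((1 - s) * copies); result is W / max(W).
    """
    raw = {
        codon: sum((1.0 - s) * copies for copies, s in pairs)
        for codon, pairs in readers.items()
    }
    w_max = max(raw.values())
    return {codon: w / w_max for codon, w in raw.items()}


def tai(codons, weights):
    """Gene-level tAI: geometric mean of the codon weights."""
    return math.exp(sum(math.log(weights[c]) for c in codons) / len(codons))


# Hypothetical values: one tRNA species with 4 gene copies reads GUU
# perfectly (s = 0) and GUC through a wobble pair (s = 0.5).
example_readers = {"GUU": [(4, 0.0)], "GUC": [(4, 0.5)]}
example_weights = relative_adaptiveness(example_readers)
```

In this scheme the per-codon weights play the role of the organism-specific translation speed scores; the geometric mean is only needed when a single score per gene is wanted.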
14. Scoring a SCOP family (3)
Optimality Thresholds
Determined using the organism-specific open reading
frames within the database.
Manually specified thresholds.
Issues with organisms present in low frequency.
Translation Scores
pdb-1 (HUMAN) 0.3 0.9 0.1 0.6 0.4 0.1 0.8 0.6
pdb-2 (ECALL) 0.5 0.8 0.4 0.9 0.5 --- 0.6 0.5
pdb-3 (HUMAN) 0.6 0.6 0.1 0.6 0.9 0.2 0.1 0.1
pdb-4 (YEAST) --- --- 0.2 0.7 0.4 0.1 0.7 0.5
Optimality Scores
pdb-1 (HUMAN) 0 +1 -1 0 0 -1 +1 0
pdb-2 (ECALL) 0 +1 0 +1 0 -- 0 0
pdb-3 (HUMAN) 0 0 -1 0 +1 -1 -1 -1
pdb-4 (YEAST) -- -- -1 0 0 -1 0 0
Organism specific
thresholds determine
which codons are optimal
(+1) , nonoptimal (-1), or
neither (0).
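The assignment step can be sketched as a simple two-threshold classifier. The threshold pair used here (0.25, 0.75) is a hypothetical choice that happens to reproduce the pdb-1 row on this slide; it is not the value used in the actual analysis, and the function names are illustrative.

```python
def classify_codon(score, low, high):
    """Return +1 (optimal), -1 (nonoptimal), 0 (neither) or None for a gap."""
    if score is None:
        return None
    if score >= high:
        return 1
    if score <= low:
        return -1
    return 0


def classify_row(scores, low, high):
    """Apply the organism-specific thresholds to one row of translation scores."""
    return [classify_codon(s, low, high) for s in scores]


HUMAN_THRESHOLDS = (0.25, 0.75)  # hypothetical (low, high) pair
pdb1 = [0.3, 0.9, 0.1, 0.6, 0.4, 0.1, 0.8, 0.6]
print(classify_row(pdb1, *HUMAN_THRESHOLDS))
# [0, 1, -1, 0, 0, -1, 1, 0]
```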
15. Scoring a SCOP family (4)
Conservation Scores
Simple codon-wise average of optimality scores.
Must have at least 5 codons in an aligned column.
Randomisation of optimality scores produces SCOP-family-specific thresholds (5%).
Optimality Scores
pdb-1 (HUMAN) 0 +1 -1 0 0 -1 +1 0
pdb-2 (ECALL) 0 +1 0 +1 0 -- 0 0
pdb-3 (HUMAN) 0 0 -1 0 +1 -1 -1 -1
pdb-4 (YEAST) -- -- -1 0 0 -1 0 0
Conservation Scores
SCOP family specific
thresholds determine
optimal (red) and
nonoptimal (blue)
conserved codons.
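A sketch of the conservation score and its randomised thresholds. The column average and the minimum of 5 codons come from the slide; the shuffling scheme (permuting each row's optimality scores among its non-gap positions), the number of shuffles and the function names are assumptions, since the slides do not specify them.

```python
import random


def column_means(rows, min_codons=5):
    """Average optimality score per aligned column; None if too few codons."""
    means = []
    for column in zip(*rows):
        present = [v for v in column if v is not None]
        if len(present) >= min_codons:
            means.append(sum(present) / len(present))
        else:
            means.append(None)
    return means


def null_thresholds(rows, n_shuffles=1000, alpha=0.05, min_codons=5, seed=0):
    """Lower and upper cut-offs from a shuffled null distribution."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_shuffles):
        shuffled = []
        for row in rows:
            # Shuffle scores within the row, keeping gaps in place.
            values = [v for v in row if v is not None]
            rng.shuffle(values)
            remaining = iter(values)
            shuffled.append([None if v is None else next(remaining) for v in row])
        null.extend(m for m in column_means(shuffled, min_codons) if m is not None)
    if not null:
        raise ValueError("no column has enough codons to score")
    null.sort()
    low = null[int(alpha * (len(null) - 1))]
    high = null[int((1 - alpha) * (len(null) - 1))]
    return low, high
```

Columns whose observed mean lies at or above the upper cut-off would be called conserved optimal, and those at or below the lower cut-off conserved nonoptimal. The four-row example on the slide is too small to pass the 5-codon minimum, so it only scores if `min_codons` is lowered.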
16. Scoring a fold family - Summary
Structural
Alignment
Conserved
Codons
1. Map mRNA Seq.
2. Attribute translation speed scores to each codon.
3. Assign optimal, nonoptimal or neither to each codon.
4. Determine conservation scores for each column.
18. Is there any conservation?
How many SCOP families have more conserved
residues than expected by chance?
Optimality
Assignment
Thresholds
19. Looking forward
● Remove signal from conserved residues
● Correlation to structural features
● Update the CSandS database
● Investigate the ribosome tunnel
● Subgroup analysis - renaturation, chaperone