Protein function depends on the structure a protein adopts in the cell, and many substantial changes in protein sequence are known to preserve 3D structure (e.g. circular permutations or the fusion of interacting monomers). We discuss algorithms for detecting such rearrangements and provide a framework for interpreting them in the context of evolution, with the goal of explaining the emergence of novel protein folds.
This poster was created for ISMB 2012.
ABSTRACT
The nature of protein fold space is hotly debated. Do the protein folds observed in nature fall into clean, discrete clusters, or is fold space more accurately modeled as a vast continuum, of which only a small sample of proteins has yet been observed? Previous efforts to answer this question have focused on geometric spaces (PCA, multidimensional scaling, locally linear embedding) or network models (conformational space networks) (Chodera, 2011). While such schemes may facilitate protein comparison and classification, the choice of a mathematical framework for fold space is arbitrary without a connection to concrete biological processes. To accurately capture the true relationships between protein folds, a model must consider the evolutionary history of those folds.
Here we present a high-level model of protein evolution, which focuses on mutations that preserve the global 3D structure of proteins. We hypothesize that the combination of subtle local changes (e.g. PTMs) and large, but structure-preserving, rearrangements (e.g. duplications) can account for both the continuity of intermediate structures within protein folds and the evolution of seemingly novel folds. Our model categorizes known biological mutation processes, such as DNA replication errors and crossover errors, and places them in a simple theoretical framework.
To test this model, we present evidence from the analysis of a recent systematic comparison of all protein domains from the Protein Data Bank (PDB). We also show that the model is consistent with existing evolutionary models for gene duplication, circularly permuted proteins, and proteins with internal symmetry. Future work will focus on explicitly determining the evolutionary events that relate distantly homologous folds.
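To make the notion of a circular permutation concrete, here is a minimal sequence-level sketch. It is a toy illustration and not the detection method used in this work: it only recognizes exact rotations of a string, whereas real circularly permuted proteins have diverged in sequence and are normally detected by structural comparison. The function name and the example sequences are hypothetical.

```python
from typing import Optional


def circular_permutation_offset(a: str, b: str) -> Optional[int]:
    """Return k such that b == a[k:] + a[:k], or None if no such k exists.

    Toy assumption: the two sequences are exact rotations of each other.
    """
    if len(a) != len(b):
        return None
    if not a:
        return 0
    # Every rotation of `a` appears as a substring of `a` joined to itself.
    k = (a + a).find(b)
    return k if k != -1 else None


# Hypothetical sequences: `permuted` moves the first five residues to the end.
original = "MKTAYIAKQR"
permuted = "IAKQRMKTAY"
offset = circular_permutation_offset(original, permuted)
```

The doubling trick shown here is the string analogue of an idea also used in structural comparison, where one of the two structures is duplicated before alignment so that a rotated match can be found as a contiguous stretch.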
ISMB 2012 Poster - The Evolution of Protein Folds
The Evolution of Protein Folds
Spencer Bliven, Department of Bioinformatics, University of California San Diego
Andreas Prlić, San Diego Supercomputer Center, University of California San Diego
Philip Bourne, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
Principles of Protein Evolution
1. Global structure is conserved in many significant rearrangements
[Figure panels: Domain Swapping, Circular Permutation, Symmetry]
Case Study: Transmembrane Proteins
[Figure, position on poster unclear: network of protein domains. Nodes are labeled with PDP and SCOP domain identifiers (e.g. PDP:2YVYAa, d1xqra1); edges are labeled with scores ranging from 0.5 to 0.99.]
Methods
1. Identification of domain and subdomain architecture
a. Use SCOP domains where available, or calculate using the Protein Domain Parser (PDP) algorithm
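Step 1a of the Methods (use SCOP domains where available, otherwise calculate them with the Protein Domain Parser) amounts to a lookup with a fallback. The sketch below shows that control flow only. The names `domains_for_chain`, `scop_index`, and `fake_pdp` are hypothetical, the index is a toy dictionary rather than a real SCOP release, and the stub merely stands in for an actual PDP run.

```python
from typing import Callable, Dict, List

# Toy curated index (assumption): chain identifier -> SCOP domain identifiers.
# Not a complete or authoritative listing for this chain.
scop_index: Dict[str, List[str]] = {
    "1XQR.A": ["d1xqra1"],
}


def fake_pdp(chain_id: str) -> List[str]:
    """Stand-in for a PDP run; returns made-up domain labels for illustration."""
    pdb_id, chain = chain_id.split(".")
    return [f"PDP:{pdb_id}{chain}a", f"PDP:{pdb_id}{chain}b"]


def domains_for_chain(
    chain_id: str,
    curated: Dict[str, List[str]],
    compute: Callable[[str], List[str]],
) -> List[str]:
    """Prefer curated domains; fall back to computed ones if none are listed."""
    known = curated.get(chain_id)
    if known:
        return list(known)
    return compute(chain_id)


curated_result = domains_for_chain("1XQR.A", scop_index, fake_pdp)
computed_result = domains_for_chain("2YVY.A", scop_index, fake_pdp)
```

Passing the fallback in as a callable keeps the selection rule separate from whichever domain-assignment program is actually used.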