threading and homology modelling methods

THREADING AND
HOMOLOGY MODELING
METHODS
Prepared by
Muhammed Muzammil
1st year M.Pharm
Department of Pharmacology
4/6/2019 SRINIVAS COLLEGE OF PHARMACY

INTRODUCTION
• Proteins play essential roles in most biological processes. While some
proteins are involved in chemical reactions as enzymes, others like
hemoglobin and myoglobin are involved in the transport and storage
processes.
• Also, some proteins are involved in control of the growth and
differentiation of cells. Composed of twenty types of amino acids, proteins
fold into unique three-dimensional structures that are closely related to their
biological functions.
• Malfunctions of proteins are often the cause of fatal diseases, so
understanding the structures of proteins and their related functions in
various biological mechanisms is an important subject of study.

• Because of the close relationship between the structure and the function of
a protein, determining the three-dimensional native-state structure of a
protein is very important. X-ray crystallography and NMR spectroscopy
have served as the major experimental tools for protein structure
determination. Nonetheless, these experimental tools have limitations in
determining the structures of some proteins and are very time-consuming
and expensive.
• For example, some proteins are very difficult to crystallize, which hampers
structure determination by X-ray crystallography. NMR spectroscopy
also has limitations; for example, it is currently applicable only to
proteins with fewer than about 300 residues.
• Another example is the structure determination of membrane proteins.
Membrane proteins are located in the lipid bilayer and are important in
transport across the membrane and in many other processes.

• Membrane proteins have a very different environment from that of
soluble proteins. While soluble cellular proteins have a polar, aqueous
environment, membrane proteins reside in the lipid bilayer, which is
hydrophobic. Thus, the structure determination of membrane proteins by
conventional experimental tools is particularly challenging.
• With the advent of genome projects, the identification of protein
sequences has accelerated, but structure determination and functional
assignment have been much slower.
• The development of sequence alignment techniques such as FASTA,
BLAST, and PSI-BLAST increased the pace of gene annotation and
functional assignment by computationally measuring the similarities of the
DNA and protein sequences of various organisms.
• Because proteins with similar sequences usually share a common
structure and function, these techniques can also be used to model the
structure of a protein of unknown structure that has sequence similarity
to proteins of known structure.

• However, when the sequence similarity between proteins drops to an
insignificant level, sequence similarity alone cannot detect the
structural similarity between the proteins.
• Thus, new techniques needed to be developed that incorporate the
structural features that sequence alignment cannot detect. Intensive efforts
to predict protein structure by computational methods have produced many
useful tools.
• Protein structure prediction methods can be classified into three types
depending on the homologous structures available in the existing
structural database and the degree of structural information
incorporated: homology modeling, threading, and ab initio.
• The homology modeling method for protein structure prediction largely relies
on the sequence similarity between the target protein and the homologous
protein in the structure database (sequence similarity > 30%).
• The threading method focuses more on the structural similarity between the
target protein and the template structure in the database when there is
little sequence similarity (sequence similarity < 30%).
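The 30% rule of thumb above can be sketched as a small helper. This is a minimal sketch, assuming the two sequences are already aligned (gaps written as "-") and using percent identity as a simple stand-in for sequence similarity. The function names are illustrative and do not come from any library.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def suggest_method(identity: float, threshold: float = 30.0) -> str:
    """Apply the 30% rule of thumb from the slide (the threshold is adjustable)."""
    return "homology modeling" if identity > threshold else "threading"


if __name__ == "__main__":
    identity = percent_identity("ACDE", "ACDF")
    print(identity, suggest_method(identity))
```

In practice the boundary is not sharp, so the threshold is left as a parameter rather than fixed.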

• Profile-based threading methods successfully included structural
information in the protein structure prediction process by incorporating the
structural environmental classes of the amino acids in the template
structure.
• The advantage of these methods is that, by converting the structural
information of the amino acids into a one-dimensional string, fast and
efficient dynamic programming can easily be introduced, which
tremendously increases the speed of the alignment.
• Threading methods that directly include the contact information among
residues can better incorporate the structural information, but their
alignment is much slower than that of methods using the dynamic
programming technique.
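The idea of turning structural information into a one-dimensional string can be illustrated with a toy example. This is only a sketch under strong assumptions: it uses two made-up environment classes (buried "B" and exposed "E"), a hypothetical burial threshold, and a +1/-1 preference score, whereas real profile methods use more classes and statistically derived scores.

```python
# Residues treated as hydrophobic in this toy example (an assumption).
HYDROPHOBIC = set("AVLIMFWC")


def environment_string(buried_fraction, threshold=0.5):
    """Encode each template position as 'B' (buried) or 'E' (exposed)."""
    return "".join("B" if f >= threshold else "E" for f in buried_fraction)


def fit_score(sequence, environment):
    """Ungapped fit of a sequence to an environment string.

    +1 when a hydrophobic residue sits in a buried position or a
    non-hydrophobic residue sits in an exposed one, otherwise -1.
    """
    if len(sequence) != len(environment):
        raise ValueError("sequence and environment string must have equal length")
    score = 0
    for residue, env in zip(sequence, environment):
        is_hydrophobic = residue in HYDROPHOBIC
        score += 1 if is_hydrophobic == (env == "B") else -1
    return score


if __name__ == "__main__":
    env = environment_string([0.9, 0.1, 0.8, 0.2])
    print(env, fit_score("LKVE", env), fit_score("KLEV", env))
```

Because the template is now just a string, the same dynamic programming used for sequence-sequence alignment can be applied to it, which is where the speed advantage comes from.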

HOMOLOGY MODELING
METHOD
• When a target protein of unknown structure has a structural homolog in the
structure database, the structure of the target protein can be modeled by
using the homologous structure as a template.
• For this, the target protein sequence first needs to be aligned against the
template protein sequence whose structure has already been experimentally
determined.
• Homology modeling has so far been the most successful method for protein
structure prediction, provided that the sequence similarity is above 30%.

• To obtain the optimal alignment between the target sequence and the template
sequence, the two sequences are arranged along the axes of a two-dimensional
array.
• The number in each cell is the weight for substituting one amino acid
with the other; a large number means that the substitution is likely.
• Once the alignment between the target sequence and the template sequence
is obtained by sequence alignment tools, the native structure of the target
protein needs to be modeled based on the sequence alignment.
• The basic idea of homology modeling is that the backbone structure of the
target protein is the same as that of the template protein, which is
sometimes not true.
• Even when the backbone structures of the target protein and the template
protein are nearly the same, the conformations of the side chains may be very
different.
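The two-dimensional alignment array described above can be sketched with the classic global alignment recurrence. This is a minimal sketch, not the method of any particular tool: the match/mismatch weights and the linear gap penalty are placeholder assumptions standing in for a real substitution matrix.

```python
def simple_score(x, y):
    """Placeholder substitution weight (an assumption, not a real matrix)."""
    return 2 if x == y else -1


def global_alignment(target, template, score=simple_score, gap=-2):
    """Fill the 2D array and trace back one optimal global alignment."""
    n, m = len(target), len(template)
    # dp[i][j] = best score for aligning target[:i] with template[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(target[i - 1], template[j - 1]),
                dp[i - 1][j] + gap,   # gap in the template
                dp[i][j - 1] + gap,   # gap in the target
            )
    # Trace back from the bottom-right cell to recover the alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(target[i - 1], template[j - 1]):
            a.append(target[i - 1]); b.append(template[j - 1])
            i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            a.append(target[i - 1]); b.append("-")
            i -= 1
        else:
            a.append("-"); b.append(template[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(a)), "".join(reversed(b))


if __name__ == "__main__":
    print(global_alignment("ACDE", "ACE"))
```

The traceback returns only one optimal alignment; when several paths tie, the diagonal move is preferred.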

STEPS IN MODEL PRODUCTION
• The homology modeling procedure can be broken down into seven
sequential steps:
1. Template recognition and initial alignment
2. Alignment correction
3. Backbone generation
4. Loop modeling
5. Side chain modeling
6. Model optimization
7. Model validation

Template recognition and initial alignment
• Compare the sequence of the unknown protein with all the sequences of
known structures stored in the Protein Data Bank (PDB).
• BLAST this sequence against the PDB sequences to obtain a list of known
protein structures that match the sequence.
• BLAST uses a residue exchange scoring matrix. Residues that are easily
exchanged get a better score than residues that have different properties.
• Functionally important conserved residues get the best score.
• BLAST provides a list of possible templates for the unknown structure. To
make the best initial alignment, BLAST uses an alignment matrix based on
the residue exchange matrix and adds extra penalties for opening and
extending a gap between residues.
• In practice, the target sequence is sent to a BLAST server, which searches
the PDB to obtain a list of possible templates and their alignments.
• The best hit has to be chosen, and it is not necessarily the first one.
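The combination of a residue exchange score with separate penalties for opening and extending a gap can be sketched as follows. This is an illustrative sketch only: the penalty values, the convention that a gap of length k costs `gap_open + k * gap_extend`, and the toy substitution function are all assumptions, not the settings of any BLAST server.

```python
def toy_substitution(a, b):
    """Stand-in for a residue exchange matrix (hypothetical weights)."""
    return 2 if a == b else -1


def score_alignment(aligned_a, aligned_b, substitution=toy_substitution,
                    gap_open=-5, gap_extend=-1):
    """Score a gapped alignment with affine gap penalties."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    open_side = None  # which sequence currently carries a gap: "a", "b" or None
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            # Opening a gap costs more than extending the current one.
            total += gap_extend if side == open_side else gap_open + gap_extend
            open_side = side
        else:
            total += substitution(a, b)
            open_side = None
    return total


if __name__ == "__main__":
    print(score_alignment("ACDDE", "AC--E"))  # one gap of length 2
    print(score_alignment("ADCDE", "A-C-E"))  # two gaps of length 1
```

With these placeholder values, one long gap is penalized less than two short ones, which is the behavior the separate opening penalty is meant to produce.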

Alignment correction
• Fine-tune and adjust the BLAST alignments.
• Example: an Ala → Glu substitution is possible but unlikely in a
hydrophobic core, so these residues should not be aligned there.
• Examine the template structure to check which residues are in the core
and hence less likely to change than residues on the surface.
• Insertions and deletions can be made in those parts of the sequence that
are highly variable.
• Shift gaps so that insertions and deletions are aligned properly.
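The Ala → Glu check above can be automated in a rough way. This is a hedged sketch: the residue groupings and the `buried` flags (one boolean per template residue, True for core positions) are assumptions that would normally come from analysing the template structure.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumed hydrophobic residue set
CHARGED = set("DEKR")          # assumed charged residue set


def flag_core_substitutions(aligned_target, aligned_template, buried):
    """List alignment columns where a buried hydrophobic template residue
    is paired with a charged target residue.

    Returns tuples of (column index, template residue, target residue).
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    flags = []
    pos = 0  # index into the ungapped template sequence
    for col, (t, s) in enumerate(zip(aligned_target, aligned_template)):
        if s == "-":
            continue  # insertion in the target: no template residue here
        if t != "-" and buried[pos] and s in HYDROPHOBIC and t in CHARGED:
            flags.append((col, s, t))
        pos += 1
    return flags


if __name__ == "__main__":
    print(flag_core_substitutions("MEKL", "MAKL", [False, True, False, True]))
```

Flagged columns are candidates for manual realignment, not proof that the alignment is wrong.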

Backbone generation and loop modeling
• The coordinates of the template backbone are copied from the PDB file to
the target structure.
• When the residues are identical, the side chain coordinates are also copied.
• Note that a PDB file may contain small offsets or errors, so try to use
multiple similar templates.
• When the target sequence contains a gap, one option is to delete the
corresponding residues in the template, but this creates a break in the
template backbone.
• When the template sequence contains a gap, no backbone coordinates are
known for these residues in the model. The template-derived backbone has to
be cut to insert the new residues.
• Such major changes cannot be modeled within regular secondary structure
elements, so they are placed in loops and turns. Surface loops, however,
are flexible and difficult to predict.
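The copying rules above can be sketched on a simplified data model. This is a sketch under stated assumptions: each template residue is represented as a plain dictionary from atom name to (x, y, z) coordinates, which is a hypothetical structure chosen for illustration and not a real PDB parser.

```python
BACKBONE = ("N", "CA", "C", "O")


def build_model(aligned_target, aligned_template, template_residues):
    """Copy coordinates from the template according to the alignment.

    Returns a list of (target residue, atoms) pairs, where atoms is None
    for inserted residues that have no template coordinates.
    """
    model = []
    pos = 0  # index into template_residues
    for t, s in zip(aligned_target, aligned_template):
        if s == "-":
            if t != "-":
                model.append((t, None))  # insertion: left for loop modeling
            continue
        atoms = template_residues[pos]
        pos += 1
        if t == "-":
            continue  # deletion: the template residue is dropped
        if t == s:
            copied = dict(atoms)  # identical residue: copy side chain too
        else:
            copied = {name: xyz for name, xyz in atoms.items() if name in BACKBONE}
        model.append((t, copied))
    return model


if __name__ == "__main__":
    residue = {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0), "C": (2.0, 1.4, 0.0),
               "O": (1.3, 2.4, 0.0), "CB": (2.0, -1.2, 0.8)}
    for name, atoms in build_model("AGR", "A-K", [residue, dict(residue)]):
        print(name, None if atoms is None else sorted(atoms))
```

Residues returned with `None` mark the places where the backbone has to be cut and rebuilt in the loop modeling step.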

Side chain modeling
• Note that the conserved residues were already copied; now we just need to
place the remaining side chains.
• Copy the side chain torsion angles about the Cα–Cβ bond to the target.
• Rotamers tend to be conserved in homologous proteins and can be predicted,
because a given backbone conformation strongly prefers specific rotamers.
• Moreover, libraries built from flanking or neighboring residues can also
help to estimate the side chain positioning.
• For example, the backbone of a tyrosine may strongly prefer two rotamers,
and the real side chain is likely to fit one of them.
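The rotamer rules above can be sketched as a simple lookup. This is an illustrative sketch only: the library below holds made-up chi1 angles and probabilities for demonstration, and real rotamer libraries depend on the backbone conformation.

```python
# Hypothetical chi1 rotamer library: residue -> list of (chi1 in degrees, probability).
# The numbers are invented for illustration and are not measured values.
ROTAMER_LIBRARY = {
    "Y": [(-65.0, 0.55), (180.0, 0.35), (60.0, 0.10)],
    "S": [(60.0, 0.45), (-65.0, 0.30), (180.0, 0.25)],
}


def choose_chi1(target_res, template_res, template_chi1, library=ROTAMER_LIBRARY):
    """Pick a chi1 angle for one target residue.

    Conserved residues keep the template angle; otherwise the most
    probable rotamer in the library is used. Returns None when the
    residue is not in the library (for example glycine).
    """
    if target_res == template_res and template_chi1 is not None:
        return template_chi1
    rotamers = library.get(target_res)
    if not rotamers:
        return None
    return max(rotamers, key=lambda r: r[1])[0]


if __name__ == "__main__":
    print(choose_chi1("Y", "Y", -70.0))  # conserved: template angle is kept
    print(choose_chi1("Y", "F", -70.0))  # substituted: most probable rotamer
```

A fuller treatment would also test each candidate rotamer for clashes with neighboring residues before accepting it.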
Model optimization
• Why is further optimization needed?
• Because the newly placed side chains can affect the backbone, which in
turn affects the predicted structure.
Model validation
• The model should be checked again for normal ranges of bumps, bond angles,
torsion angles, and bond lengths.
• Other properties, such as the distribution of polar and apolar residues,
can be compared with real structures.
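One of the simplest geometric checks can be sketched using only Cα coordinates. This is a minimal sketch: the distance of about 3.8 Å between consecutive Cα atoms holds for the usual trans peptide bond, while the tolerance and the bump cutoff used here are assumptions chosen for illustration.

```python
import math


def check_ca_geometry(ca_coords, ideal=3.8, tolerance=0.2, bump_cutoff=3.0):
    """Flag unusual consecutive CA-CA distances and close non-neighbor contacts.

    ca_coords is a list of (x, y, z) tuples in angstroms, one per residue.
    Returns (bad_bonds, bumps) as lists of index tuples.
    """
    bad_bonds = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(d - ideal) > tolerance:
            bad_bonds.append((i, i + 1, round(d, 2)))
    bumps = []
    for i in range(len(ca_coords)):
        # Skip direct neighbors, which are expected to be close.
        for j in range(i + 2, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) < bump_cutoff:
                bumps.append((i, j))
    return bad_bonds, bumps


if __name__ == "__main__":
    good = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    bad = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(check_ca_geometry(good))
    print(check_ca_geometry(bad))
```

Dedicated validation programs check many more properties, including full bond angles and backbone torsion angles.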

Limitations
• The model is limited to the structure of the template.
• Conformational changes cannot be studied.

Threading model
• In homology modeling, the sequence of the target protein is searched
against the protein database to find a matching sequence, and the target
structure is then modeled on the structure of the matching template protein.
• What if no matching sequence is found in the protein database?
• The second method, used when this problem arises, is the threading method,
also called the fold recognition method.
• In this method we recognize motifs, that is, combinations of secondary
structure elements of the protein, and search a database of such motifs.

• A protein fold is defined by the way the secondary structure elements of the
protein structure are arranged relative to each other in space.
• The secondary structure elements include alpha helices, beta-pleated
sheets, turns, coils, etc.
• Surprisingly, only about 5000 stable protein folds are thought to exist in
nature.
• A database of folds therefore helps in protein structure recognition.
• Fold recognition means finding the best fit of a sequence to a set of
candidate folds.
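Finding the best fit among candidate folds can be illustrated with a deliberately crude toy. This sketch assumes the target's secondary structure has already been predicted as a string of H (helix), E (strand), and C (coil), and it compares only the order of elements. The fold library is hypothetical, and real threading programs score the sequence against three-dimensional templates instead.

```python
from itertools import groupby


def elements(ss):
    """Collapse a per-residue string such as 'HHHCCEEE' to 'HE' (coil ignored)."""
    return "".join(key for key, _ in groupby(ss) if key != "C")


def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def recognize_fold(predicted_ss, library):
    """Return the best-fitting fold name and all scores (0 to 1)."""
    query = elements(predicted_ss)
    scores = {}
    for name, template_ss in library.items():
        template = elements(template_ss)
        longest = max(len(query), len(template))
        scores[name] = lcs_length(query, template) / longest if longest else 0.0
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical fold library (names and strings are illustrative only).
    library = {
        "all-alpha": "HHHHHHHCCHHHHHHHCCHHHHHHH",
        "all-beta": "EEEEECCEEEEECCEEEEECCEEEE",
        "alpha/beta": "EEEECCHHHHHHCCEEEECCHHHHH",
    }
    print(recognize_fold("HHHHHHCCCHHHHHHCCHHHHHH", library))
```

The score here ignores element lengths and spatial arrangement, so it only conveys the idea of ranking candidate folds by fit.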

Application of NMR and X-ray in proteomics

NMR APPLICATION
• About 17% of the structures deposited in the Protein Data Bank were solved
by NMR spectroscopy, and most of these do not have corresponding crystal
structures.
• NMR is the basis for a wide range of experiments to determine
structure-function relationships.
• It can be used to investigate the dynamics of proteins.
• It can distinguish multiple conformations.
• It can compare the apo and holo forms of proteins and map the binding
sites of their cofactors.
• Weakly binding ligands can be detected by NMR spectroscopy.

