2. INTRODUCTION:
Homology modeling, also known as comparative modeling, is a technique for constructing an
atomic-resolution model of a "target" protein of unknown structure from:
1. Its amino acid sequence and
2. An experimental 3D structure of a related homologous protein (the "template").
In other words, the three-dimensional structure of the target protein is predicted from its amino acid
sequence, based on an alignment to one or more homologous (template) proteins for which an X-ray or NMR
structure is available.
If similarity between the target sequence and the template sequence is detected, structural similarity can be
assumed.
In general, at least 30% sequence identity is required to generate a useful model.
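As an illustration of the 30% rule of thumb, the sketch below computes percentage identity from a
pairwise alignment. The function name `percent_identity`, the example sequences, and the choice to
divide by the number of aligned (non-gap) columns are assumptions made for this example; other
definitions of the denominator are also in use.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percentage of aligned (non-gap) columns that hold identical residues."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # columns with a gap are not counted (an assumption)
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Hypothetical aligned sequences, gaps written as "-".
target = "MKT-AYIAKQR"
template = "MKSLAYLAK-R"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity; usable as template: {identity >= 30.0}")
```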
6. Template recognition and initial alignment
Template recognition and selection involve searching the PDB for homologous proteins with determined
structures. The search can be performed using simple sequence alignment programs such as BLAST or FASTA,
provided that the percentage identity between the target sequence and a possible template is high enough
(in the "safe zone") to be detected with these programs.
To obtain a list of hits (the candidate modelling templates and their corresponding alignments), the
program compares the query sequence to all the sequences of known structures in the PDB, using mainly two
matrices:
1. A residue exchange matrix
2. An alignment matrix.
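How the two matrices work together can be sketched with a small dynamic-programming alignment. This is a
minimal illustration, not how BLAST or FASTA are implemented: it performs a plain global alignment, and
`toy_exchange` (+1 for a match, -1 for a mismatch) is a stand-in for a real residue exchange matrix such
as BLOSUM62 or PAM. The function names and the gap penalty of -2 are assumptions.

```python
def toy_exchange(x: str, y: str) -> int:
    """Stand-in for a residue exchange matrix (assumption: +1 match, -1 mismatch)."""
    return 1 if x == y else -1


def global_align(a: str, b: str, score=toy_exchange, gap: int = -2):
    n, m = len(a), len(b)
    # Alignment matrix: H[i][j] is the best score for aligning a[:i] with b[:j].
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                H[i - 1][j] + gap,                            # gap in b
                H[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return H[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, aligned_a, aligned_b = global_align("ACGT", "AGT")
print(best_score, aligned_a, aligned_b)
```

The residue exchange matrix supplies the score for each aligned residue pair, while the alignment matrix
accumulates the best total score for every pair of sequence prefixes.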
8. BACKBONE GENERATION
Once the alignment is judged to be correct, the backbone of the target can be created: the coordinates
of the template backbone are copied to the target. Where aligned residues are identical, the side-chain
coordinates are copied as well.
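The copying rule above can be sketched as follows. The data layout (one dictionary per residue with an
`atoms` mapping from atom name to coordinates), the function name `copy_backbone` and the example
coordinates are all hypothetical; real modeling programs use their own structure representations.

```python
BACKBONE = ("N", "CA", "C", "O")


def copy_backbone(aligned_target, aligned_template, template_residues):
    """Build target residues by copying template coordinates along an alignment."""
    model = []
    t_idx = 0  # index of the next unused template residue
    for tgt, tpl in zip(aligned_target, aligned_template):
        if tpl == "-":
            # Insertion in the target: no template coordinates (left for loop modelling).
            model.append({"name": tgt, "atoms": None})
            continue
        residue = template_residues[t_idx]
        t_idx += 1
        if tgt == "-":
            continue  # deletion in the target: skip this template residue
        if tgt == tpl:
            atoms = dict(residue["atoms"])  # identical residue: copy side chain too
        else:
            atoms = {k: v for k, v in residue["atoms"].items() if k in BACKBONE}
        model.append({"name": tgt, "atoms": atoms})
    return model


# Hypothetical three-residue template with made-up coordinates.
template_residues = [
    {"name": "A", "atoms": {"N": (0, 0, 0), "CA": (1, 0, 0), "C": (2, 0, 0),
                            "O": (2, 1, 0), "CB": (1, 1, 0)}},
    {"name": "S", "atoms": {"N": (3, 0, 0), "CA": (4, 0, 0), "C": (5, 0, 0),
                            "O": (5, 1, 0), "CB": (4, 1, 0), "OG": (4, 2, 0)}},
    {"name": "G", "atoms": {"N": (6, 0, 0), "CA": (7, 0, 0), "C": (8, 0, 0),
                            "O": (8, 1, 0)}},
]
model = copy_backbone("ATLG", "AS-G", template_residues)
```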
LOOP MODELLING
After the sequence alignment, there are often regions created by insertions and deletions that lead to
gaps in the alignment. These gaps are filled by loop modeling, which is less accurate than copying
coordinates from an aligned template. Currently, two main techniques are used to approach the problem:
The database searching method - this involves finding loops from known protein structures and
superimposing them onto the two stem regions (main chains mostly) of the target protein. Some
specialized programs like FREAD and CODA can be used.
The ab initio method - this generates many random loops and searches for one that has reasonably low
energy and φ and ψ angles in the allowable regions in the Ramachandran plot.
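A minimal sketch of the ab initio idea follows: random backbone dihedral pairs (φ, ψ) are drawn, pairs
outside the allowed regions are rejected, and the candidate loop with the lowest energy is kept. The
rectangular regions in `ALLOWED_BOXES` are rough illustrative boxes rather than a validated Ramachandran
map, `toy_loop_energy` is a hypothetical placeholder for a real energy function, and loop closure
against the stem residues is omitted entirely.

```python
import random

# Rough (phi, psi) boxes in degrees. Illustrative assumption only.
ALLOWED_BOXES = [
    ((-180.0, -45.0), (90.0, 180.0)),    # approximate beta-sheet region
    ((-160.0, -45.0), (-80.0, -10.0)),   # approximate right-handed helix region
    ((45.0, 90.0), (0.0, 90.0)),         # approximate left-handed helix region
]


def is_allowed(phi, psi):
    return any(lo_phi <= phi <= hi_phi and lo_psi <= psi <= hi_psi
               for (lo_phi, hi_phi), (lo_psi, hi_psi) in ALLOWED_BOXES)


def random_allowed_pair(rng):
    """Draw (phi, psi) uniformly and reject pairs outside the allowed boxes."""
    while True:
        phi = rng.uniform(-180.0, 180.0)
        psi = rng.uniform(-180.0, 180.0)
        if is_allowed(phi, psi):
            return (phi, psi)


def sample_loop(loop_length, energy, n_candidates=1000, seed=0):
    """Generate random loops and keep the one with the lowest energy."""
    rng = random.Random(seed)
    best, best_energy = None, float("inf")
    for _ in range(n_candidates):
        candidate = [random_allowed_pair(rng) for _ in range(loop_length)]
        e = energy(candidate)
        if e < best_energy:
            best, best_energy = candidate, e
    return best, best_energy


def toy_loop_energy(loop):
    # Hypothetical energy: penalise deviation from a helix-like (-60, -45).
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2 for phi, psi in loop)


loop, loop_energy = sample_loop(4, toy_loop_energy, n_candidates=200, seed=1)
```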
9. Side-Chain Modelling
This is important in evaluating protein-ligand interactions at active sites and protein-protein interactions at
the contact interface. A side chain can be built by searching every possible conformation for every torsion
angle of the side chain to select the one that has the lowest energy with neighbouring atoms. A rotamer
library can also be used, which has all the favourable side chain torsion angles extracted from known protein
crystal structures.
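The two strategies can be contrasted in a short sketch. Both pick the lowest-energy conformation, but the
exhaustive search scans a grid over every torsion angle while the rotamer search only scores a short
list of preferred conformations. The rotamer values in `TOY_ROTAMERS`, the 30-degree grid step and
`toy_side_chain_energy` are illustrative assumptions, not entries from a published rotamer library or a
real force field.

```python
from itertools import product


def angle_diff(a, b):
    """Smallest signed difference between two angles, in degrees."""
    return (a - b + 180.0) % 360.0 - 180.0


def exhaustive_search(n_torsions, energy, step=30):
    """Score every grid combination of torsion angles and return the best one."""
    grid = range(-180, 180, step)
    return min(product(grid, repeat=n_torsions), key=energy)


def rotamer_search(rotamers, energy):
    """Score only the conformations listed in a rotamer library."""
    return min(rotamers, key=energy)


# Hypothetical (chi1, chi2) rotamers built from staggered torsion values.
TOY_ROTAMERS = [(-60, 180), (-60, -60), (180, 180), (180, 60), (60, 180)]


def toy_side_chain_energy(chis):
    # Hypothetical energy with its minimum at chi1 = -60, chi2 = 180.
    chi1, chi2 = chis
    return angle_diff(chi1, -60) ** 2 + angle_diff(chi2, 180) ** 2


best_grid = exhaustive_search(2, toy_side_chain_energy)
best_rotamer = rotamer_search(TOY_ROTAMERS, toy_side_chain_energy)
```

With two torsion angles the grid search scores 144 conformations and the rotamer search scores 5; the
gap widens quickly for longer side chains, which is the practical argument for rotamer libraries.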
Model Optimization
Optimization starts with an energy minimization procedure applied to the entire model, which adjusts the
relative positions of the atoms so that the overall conformation of the molecule has the lowest possible
potential energy. The goal is to relieve steric collisions without altering the overall structure.
Optimization can also be done by molecular dynamics simulation, which moves the atoms toward a global
minimum by applying various simulation conditions (heating, cooling, inclusion of water molecules), thus
having a better chance of finding the true structure.
Energy = Stretching Energy + Bending Energy + Torsion Energy + Non-Bonded Interaction Energy.
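Energy minimization can be illustrated with a steepest-descent sketch on a toy system. Only the
stretching term is modeled here: three atoms on a line joined by two harmonic bonds. The force constant,
ideal bond length, step size and function names are assumptions chosen for the example, and the gradient
is estimated numerically rather than derived analytically.

```python
K_BOND = 100.0  # hypothetical force constant
R0 = 1.5        # hypothetical ideal bond length


def stretching_energy(x):
    """Harmonic stretching energy for bonds 0-1 and 1-2 of atoms on a line."""
    d1 = abs(x[1] - x[0])
    d2 = abs(x[2] - x[1])
    return K_BOND * (d1 - R0) ** 2 + K_BOND * (d2 - R0) ** 2


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


def steepest_descent(f, x0, step=0.001, n_steps=500):
    """Repeatedly move every coordinate a small step against the gradient."""
    x = list(x0)
    for _ in range(n_steps):
        g = numerical_gradient(f, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


start = [0.0, 0.9, 3.5]  # first bond compressed, second bond stretched
relaxed = steepest_descent(stretching_energy, start)
print(stretching_energy(start), stretching_energy(relaxed))
```

The small step size keeps the atoms close to their starting positions while the strained bonds relax,
which mirrors the stated goal of relieving bad geometry without altering the overall structure.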
10. Model Validation
Every homology model contains errors. The two main sources of error are:
1. The percentage sequence identity between template and target. If it is greater than 90%, the accuracy of
the model can be compared to that of crystallographically determined structures; if it is less than 30%,
large errors occur.
2. The number of errors in the template.
The final model has to be evaluated by checking the φ-ψ angles, chirality, bond lengths, close contacts
and other stereochemical properties. Modeling programs include Modeller, SWISS-MODEL, Schrödinger and
3D-JIGSAW.
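Checking backbone dihedral angles requires computing a torsion angle from four atom positions. The
sketch below does this with plain vector arithmetic; the function names are assumptions. For residue i,
φ is the dihedral of C(i-1), N(i), CA(i), C(i) and ψ is the dihedral of N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up points: a planar cis arrangement and a perpendicular one.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter_turn = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

The resulting (φ, ψ) pair for each residue can then be compared against the allowed regions of the
Ramachandran plot.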
A successful model depends on template selection, the algorithm used and the validation of the model.