6. DOCKING TOOLS
Docking Software    Docking Algorithm
• DOCK              Shape fitting
• AutoDock          Lamarckian genetic algorithm, genetic algorithm
• GOLD              Genetic algorithm
• GLIDE             Monte Carlo sampling
• LigandFit         Monte Carlo sampling
7. Types of docking
➢ Lock and key (rigid docking): the internal geometry of both the receptor and the ligand is kept fixed while docking is performed.
➢ Induced fit (flexible docking): the rotations of one of the molecules (usually the smaller one) are enumerated. For every rotation, the surface cell occupancy and the energy are calculated; the best-scoring pose is then selected.
8. Rigid Docking
• Historically the first approaches.
• Protein and ligand are held fixed.
• Search for the relative orientation of the two molecules with the lowest energy.
• Protein-protein docking
  • Both molecules are usually considered rigid
  • First apply steric constraints to limit the search space, then examine the energetics of the possible binding conformations
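The orientation search described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the problem is reduced to two dimensions, the ligand only rotates about the origin, and the "energy" is a toy score (summed squared distance between each ligand atom and a matching receptor site) rather than a real force field.

```python
import math

def rotate(point, angle_deg):
    """Rotate a 2D point about the origin by angle_deg degrees."""
    a = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))

def pose_score(ligand, sites, angle_deg):
    """Toy score: summed squared distance between each rotated ligand atom
    and its matching receptor site (lower is better)."""
    total = 0.0
    for atom, site in zip(ligand, sites):
        rx, ry = rotate(atom, angle_deg)
        total += (rx - site[0]) ** 2 + (ry - site[1]) ** 2
    return total

def rigid_scan(ligand, sites, step_deg=30):
    """Enumerate rotations of the rigid ligand; return (best_angle, best_score)."""
    angles = range(0, 360, step_deg)
    best = min(angles, key=lambda ang: pose_score(ligand, sites, ang))
    return best, pose_score(ligand, sites, best)

# Hypothetical toy data: two ligand atoms and their two matching receptor sites.
LIGAND = [(1.0, 0.0), (-2.0, 0.0)]
SITES = [(0.0, 1.0), (0.0, -2.0)]

best_angle, best_score = rigid_scan(LIGAND, SITES)
print(best_angle, round(best_score, 6))
```

Because neither molecule changes its internal geometry, the only variables are the relative rotation (and, in a full treatment, translation), which is what keeps the rigid search space small.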
9. Flexible docking
• Protein-ligand docking
• Flexible ligand, rigid receptor
• Search space is much larger
• Either reduce the flexible ligand to rigid fragments connected by one or several hinges, or search the conformational space using Monte Carlo methods or molecular dynamics
11. Kinds of Docking
• Bound docking
• Unbound docking
• Global docking
• Local docking
12. Bound docking and Unbound docking
Bound docking
• The complex structure is known. The receptor and the ligand in the complex are pulled apart and reassembled.
• The goal is to reproduce a known complex, with the starting coordinates of the individual molecules taken from the crystal structure of the complex.
Unbound docking
• Individually determined protein structures are used.
• Unbound docking is a significantly more difficult problem: the starting coordinates are taken from the unbound molecules.
13. Global docking
• The general problem includes a search for the location of the binding site and a search for the exact orientation of the ligand in that site. A program that does both performs global docking.
• Global docking is more demanding in computational time, and its results are less accurate than those of local docking.
14. Local docking
• Sometimes the location of the binding site is already known. In this case we only need to orient the ligand in the binding site, and the problem is called local docking.
15. Methodological advances
• Inverse docking: small molecules of interest are docked against a library of receptors.
• Covalent docking: used to study ligands that bind covalently to the receptor. Covalent binding provides a stronger binding affinity, which prolongs the duration of the biological effect.
16. Search Algorithm
➢ Determine all possible optimal conformations for a given complex (protein-ligand or protein-protein).
➢ Calculate the energy of the resulting complex and of each individual interaction.
Conformational search strategies include:
• Systematic method
• Random method
• Simulation method
17. Systematic Search
• It uses incremental construction and conformational search databases.
• This search algorithm explores all the degrees of freedom in a molecule.
• Ligands are often incrementally grown into the active site.
• Stepwise or incremental search can be accomplished in different ways: by docking various molecular fragments into the active site region and linking them covalently, or by dividing the docked ligand into a rigid part (core fragment) and flexible parts (side chains).
18. Systematic Search Contd…
• Once the rigid core fragments are defined, they are docked into the active site.
• Flexible regions are then added in an incremental fashion.
• Another method of systematic search is the use of a library of pre-generated conformations.
• Library conformations are typically calculated only once, so the search problem is reduced to a rigid-body docking procedure.
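A minimal sketch of why exhaustive systematic search becomes expensive, and hence why incremental construction or pre-generated libraries are used. The torsional energy term and the 30-degree grid are assumptions chosen for illustration, not parameters of any particular docking program.

```python
import itertools
import math

def torsion_energy(angle_deg):
    """Toy threefold torsional term, 1 + cos(3*theta); minima at 60, 180, 300 degrees."""
    return 1.0 + math.cos(math.radians(3 * angle_deg))

def systematic_search(n_torsions, step_deg=30):
    """Enumerate every combination of torsion angles on a regular grid.

    Returns (number_of_conformers, best_angles, best_energy).
    """
    grid = range(0, 360, step_deg)
    count = 0
    best_angles, best_energy = None, float("inf")
    for angles in itertools.product(grid, repeat=n_torsions):
        count += 1
        energy = sum(torsion_energy(a) for a in angles)
        if energy < best_energy:
            best_angles, best_energy = angles, energy
    return count, best_angles, best_energy

n_conformers, angles, energy = systematic_search(n_torsions=3)
print(n_conformers, angles, round(energy, 6))
```

With 12 grid values per torsion, the number of conformers grows as 12 to the power of the number of rotatable bonds, so three torsions already give 1728 combinations.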
19. Random search
• This method operates by making random changes to either a single ligand or a population of ligands.
• A newly obtained ligand is evaluated on the basis of a pre-defined probability function.
• The basic idea is to take into account the areas of conformational space that have already been explored.
• To determine whether a molecular conformation is accepted, the root mean square deviation (RMSD) is calculated between the current molecular coordinates and every previously recorded conformation.
• Random search uses two algorithms:
➢ Monte Carlo algorithm
➢ Genetic algorithm
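As a rough illustration of the Monte Carlo variant, the sketch below applies random changes to a single coordinate and accepts or rejects each one with the Metropolis criterion, which is one common choice of probability function and is assumed here rather than taken from the slides. The one-dimensional energy surface, step size and kT value are all hypothetical.

```python
import math
import random

def toy_energy(x):
    """Hypothetical one-dimensional energy surface with its minimum at x = 2."""
    return (x - 2.0) ** 2

def metropolis_search(energy, start, steps=2000, max_move=0.5, kT=0.6, seed=0):
    """Random search with the Metropolis acceptance rule.

    Downhill moves are always accepted; uphill moves are accepted with
    probability exp(-dE / kT). Returns the best (x, energy) visited.
    """
    rng = random.Random(seed)
    x, e = start, energy(start)
    best_x, best_e = x, e
    for _ in range(steps):
        trial = x + rng.uniform(-max_move, max_move)  # random change
        e_trial = energy(trial)
        d_e = e_trial - e
        if d_e <= 0 or rng.random() < math.exp(-d_e / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = metropolis_search(toy_energy, start=-5.0)
print(round(best_x, 2), round(best_e, 4))
```

Accepting some uphill moves is what lets the search escape local minima; a real docking run would perturb ligand position, orientation and torsions instead of a single number.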
20. Simulation Search
• It uses algorithms such as molecular dynamics (MD) and energy minimization.
• In this approach, proteins are typically held rigid, and the ligand is allowed to freely explore its conformational space.
• The generated conformations are then docked successively into the protein, and an MD simulation consisting of a simulated annealing protocol is performed.
• This is usually supplemented with short MD energy minimization steps, and the energies determined from the MD runs are used for ranking. This is a computationally expensive method, involving potentially hundreds of MD runs.
21. Scoring Function
• The evaluation and ranking of predicted ligand conformations is a crucial aspect of structure-based virtual screening.
• Scoring functions implemented in docking programs make various assumptions and simplifications in the evaluation of modeled complexes.
• They do not fully account for a number of physical phenomena that determine molecular recognition, for example entropic effects.
contd…
22. • Affinity scoring functions are applied to the energetically best pose, or the n best poses, found for each molecule; comparing the affinity scores of different molecules gives their relative rank-ordering.
• Essentially, the following classes of scoring functions are currently applied:
1. Force-field-based scoring
2. Empirical scoring functions
3. Knowledge-based scoring functions
4. Consensus scoring
5. Shape & chemical complementarity scores
23. Classes of scoring function
• Broadly speaking, scoring functions can be divided into the following classes:
• Force-field-based
  • Based on terms from molecular mechanics force fields
  • GoldScore, DOCK, AutoDock
• Empirical
  • Parameterised against experimental binding affinities
  • ChemScore, PLP, Glide SP/XP
• Knowledge-based potentials
  • Based on statistical analysis of observed pairwise distributions
  • PMF, DrugScore, ASP
25. Shape & Chemical Complementarity Scores
• Divide the accessible protein surface into zones:
  - Hydrophobic
  - Hydrogen-bond donating
  - Hydrogen-bond accepting
• Do the same for the ligand surface
• Find the ligand orientation with the best complementarity score
27. Böhm's empirical scoring function
• This scoring function is an empirical scoring function.
• Empirical = incorporates some experimental data.
• The coefficients (ΔG) in the equation were determined using multiple linear regression on experimental binding data for 45 protein-ligand complexes.
• Although the terms in the equation may differ, this general approach has been applied to the development of many different empirical scoring functions.
contd…
28. Böhm's empirical scoring function
• In general, scoring functions assume that the free energy of binding can be written as a linear sum of terms that reflect the various contributions to binding.
• Böhm's scoring function included contributions from hydrogen bonding, ionic interactions, lipophilic interactions and the loss of internal conformational freedom of the ligand.
29. Here,
• The ΔG values on the right of the equation are all constants.
• ΔG0 is a contribution to the binding energy that does not directly depend on any specific interactions with the protein.
• The hydrogen bonding and ionic terms are both dependent on the geometry of the interaction, with large deviations from ideal geometries (ideal distance R, ideal angle α) being penalized.
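The equation these notes refer to did not survive on the slides. For orientation, the form usually quoted for Böhm's function, consistent with the four contributions and the constant term described above, is sketched below. The symbols A_lipo (lipophilic contact surface) and N_ROT (number of rotatable bonds in the ligand) come from that commonly quoted form and are assumptions here, not taken from the slides.

```latex
\Delta G_{\text{bind}} = \Delta G_0
  + \Delta G_{\text{hb}} \sum_{\text{h-bonds}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\text{ionic}} \sum_{\text{ionic int.}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\text{lipo}} \, \lvert A_{\text{lipo}} \rvert
  + \Delta G_{\text{rot}} \, N_{\text{ROT}}
```

Here f(ΔR, Δα) is the penalty function for deviations from the ideal hydrogen-bond or ionic geometry, so each coefficient ΔG is a fitted constant multiplying a structure-derived quantity.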
30. Knowledge-based Scoring Function
• Knowledge-based scoring functions are designed to reproduce experimental structures rather than binding energies.
• Free energies of molecular interactions are derived from structural information on protein-ligand complexes contained in the PDB.
• Boltzmann-like statistics of interatomic contacts suggest:
$P(\sigma_p, \sigma_l) = P_{\mathrm{ref}} \exp[-\beta F(\sigma_p, \sigma_l)]$
32. Knowledge-based potentials
For example, creating the distributions of ligand carbonyl oxygens to protein hydroxyl groups:
[Figure: (invented) distribution of a particular pairwise interaction, ligand carbonyl O to protein OH. x-axis: distance (Angstrom), 0 to 5; y-axis: number of observations, 0 to 1200. Imagine the minimum at 3.0 Å.]
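A minimal sketch of how such a distance distribution can be turned into a potential by inverting the Boltzmann-like relation, F = -kT ln(P / P_ref). The histogram counts are invented, and the uniform reference state is a simplifying assumption; published knowledge-based functions use more careful reference states.

```python
import math

def potential_from_counts(counts, kT=0.593):
    """Invert P = P_ref * exp(-F / kT) bin by bin: F = -kT * ln(P / P_ref).

    P is the observed fraction in each distance bin; P_ref is taken here as a
    uniform reference (an assumption made for this sketch). kT defaults to
    about 0.593 kcal/mol, its value near 298 K. Empty bins get None.
    """
    total = sum(counts)
    p_ref = 1.0 / len(counts)
    potential = []
    for n in counts:
        if n == 0:
            potential.append(None)  # no observations: potential undefined
        else:
            potential.append(-kT * math.log((n / total) / p_ref))
    return potential

# Invented histogram: distance bin centres (Angstrom) and observation counts.
bin_centres = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
counts = [0, 200, 1100, 600, 400, 300]

potential = potential_from_counts(counts)
defined = [(f, r) for f, r in zip(potential, bin_centres) if f is not None]
min_f, min_r = min(defined)
print(min_r, round(min_f, 3))
```

The most frequently observed distance gets the lowest (most favourable) potential, which is why a peak in the observations corresponds to a minimum in the derived potential.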
33. Force Field based Scoring
• Molecular mechanics force fields usually quantify the sum of two energies: the receptor-ligand interaction energy and the internal ligand energy (such as steric strain induced by binding).
• Most force field scoring functions only consider a single protein conformation, which makes it possible to omit the calculation of internal protein energy and greatly simplifies scoring.
contd…
35. CONSENSUS SCORING
• Consensus scoring combines information from different scores to balance errors in single scores and improve the probability of identifying "true" ligands.
• An exemplary implementation of consensus scoring is X-CSCORE, which combines GOLD-like, DOCK-like, ChemScore, PMF and FlexX scoring functions.
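One simple way to combine several scores is to average each compound's rank across the scoring functions. The sketch below shows that scheme only as an illustration; it is not claimed to be how X-CSCORE combines its components, and the compound names and score values are invented.

```python
def ranks(scores):
    """Map each compound to its rank (1 = best) under one scoring function.

    Assumes lower scores are better, as for an estimated binding energy.
    """
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered, start=1)}

def consensus_rank(score_tables):
    """Average each compound's rank over all scoring functions."""
    per_function = [ranks(table) for table in score_tables.values()]
    names = per_function[0].keys()
    mean_rank = {n: sum(r[n] for r in per_function) / len(per_function) for n in names}
    return sorted(mean_rank.items(), key=lambda item: item[1])

# Invented scores for three hypothetical compounds under three scoring functions.
score_tables = {
    "force_field": {"cpd_A": -9.1, "cpd_B": -7.4, "cpd_C": -8.0},
    "empirical": {"cpd_A": -6.2, "cpd_B": -5.5, "cpd_C": -6.9},
    "knowledge_based": {"cpd_A": -120.0, "cpd_B": -95.0, "cpd_C": -101.0},
}

ranking = consensus_rank(score_tables)
print(ranking)
```

Working with ranks rather than raw scores avoids having to put scoring functions with different units and scales on a common footing.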
37. High Throughput Screening
• Popular approach to target validation.
• Process of testing a large number of diverse chemical structures to identify "hits".
Benefits of HTS:
• Allows screening of thousands of compounds on a repeatable basis.
• More effective drugs can be developed at a faster rate.
• Ability to optimize lead compound selection and to eliminate compounds that do not show measurable activity.
• Reduces time and is cost-effective.
38. VIRTUAL SCREENING
• Computational technique used in drug discovery to search libraries of small molecules in order to identify those structures which are most likely to bind to a drug target, typically a protein receptor or enzyme.
• Virtual screening uses computer-based methods to discover new ligands on the basis of biological structures.
• There are two broad categories of screening techniques:
I. Ligand-based and
II. Structure-based
40. Step 1: Preparation of input files:
➢ Ligand preparation:
• Assign charges
• Define rotatable bonds
• Rename aromatic carbons
• Merge non-polar hydrogens
• Write .pdbqt ligand file
• Ligands can be obtained from databases such as ZINC and PubChem, or can be sketched using tools such as ChemSketch.
• While selecting the ligand, Lipinski's rule of five should be applied.
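As a small illustration of applying Lipinski's rule of five during ligand selection, the sketch below checks four pre-computed descriptors against the usual thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The function names are hypothetical, and the descriptors are assumed to have been calculated elsewhere, for example by a cheminformatics toolkit.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the list of rule-of-five criteria that a compound violates."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if log_p > 5:
        violations.append("logP > 5")
    if h_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations

def passes_rule_of_five(mol_weight, log_p, h_donors, h_acceptors, max_violations=1):
    """Common reading of the rule: at most one violation is tolerated."""
    found = lipinski_violations(mol_weight, log_p, h_donors, h_acceptors)
    return len(found) <= max_violations

# Approximate descriptor values for aspirin, used only as an illustration.
print(passes_rule_of_five(mol_weight=180.16, log_p=1.2, h_donors=1, h_acceptors=4))
```

Setting `max_violations=0` gives the stricter reading in which every criterion must be met.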
41. ➢ Protein preparation:
• Add essential hydrogens
• Load charges
• Merge lone pairs
• Add solvation parameters
• Write .pdbqt protein file
• PDB structures often contain water molecules. In general, all water molecules are removed except where they are known to play an important role in coordinating to the ligand.
• PDB structures determined by X-ray crystallography usually lack hydrogen atoms, while many docking programs require the protein to have explicit hydrogens.
contd...
42. Step 2: Docking Preparation - Grid
▪ Ligand-protein interaction energies are pre-calculated and then used as a look-up table during the simulation.
▪ Grid maps are constructed based on the atoms of interest in the ligand.
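The look-up table idea can be sketched as follows: energies between a probe atom and the receptor are computed once at every grid point, and scoring a ligand atom later is just a table read. The energy function, box dimensions and single-atom receptor are invented for illustration, and the nearest-point lookup is a simplification; docking programs typically interpolate between neighbouring grid points.

```python
import math

def build_grid_map(receptor_atoms, origin, spacing, n_points, energy_fn):
    """Pre-calculate the probe-receptor energy at every grid point."""
    grid = {}
    for i in range(n_points):
        for j in range(n_points):
            for k in range(n_points):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[(i, j, k)] = sum(energy_fn(math.dist(point, atom))
                                      for atom in receptor_atoms)
    return grid

def lookup(grid, origin, spacing, n_points, position):
    """Return the stored energy at the grid point nearest to `position`."""
    index = []
    for axis in range(3):
        idx = round((position[axis] - origin[axis]) / spacing)
        index.append(min(max(idx, 0), n_points - 1))  # clamp to the box
    return grid[tuple(index)]

def toy_energy(r):
    """Hypothetical 12-6 style term with its minimum at r = 1 (arbitrary units)."""
    r = max(r, 0.5)  # cap the repulsion very close to an atom
    return (1.0 / r) ** 12 - 2.0 * (1.0 / r) ** 6

receptor = [(0.0, 0.0, 0.0)]  # invented single receptor atom
origin, spacing, n_points = (-2.0, -2.0, -2.0), 0.5, 9

grid_map = build_grid_map(receptor, origin, spacing, n_points, toy_energy)
print(lookup(grid_map, origin, spacing, n_points, (1.02, 0.01, -0.03)))
```

The expensive sum over receptor atoms happens only while building the map, so each pose evaluated during the search costs one lookup per ligand atom.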
49. Key points…
• Two RMSD values are reported, rmsd/lb (RMSD lower bound) and rmsd/ub (RMSD upper bound), which differ in how the atoms are matched in the distance calculation:
  • rmsd/ub matches each atom in one conformation with itself in the other conformation, ignoring any symmetry.
  • rmsd/lb is defined as rmsd/lb(c1, c2) = max(rmsd'(c1, c2), rmsd'(c2, c1)), where rmsd' matches each atom in one conformation with the closest atom of the same element type in the other conformation.
• Polar hydrogens are needed in the input structures to correctly type heavy atoms as hydrogen-bond donors.
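The two RMSD measures can be sketched as below. This assumes that rmsd' pairs each atom with the closest atom of the same element type in the other conformation (as described in the AutoDock Vina manual) and that only heavy atoms are compared; the three-atom fragment and its coordinates are invented.

```python
import math

def rmsd_ub(conf_a, conf_b):
    """Match atom i in conf_a with atom i in conf_b (no symmetry handling)."""
    total = sum(math.dist(a[1], b[1]) ** 2 for a, b in zip(conf_a, conf_b))
    return math.sqrt(total / len(conf_a))

def rmsd_prime(conf_a, conf_b):
    """Match each atom in conf_a with the closest same-element atom in conf_b."""
    total = 0.0
    for element, xyz in conf_a:
        total += min(math.dist(xyz, other_xyz) ** 2
                     for other_element, other_xyz in conf_b
                     if other_element == element)
    return math.sqrt(total / len(conf_a))

def rmsd_lb(conf_a, conf_b):
    """Symmetrised lower bound: the larger of the two one-way rmsd' values."""
    return max(rmsd_prime(conf_a, conf_b), rmsd_prime(conf_b, conf_a))

# Invented example: a symmetric O-C-O fragment and the same pose with the
# two oxygens swapped, stored as (element, (x, y, z)) tuples.
pose_1 = [("O", (-1.2, 0.0, 0.0)), ("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
pose_2 = [("O", (1.2, 0.0, 0.0)), ("C", (0.0, 0.0, 0.0)), ("O", (-1.2, 0.0, 0.0))]

print(round(rmsd_ub(pose_1, pose_2), 3), round(rmsd_lb(pose_1, pose_2), 3))
```

The two poses are physically identical, so rmsd/lb is zero, while rmsd/ub is large because it insists on matching each oxygen to its own label.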