This document describes improving protein-protein docking prediction accuracy while maintaining rapid calculation speed. The researchers:
1) Proposed adding a simple hydrophobic interaction model to MEGADOCK's docking score, which previously only considered shape complementarity and electrostatics.
2) The new model averages atomic contact energy values for receptor surface atoms, allowing hydrophobic interactions to be incorporated with one convolution calculation instead of multiple as in other methods.
3) Testing on 176 protein complexes showed the new model achieved similar prediction accuracy to ZDOCK while being over 8 times faster, maintaining MEGADOCK's rapid calculation ability.
Plant propagation: Sexual and Asexual propapagation.pptx
Accurate protein-protein docking with rapid calculation
1. Improvement of the Protein-Protein Docking Prediction
by Introducing a Simple Hydrophobic Interaction Model:
an Application to Interaction Pathway Analysis
Masahito Ohue† ‡ Yuri Matsuzaki† Takashi Ishida† Yutaka Akiyama†
† Graduate School of Information Science and Engineering,
Tokyo Institute of Technology
‡ JSPS Research Fellow
The 7th IAPR International Conference on
Pattern Recognition in Bioinformatics
November 9th, 2012, Tokyo, Japan
2. Summary
• MEGADOCK is a PPI network prediction system
using all-to-all protein docking calculation
– Required a rapidly calculation for all-to-all prediction
– MEGADOCK is 8.8 times faster than ZDOCK
• We proposed new docking score model
added a hydrophobic interaction keeping rapidness
• We achieved the better level of accuracy without
increasing calculation time
2012/11/09 PRIB2012 2
4. Background
• Proteins play a key role in biological events
ex) Signal transduction
SOS
Ras Raf1
http://en.wikipedia.org/wiki/Epidermal_growth_factor_receptor
Many proteins display their biological functions
by binding to a specific partner protein at a specific site
2012/11/09 PRIB2012 4
5. Background
• PDB crystal structures are currently increasing
but interacted structures are not enough
Structural coverage of interactions Modified from Stein A, 2011.
While there is structural data for a large part of the interactors (60–80%), the portion of interactions
having an experimental structure or a template for comparative modeling is limited (less than 30%)
→ Computational protein docking is very important
2012/11/09 PRIB2012 5
6. Background
• Protein docking is useful of all-to-all protein-
protein interaction prediction
• 1000 proteins’ combinations
→500,500 (1000x1001/2)
• 500,500 PPI predictions require
43,000 CPU days when using ZDOCK*
• On 400 nodes x 12 cores CPU/node
→ still require 9 days
*Mintseris J, et al. Proteins, 2007.
Need a fast protein docking software for all-to-all
protein-protein interaction prediction
2012/11/09 PRIB2012 6
7. Background
• MEGADOCK has a rapid protein docking engine
150
Avg docking time [min]
120
All-to-all PPI Prediction
90 in 1000 proteins
60 9 days → 1 day
ZDOCK 3.0 MEGADOCK
30
0
MEGADOCK ZDOCK 2.3 ZDOCK 3.0
• However, MEGADOCK’s docking accuracy is
worse compared to ZDOCKs
• Our purpose is developing more accurate protein
docking method with rapidness
2012/11/09 PRIB2012 7
8. Protein Docking Problem
Protein monomeric Protein complex
structure pair (predicted structure)
Docking calculation
2012/11/09 PRIB2012 8
10. 3-D Grid Convolution by FFT
• 3-D Grid Convolution Katchalski-Katzir E, et al. PNAS, 1992.
Ligand Receptor Δθ is 15 degree → 3600
patterns of Ligand rotations
(2-D image)
Parallel translation
Docking Score
Convolution
Convolution using FFT
2012/11/09 PRIB2012 10
11. Previous Score Model (MEGADOCK)
• MEGADOCK’s docking score Ohue M, et al. IPSJ TOM, 2010.
Convolution
rPSC ELEC
– Calculate shape complementarity and electrostatics
with only 1 convolution
2012/11/09 PRIB2012 11
13. Comparison of Convolutions
• ZDOCK 3.0 convolution Mintseris J, et al. Proteins, 2007.
Shape Complementarity = Fit neatly +i Clash
Electrostatics = Coulomb force
Desolvation free energy 1 = atom a’s effect +i atom b’s effect
Desolvation free energy 2 = atom c’s effect +i atom d’s effect
・・・
Desolvation free energy 6 = atom k’s effect +i atom l’s effect
• MEGADOCK convolution
Docking Score = Shape Complementarity
(rPSC) +i Coulomb force
2012/11/09 PRIB2012 13
14. Accurate and Fast Docking
• For more accurate protein docking
– Considering hydrophobic interaction
– Previous methods require high calculation cost
• ZDOCK 2.3・・・2 convolutions
• ZDOCK 3.0・・・8 convolutions
• For keeping rapid calculation
– We should keep 1 time convolution
• We proposed simple hydrophobic interaction
model with keeping 1 convolution
2012/11/09 PRIB2012 14
16. Proposed Model
• Implementation of hydrophobic interaction
– Used Atomic Contact Energy score
• Atomic Contact Energy (ACE) Zhang C, et al. J Mol Biol, 1997.
– The empirical atomic pairwise potential for estimating
desolvation free energies
– Used by ZDOCK, FireDock*, etc. *Andrusier N, et al. Proteins, 2007.
• How to add the ACE to MEGADOCK?
– Add the averaged ACE value (non-pairwise) of surface
atom of Receptor
Modified rPSC Model
2012/11/09 PRIB2012 16
17. Modification of rPSC Model
• Considered hydrophobic surface of the receptor
using non-pairewise ACE
1+H 2+H 3+H 3+H 3+H 2+H 1+H
2+H -27 -27 -27 -27 -27 2+H
3+H -27 -27 -27 -27 -27 2+H 1 1
3+H -27 -27 -27 5+H 2+H 1+H 1 1 1 1 1
3+H -27 -27 -27 5+H 2+H 1+H 1 1 1 1 1
3+H -27 -27 -27 -27 -27 2+H 1 1
Ligand
2+H -27 -27 -27 -27 -27 2+H
1+H 2+H 3+H 3+H 3+H 2+H 1+H
Receptor
Sum of near atoms’ ACE (space)
2012/11/09 PRIB2012 (inside of the receptor) 17
18. Pairwise Potential
• Using table of contact energy
Receptor Ligand
eiN × Set 1 on
FFT position of
N atoms eij N Cα C …
N -0.724 -0.903 -0.722 …
Receptor Ligand
Cα -0.119 -0.842 -0.850 …
eiCα × Set 1 on
FFT position of C -0.008 -0.077 -0.704 …
Cα atoms … … … … …
Receptor Ligand
eiC × Set 1 on
FFT position of
←Need (#type of atoms/2) times convolutions
C atoms
・
・
・
18
19. Averaged (Non-Pairwise) Potential
• Using averaged contact energy
eij N Cα C … ei
Receptor Ligand
N -0.724 -0.903 -0.722 … -0.495
ei Set 1 on
×
FFT position of Cα -0.119 -0.842 -0.850 … -0.553
all atoms
C -0.008 -0.077 -0.704 … -0.464
↑ 1 time FFT … … … … … …
Pros : high speed calculation
Cons: poor accuracy
19
20. Experimental Settings
• Dataset: Protein-protein docking benchmark 4.0
– 176 complex structures Hwang H, et al. Proteins, 2010.
(include both bound/unbound form)
1CGI receptor
bound Predicted 1CGI
(re-docking) using 1CGI_r and 1CGI_l
PDB ID: 1CGI
1CGI ligand
unbound 2CGA
(real problem) Predicted 1CGI
using 2CGA and 1HPT
1HPT
2012/11/09 PRIB2012 20
21. Proposed Docking Score
• New docking score
– Calculate 3 physico-chemical effects
by only 1 convolution calculation
2012/11/09 PRIB2012 21
22. Experimental Settings
• How to evaluate accuracy? → Success Rank
Docking score Ligand-RMSD
1st
S=2132.1 35.1Å
Success Rank
2nd
S=1802.3 2.3Å of this case is 2
3rd
S=1623.5 4.1Å Near native solution
(L-RMSD < 5Å)
・・・
Native
・・・
3600th
S=300.8 24.8Å
Ligand-RMSD (root mean square deviation)
RMSD with respect to the X-ray structure, calculated for the all heavy atoms of the ligand residues
when only the receptor proteins (predicted structure and X-ray structure) are superimposed.
2012/11/09 PRIB2012 22
24. Docking Calculation Time
On TSUBAME 2.0, 1 CPU core (Intel Xeon 2.93 GHz)
400
Total time for 176 docking [hr]
300
200
good
100
Comparable
0
MEGADOCK Proposed ZDOCK 2.3 ZDOCK 3.0
x8.8 faster than ZDOCK 3.0 / x3.8 faster than ZDOCK 2.3
2012/11/09 PRIB2012 24
25. Accuracy in Bound Set
ex) The number of cases of “Success Rank ≦ 100”
good is 147 (84%(147/176) of all cases).
100%
80%
Success Rate
60%
ZDOCK 3.0
40% ZDOCK 2.3
Proposed
20%
MEGADOCK
0%
1 10 100 1000
Number of Predictions
Achieved the same level of accuracy as ZDOCK 3.0
2012/11/09 PRIB2012 25
26. Accuracy in Unbound Set
good ex) The number of cases of “Success Rank ≦ 1000”
100% is 46 (26%(46/176) of all cases).
ZDOCK 3.0
80% ZDOCK 2.3
Success Rate
Proposed
60% MEGADOCK
40%
20%
0%
1 10 100 1000
Number of Predictions
Achieved the same level of accuracy as ZDOCK 2.3
2012/11/09 PRIB2012 26
27. Prediction Example
PDB ID: 1BVN
(Unbound Success Rank:
11(MEGADOCK)→1(Proposed))
Receptor: α-amylase
Ligand: Tendamistat
Another angle
(α-amylase inhibitor)
The 1st rank predicted structure
by MEGADOCK (wrong position)
Hydrophobicity scale*
hydrophilic hydrophobic
2012/11/09 PRIB2012 *Eisenberg D, et al. J Mol Biol, 1984. 27
28. Application to Pathway Analysis
• Bacterial chemotaxis signaling pathway prediction
– Predicting “interact or not” from docking results*
* Matsuzaki Y, et al. JBCB, 2009.
• Method ZRANK
Rank of
energy score: ZR
S1 ZR1
Docking S2 ZR2 Mean of S μ
calculation S.D. of S σ
S3 ・
・
・
・
・
・
ZR3,273
S6,000
・
・
・
2012/11/09 PRIB2012 28
29. Application to Pathway Analysis
• Bacterial chemotaxis signaling pathway prediction
– Predicting “interact or not” from docking results*
* Matsuzaki Y, et al. JBCB, 2009. d A(L) B R W D Y C Z X FliG FliM FliN
Tsr ◆ ◆ ◆ ◆ ◆
A(L)
◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ Known PPI
B ◆ ◆ ◆ ◆ ◆ Predicted
R ◆ ◆
W ◆ ◆ ◆ ◆ ◆ ◆
D ◆ ◆ ◆ ◆
Y ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆
C ◆ ◆ ◆ ◆ ◆
Z ◆ ◆ ◆ ◆
X ◆ ◆ ◆ ◆ ◆ ◆
FliG ◆ ◆ ◆ ◆ ◆
FliM ◆
– Results (F-measure)
FliN ◆ ◆ ◆
MEGADOCK Proposed ZDOCK 3.0
0.43 0.45 0.49
2012/11/09 PRIB2012 29
31. Conclusion
• We proposed novel docking score including
simple hydrophobic interaction
• We achieved the better level of accuracy without
increasing calculation time
– Same level of accuracy as ZDOCK 2.3
– 8.8 times faster than ZDOCK 3.0
3.8 times faster than ZDOCK 2.3
• Future works
– Developing a hydrophobic interaction model considering
both receptor and ligand protein
– Applying to large PPI network and cross-docking of
structure ensemble
2012/11/09 PRIB2012 31
32. Acknowledgements
• Akiyama Lab. Members
• This work was supported in part by TECH chan
– a Grant-in-Aid for JSPS Fellows, 23・8750
– a Grant-in-Aid for Research and Development of The Next-Generation
Integrated Life Simulation Software
– Educational Academy of Computational Life Sciences
from the Ministry of Education, Culture, Sports, Science and
Technology of Japan (MEXT).
32