Abstract
Given the large number of docking programs and scoring functions available, researchers starting a structure-based drug discovery project face the problem of selecting the most suitable one. To guide this decision, several studies comparing different docking and scoring approaches have been published. When comparing scoring function performance, it is common practice to take a predefined, computer-generated set of ligand poses (decoys) and rescore them with each of the scoring functions under comparison. But are predefined decoy sets able to unambiguously evaluate and rank different scoring functions with respect to pose prediction performance? This question arose when the pose prediction performance of our piecewise linear potential derived scoring functions (Korb et al. in J Chem Inf Model 49:84–96, 2009) was assessed on a standard decoy set (Cheng et al. in J Chem Inf Model 49:1079–1093, 2009). While these functions showed excellent pose identification performance when used to rescore the predefined decoy conformations, their performance degraded markedly when they were applied directly in docking calculations on the same test set. This implies that a discrete set of ligand poses allows only the rescoring performance to be evaluated. To compare pose prediction performance more rigorously, the search space of each scoring function has to be sampled extensively, as done in the docking calculations performed here. By analyzing the performance on subsets of the complexes grouped by different properties of the active site, we were able to identify relative strengths and weaknesses of three scoring functions (ChemPLP, GoldScore, and Astex Statistical Potential). However, the reasons for the overall poor performance of all three functions on this test set, compared to other test sets of similar size, could not be identified.
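The difference between rescoring a fixed decoy set and docking with the same scoring function can be illustrated with a minimal sketch. It is not the evaluation protocol of the study: the 2.0 Å RMSD success cutoff, the "lower score is better" convention, the `Pose` type, the function names, and all numbers are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List

RMSD_CUTOFF = 2.0  # Angstrom; assumed success threshold, not taken from the text


@dataclass
class Pose:
    score: float  # assumed convention: lower score means better
    rmsd: float   # RMSD to the crystal structure pose, in Angstrom


def top_pose_is_native_like(poses: List[Pose], cutoff: float = RMSD_CUTOFF) -> bool:
    """True if the best-scored pose lies within the RMSD cutoff."""
    best = min(poses, key=lambda p: p.score)
    return best.rmsd <= cutoff


def success_rate(pose_sets: Dict[str, List[Pose]], cutoff: float = RMSD_CUTOFF) -> float:
    """Fraction of complexes whose top-scored pose is native-like."""
    hits = sum(top_pose_is_native_like(poses, cutoff) for poses in pose_sets.values())
    return hits / len(pose_sets)


# Toy decoy sets for two hypothetical complexes (invented numbers).
decoys = {
    "cplx_a": [Pose(-80.0, 0.5), Pose(-60.0, 4.0)],
    "cplx_b": [Pose(-70.0, 1.2), Pose(-65.0, 6.0)],
}

# Docking samples the search space of the same function and may find a
# well-scored but wrong pose that the predefined decoy set never contained.
docked = {
    "cplx_a": decoys["cplx_a"] + [Pose(-95.0, 7.5)],
    "cplx_b": decoys["cplx_b"],
}

rescoring_rate = success_rate(decoys)
docking_rate = success_rate(docked)
print(f"rescoring: {rescoring_rate:.2f}, docking: {docking_rate:.2f}")
```

In this toy case the function looks perfect on the decoys, yet fails for one complex once a pose outside the decoy set is sampled, which is the kind of discrepancy the abstract describes.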







References
Kitchen DB, Decornez H, Furr JR, Bajorath J (2004) Nat Rev Drug Discov 3:935–949
Kuntz ID, Blaney JM, Oatley SJ, Langridge R, Ferrin TE (1982) J Mol Biol 161:269–288
von Korff M, Freyss J, Sander T (2009) J Chem Inf Model 49:209–231
Cross JB, Thompson DC, Rai BK, Baber JC, Fan KY, Hu Y, Humblet C (2009) J Chem Inf Model 49:1455–1474
Kellenberger E, Rodrigo J, Muller P, Rognan D (2004) Proteins 57:225–242
Krovat EM, Steindl T, Langer T (2005) Curr Comput Aided Drug Des 1:93–102
Perola E, Walters WP, Charifson PS (2004) Proteins 56:235–249
Huang N, Shoichet BK, Irwin JJ (2006) J Med Chem 49:6789–6801
Hevener KE, Zhao W, Ball DM, Babaoglu K, Qi J, White SW, Lee RE (2009) J Chem Inf Model 49:444–460
Cheng T, Li X, Liu Z, Wang R (2009) J Chem Inf Model 49:1079–1093
Warren GL, Andrews CW, Capelli AM, Clarke B, LaLonde J, Lambert MH, Lindvall M, Nevins N, Semus SF, Senger S, Tedesco G, Wall ID, Woolven JM, Peishoff CE, Head MS (2006) J Med Chem 49(20):5912–5931
Englebienne P, Moitessier N (2009) J Chem Inf Model 49:1568–1580
Corbeil CR, Moitessier N (2009) J Chem Inf Model 49:997–1009
Chikhi A, Bensegueni A (2008) J Proteomics Bioinform 1:161–165
Li X, Li Y, Cheng T, Liu Z, Wang R (2010) J Comput Chem 31:2109–2125
Korb O, Stützle T, Exner TE (2009) J Chem Inf Model 49:84–96
Korb O, Stützle T, Exner TE (2006) Lect Notes Comput Sci 4150:247–258
Korb O, Stützle T, Exner TE (2007) Swarm Intell 1:115–134
Nissink JWM, Murray CW, Hartshorn MJ, Verdonk ML, Cole JC, Taylor R (2002) Proteins 49:457–471
Jones G, Willett P, Glen RC, Leach AR, Taylor R (1997) J Mol Biol 267:727–748
Jones G, Willett P, Glen RC (1995) J Mol Biol 245:43–53
O’Boyle NM, Liebeschuetz JW, Cole JC (2009) J Chem Inf Model 49:1871–1878
Okamoto M, Masuda Y, Muroya A, Yasuno K, Takahashi O, Furuya T (2010) Chem Pharm Bull 58(12):1655–1657
Huang SY, Grinter SZ, Zou X (2010) Phys Chem Chem Phys 12(40):12899–12908
Zhong S, Zhang Y, Xiu Z (2010) Curr Opin Drug Discov Devel 13(3):326–334
Bar-Haim S, Aharon A, Ben Moshe T, Marantz Y, Senderowitz H (2009) J Chem Inf Model 49(3):623–633
Fukunishi H, Teramoto R, Takada T, Shimada J (2008) J Chem Inf Model 48(5):988–996
Teramoto R, Fukunishi H (2008) J Chem Inf Model 48(4):747–754
Teramoto R, Fukunishi H (2008) J Chem Inf Model 48(2):288–295
Renner S, Derksen S, Radestock S, Moerchen F (2008) J Chem Inf Model 48(2):319–332
Wolf A, Zimmermann M, Hofmann-Apitius M (2007) J Chem Inf Model 47(3):1036–1044
Teramoto R, Fukunishi H (2007) J Chem Inf Model 47(2):526–534
Betzi S, Suhre K, Chetrit B, Guerlesquin F, Morelli X (2006) J Chem Inf Model 46(4):1704–1712
Oda A, Tsuchida K, Takakura T, Yamaotsu N, Hirono S (2006) J Chem Inf Model 46(1):380–391
Miteva MA, Lee WH, Montes MO, Villoutreix BO (2005) J Med Chem 48(19):6012–6022
Xing L, Hodgkin E, Liu Q, Sedlock D (2004) J Comput Aided Mol Des 18(5):333–344
Clark RD, Strizhev A, Leonard JM, Blake JF, Matthew JB (2002) J Mol Graph Model 20(4):281–295
Charifson PS, Corkery JJ, Murcko MA, Walters WP (1999) J Med Chem 42(25):5100–5109
Mooij WT, Verdonk ML (2005) Proteins 61:272–287
Gehlhaar DK, Verkhivker GM, Rejto PA, Sherman CJ, Fogel DB, Fogel LJ, Freer ST (1995) Chem Biol 2:317–324
Verkhivker GM, Bouzida D, Gehlhaar DK, Rejto PA, Freer ST, Rose PW (2002) Proteins 48:539–557
Verkhivker GM, Bouzida D, Gehlhaar DK, Rejto PA, Freer ST, Rose PW (2003) Proteins 53:201–219
Verkhivker GM (2004) J Mol Graph Model 22:335–348
Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD (2003) Proteins 52:609–623
Clark M, Cramer RD III, Van Opdenbosch N (1989) J Comput Chem 10:982–1012
Hartshorn MJ, Verdonk ML, Chessari G, Brewerton SC, Mooij WTM, Mortenson PN, Murray CW (2007) J Med Chem 50:726–741
Panigrahi SK (2008) Amino Acids 34:617–633
Panigrahi SK, Desiraju GR (2007) Proteins 67:128–141
Nelder JA, Mead R (1965) Comput J 7:308–313
Pencheva T, Soumana OS, Pajeva I, Miteva MA (2010) Eur J Med Chem 45:2622–2628
Keil M, Exner TE, Brickmann J (2004) J Comput Chem 25(6):779–789
Waldherr-Teschner M, Goetze T, Heiden W, Knoblauch M, Vollhardt H, Brickmann J (1992) MOLCAD—computer aided visualization and manipulation of models in molecular science. In: Post FH, Hin AJS (eds) Advances in scientific visualization. Springer Verlag, Heidelberg, pp 58–67
Brickmann J, Goetze T, Heiden W, Moeckel G, Reiling S, Vollhardt H, Zachmann C-D (1995) Interactive Visualization of Molecular Scenarios with MOLCAD/SYBYL. In: Bowie JE (ed) Data visualisation in molecular science: tools for insight and innovation. Addison-Wesley Publishing Company Inc., Reading, Mass, pp 83–97
Brickmann J, Keil M, Exner TE, Marhöfer R (2000) J Mol Model 6:328–340
Berthold MR, Cebron N, Dill F, Gabriel TR, Kötter T, Meinl T, Ohl P, Sieb C, Thiel K, Wiswedel B (2007) KNIME: the Konstanz information miner. In: Preisach C, Burkhardt H, Schmidt-Thieme L, Decker R (eds) Studies in classification, data analysis, and knowledge organization (GfKL 2007). Springer, pp 319–326
ten Brink T, Exner TE (2009) J Chem Inf Model 49:1535–1546
ten Brink T, Exner TE (2010) J Comput Aided Mol Des 24:935–942
Thilagavathi R, Mancera RL (2010) J Chem Inf Model 50:415–421
Verdonk ML, Chessari G, Cole JC, Hartshorn MJ, Murray CW, Nissink JWM, Taylor RD, Taylor R (2005) J Med Chem 48:6504–6515
Ravitz O, Zsoldos Z, Simon A (2011) J Comput Aided Mol Des 25:1033–1051
Seifert MHJ (2009) J Comput Aided Mol Des 23:633–644
Pham TA, Jain AN (2008) J Comput Aided Mol Des 22:269–286
Acknowledgment
The authors thank Renxiao Wang for providing the diverse test set of 195 protein–ligand complexes as well as Colin Groom and John Liebeschuetz for helpful discussions. The work was supported by the Konstanz Research School Chemical Biology (KoRS-CB), the Zukunftskolleg and the Young Scholar Fund of the Universität Konstanz. O.K. acknowledges support of the Landesgraduiertenförderung Baden-Württemberg and the Postdoc-Programme of the German Academic Exchange Service (DAAD). Additionally, we thank the Common Ulm Stuttgart Server (CUSS) and the Baden-Württemberg grid (bwGRiD), which is part of the D-Grid system, for providing the computer resources making the computations possible.
Electronic supplementary material
Below is the link to the electronic supplementary material.
10822_2011_9539_MOESM1_ESM.pdf
Success rates for the 16 different scoring functions of the original study and the 4 scoring functions described in this paper can be found in the supporting information. Binding scores and rmsd values for the best-identified decoy as well as the best-ranked poses of the full docking for each individual complex are also given. Finally, plots showing rmsd values versus the total surface area of the ligand and the binding affinities are available. (PDF 1419 kb)
Cite this article
Korb, O., ten Brink, T., Victor Paul Raj, F.R.D. et al. Are predefined decoy sets of ligand poses able to quantify scoring function accuracy? J Comput Aided Mol Des 26, 185–197 (2012). https://doi.org/10.1007/s10822-011-9539-5
DOI: https://doi.org/10.1007/s10822-011-9539-5