Abstract
Estimating the free energy of binding is a key problem in structure-based design. We developed the scoring function HYDE, which is based on a consistent description of HYdrogen bond and DEhydration energies in protein–ligand complexes. Because HYDE is not calibrated on experimental binding affinity data or on protein–ligand complexes, it is applicable to all types of protein targets. The comprehensible atom-based score of HYDE is visualized with an intuitive coloring scheme, which facilitates the analysis of protein–ligand complexes during lead optimization. In this paper, we revise several aspects of the former version of HYDE, which was described in detail previously. The revised HYDE version has already been validated in large-scale redocking and screening experiments performed in the course of the Docking and Scoring Symposium at the 241st ACS National Meeting. In this study, we additionally evaluate the ability of the revised HYDE version to predict binding affinities. On the PDBbind 2007 coreset, HYDE achieves a correlation coefficient of 0.62 between the experimental binding constants and the predicted binding energies, the second-best result on this dataset in a comparison with 17 other well-established scoring functions. Further, we show that the performance of HYDE in large-scale redocking and virtual screening experiments, on the Astex diverse set and the DUD dataset respectively, is comparable to that of the best methods in this field.
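As a rough illustration of the affinity-prediction metric mentioned above, the following sketch computes a Pearson correlation coefficient between two equally long lists of values. It assumes that the reported coefficient is a Pearson coefficient, which the abstract does not state explicitly. The function name `pearson_r` and the example numbers are hypothetical placeholders, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator) and
    # sums of squared deviations (variance numerators).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT results from the paper:
    # experimental affinities (e.g. pKd) and predicted energies (e.g. kJ/mol).
    experimental = [4.2, 5.1, 6.3, 7.0, 8.4]
    predicted = [-18.0, -25.5, -24.0, -33.0, -41.5]
    # Negate the energies so that stronger predicted binding maps to
    # larger values, matching the direction of the experimental scale.
    print(pearson_r(experimental, [-e for e in predicted]))
```

Whether predicted energies are negated before the comparison, or the absolute value of the coefficient is reported, is a convention that differs between benchmarks; the sign handling here is an assumption.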









Acknowledgments
The authors thank Hans Briem and Kristin Beyer of Bayer Pharma AG and Jürgen Albrecht of Bayer CropScience AG for many fruitful discussions and a successful cooperation. We also thank Holger Claussen, Marcus Gastreich and Christian Lemmen of BioSolveIT GmbH for their ongoing support during the development of HYDE, particularly for the meticulous testing and analysis of HYDE and the resulting valuable feedback. The HYDE project was funded by Bayer CropScience AG and Bayer Pharma AG.
Cite this article
Schneider, N., Lange, G., Hindle, S. et al. A consistent description of HYdrogen bond and DEhydration energies in protein–ligand complexes: methods behind the HYDE scoring function. J Comput Aided Mol Des 27, 15–29 (2013). https://doi.org/10.1007/s10822-012-9626-2
DOI: https://doi.org/10.1007/s10822-012-9626-2