Abstract
This paper tests the performance of a simple empirical scoring function on a set of candidate designs produced by a de novo design package. The scoring function calculates approximate ligand-receptor binding affinities given a putative binding geometry. To our knowledge this is the first substantial test of an empirical scoring function of this type on a set of molecular designs that were subsequently synthesised and assayed. The results show that both the methods used to construct the scoring function and its reliance on plausible, yet potentially false, binding modes can lead to significant over-prediction of binding affinity in unfavourable cases. This behaviour is anticipated on theoretical grounds and limits the reliance that can be placed on the scoring function when it is used as a screen in the choice of molecular designs. To improve the predictive ability of the scoring function and to understand the experimental results, it is important to perform subsequent Quantitative Structure-Activity Relationship (QSAR) studies. In this paper, Bayesian regression is used to improve the predictive ability of the scoring function in the light of the assay results. Bayesian regression provides a rigorous mathematical framework for incorporating prior information, in this case information from the original training set, into a regression on the assay results of the candidate molecular designs. The results indicate that Bayesian regression is a useful and practical technique when relevant prior knowledge is available, and that the constraints embodied in the prior information can be used to improve the robustness and accuracy of regression models. We believe this to be the first application of Bayesian regression to QSAR analysis in chemistry.
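The idea of combining prior information with new assay data can be illustrated with a minimal sketch. It assumes a conjugate Gaussian prior on the regression coefficients and Gaussian noise of known variance, which is a standard textbook form of Bayesian linear regression and not necessarily the formulation used in the paper. The function names, the argument names, and the toy numbers are all hypothetical and are not taken from the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without modifying the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def bayes_posterior_mean(X, y, prior_mean, prior_precision, noise_var):
    """Posterior mean of the coefficients w under a Gaussian prior (assumed form).

    Solves (P0 + X^T X / s2) w = P0 m0 + X^T y / s2, where m0 and P0 are the
    prior mean and prior precision (e.g. taken from a fit to a training set)
    and s2 is the assumed noise variance of the new measurements.
    """
    n, p = len(y), len(prior_mean)
    A = [[prior_precision[i][j]
          + sum(X[k][i] * X[k][j] for k in range(n)) / noise_var
          for j in range(p)] for i in range(p)]
    rhs = [sum(prior_precision[i][j] * prior_mean[j] for j in range(p))
           + sum(X[k][i] * y[k] for k in range(n)) / noise_var
           for i in range(p)]
    return solve(A, rhs)


# Toy illustration with made-up numbers: two descriptors, two new measurements.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 4.0]
prior_mean = [0.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]

# Prior and data weighted equally: the estimate lies halfway between them.
w = bayes_posterior_mean(X, y, prior_mean, identity, noise_var=1.0)
print(w)  # -> [1.0, 2.0]
```

In this sketch, a larger prior precision pulls the coefficients towards the prior values, while a smaller one lets the new data dominate. This is one way in which constraints carried by prior information can stabilise a regression on a small set of assay results.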
Cite this article
Murray, C.W., Auton, T.R. & Eldridge, M.D. Empirical scoring functions. II. The testing of an empirical scoring function for the prediction of ligand-receptor binding affinities and the use of Bayesian regression to improve the quality of the model. J Comput Aided Mol Des 12, 503–519 (1998). https://doi.org/10.1023/A:1008040323669
DOI: https://doi.org/10.1023/A:1008040323669