Abstract
A central problem in forming accurate regression equations in QSAR studies is the selection of appropriate descriptors for the compounds under study. We describe a novel procedure for using inductive logic programming (ILP) to discover new indicator variables (attributes) for QSAR problems, and show that these improve the accuracy of the derived regression equations. ILP techniques have previously been shown to work well on drug design problems where there is a large structural component or where clear comprehensible rules are required. However, ILP techniques have had the disadvantage of only being able to make qualitative predictions (e.g. active, inactive) and not to predict real numbers (regression). We unify ILP and linear regression techniques to give a QSAR method that has the strength of ILP at describing steric structure, with the familiarity and power of linear regression. We evaluated the utility of this new QSAR technique by examining the prediction of biological activity with and without the addition of new structural indicator variables formed by ILP. In three out of five datasets examined the addition of ILP variables produced statistically better results (P < 0.01) over the original description. The new ILP variables did not increase the overall complexity of the derived QSAR equations and added insight into possible mechanisms of action. We conclude that ILP can aid in the process of drug design.
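As a rough illustration of augmenting a regression with structural indicator variables, the sketch below fits an ordinary least-squares model with and without one extra 0/1 column. Everything in it is an assumption made for illustration. The data are synthetic, the `has_feature` column only stands in for an indicator that an ILP system might produce, and leave-one-out error is just one convenient way to compare the two fits. It does not reproduce the datasets or the evaluation protocol of the study.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def loo_press(X, y):
    """Leave-one-out sum of squared prediction errors (hypothetical comparison metric)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i            # train on every compound except i
        coef = fit_linear(X[keep], y[keep])
        pred = coef[0] + X[i] @ coef[1:]    # predict the held-out compound
        press += (y[i] - pred) ** 2
    return press


# Synthetic toy data, NOT taken from the paper.
rng = np.random.default_rng(0)
n = 40
descriptors = rng.normal(size=(n, 2))                    # stand-ins for conventional descriptors
has_feature = rng.integers(0, 2, size=n).astype(float)   # hypothetical ILP-derived 0/1 indicator
activity = (
    1.0
    + 0.8 * descriptors[:, 0]
    - 0.5 * descriptors[:, 1]
    + 1.5 * has_feature                                  # assumed structural effect
    + rng.normal(scale=0.1, size=n)
)

X_base = descriptors
X_ilp = np.column_stack([descriptors, has_feature])      # same model plus one indicator column

press_base = loo_press(X_base, activity)
press_ilp = loo_press(X_ilp, activity)
indicator_coef = fit_linear(X_ilp, activity)[-1]

print(f"LOO PRESS without indicator: {press_base:.3f}")
print(f"LOO PRESS with indicator:    {press_ilp:.3f}")
print(f"Fitted indicator coefficient: {indicator_coef:.3f}")
```

The indicator enters the equation as one more linear term, so the model keeps the form of a conventional QSAR regression. The fitted coefficient estimates how much the presence of the structural feature shifts the predicted activity.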
Cite this article
King, R.D., Srinivasan, A. The discovery of indicator variables for QSAR using inductive logic programming. J Comput Aided Mol Des 11, 571–580 (1997). https://doi.org/10.1023/A:1007967728701
DOI: https://doi.org/10.1023/A:1007967728701