Abstract
Traditional structure–activity relationship (SAR) methods require every molecule in a data set to be aligned, which severely limits the types of molecules that can be analysed. Inductive logic programming (ILP) avoids this problem by describing molecules through relations between objects. A further advantage of ILP is that background knowledge can be included as first-order logic. However, previous molecular ILP representations have not been effective in describing the electronic structure of molecules. We present a more unified and comprehensive representation based on Richard Bader’s quantum topological atoms in molecules (AIM) theory, in which critical points in the electron density are connected through a network. AIM theory provides a wealth of chemical information about individual atoms and their bond connections, enabling a more flexible and chemically relevant representation. To obtain even more relevant rules with higher coverage, we apply manual postprocessing and interpretation of the ILP rules. We have tested the usefulness of the new representation in SAR modelling by classifying compounds of low and high mutagenicity and a set of factor Xa inhibitors of high and low affinity.
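As a rough illustration of what a relational, alignment-free description might look like, the sketch below turns atoms and bond critical points into Prolog-style ground facts of the kind an ILP system could take as input. The predicate names (`atom`, `bcp`, `rho`), the identifiers, and the density value are hypothetical placeholders chosen for this example; they are not the representation or data used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    atom_id: str
    element: str


@dataclass(frozen=True)
class BondCriticalPoint:
    bcp_id: str
    atom_a: str   # first atom linked by this critical point
    atom_b: str   # second atom linked by this critical point
    rho: float    # electron density at the critical point (placeholder value)


def to_facts(mol_id, atoms, bcps):
    """Render one molecule as a list of Prolog-style ground facts.

    The predicate names are assumptions made for this sketch only.
    """
    facts = [f"atom({mol_id}, {a.atom_id}, {a.element})." for a in atoms]
    for b in bcps:
        # A bond critical point relates two atoms; no 3D alignment is needed.
        facts.append(f"bcp({mol_id}, {b.bcp_id}, {b.atom_a}, {b.atom_b}).")
        facts.append(f"rho({mol_id}, {b.bcp_id}, {b.rho:.3f}).")
    return facts


# Toy molecule with made-up identifiers and a made-up density value.
atoms = [Atom("a1", "c"), Atom("a2", "o")]
bcps = [BondCriticalPoint("b1", "a1", "a2", 0.4)]
facts = to_facts("m1", atoms, bcps)

for fact in facts:
    print(fact)
```

Because each fact only relates named objects to one another, molecules of different sizes and scaffolds can be described with the same vocabulary.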
Notes
King et al. [6] wrongly classified one of the inactive compounds as active, resulting in 13 active compounds. This explains the higher accuracy reported here.
References
Hansch C (1969) Acc Chem Res 2:232
Hansch C, Dunn WJ III (1964) J Am Chem Soc 86:1616
Hall LH, Kier LB (1991) In: Lipkowitz KB, Boyd DB (eds) Reviews in computational chemistry, vol 2. VCH Publishers, New York, pp 367–422
Cramer RD III, Patterson DE, Bunce JD (1988) J Am Chem Soc 110:5959
Nienhuys-Cheng SH, de Wolf R (1997) Foundations of inductive logic programming, volume 1228 of Lecture notes in artificial intelligence. Springer-Verlag, Berlin
King RD, Muggleton SH, Srinivasan A, Sternberg MJE (1996) Proc Natl Acad Sci USA 93:438
Srinivasan A, Page D, Camacho R, King RD (2006) Mach Learn
Srinivasan A, King RD (1999) Data Min Knowl Disc 3:37
Finn P, Muggleton S, Page D, Srinivasan A (1998) Mach Learn 30:241
Marchand-Geneste N, Watson KA, Alsberg BK, King RD (2002) J Med Chem 45:399
Enot DP, King RD (2003) Lecture Notes in Artificial Intelligence 2838:156
Nattee C, Sinthupinyo S, Numao M, Okada T (2005) In: Lecture notes in artificial intelligence, vol 3430. Springer-Verlag, Berlin, pp 92–111
Srinivasan A, King RD, Bain ME (2003) J Mach Learn Res 4:369
Bader RFW (1990) Atoms in molecules: A quantum theory. Number 22 in International series of monographs on chemistry. Clarendon Press, Oxford
Alsberg BK, Marchand-Geneste N, King RD (2000) Chemometr Intell Lab 54:75
Alsberg BK, Marchand-Geneste N, King RD (2001) Anal Chim Acta 446:3
Chaudry UA, Popelier PLA (2004) J Org Chem 69:233
Smith PJ, Popelier PLA (2004) J Comput Aid Mol Des 18:135
Chaudry UA, Popelier PLA (2003) J Phys Chem A 107:4578
O’Brien SE, Popelier PLA (2002) J Chem Soc Perkin Trans 2:478
Popelier PLA, Chaudry UA, Smith PJ (2002) J Chem Soc Perkin Trans 2:1231
O’Brien SE, Popelier PLA (2001) J Chem Inf Comput Sci 41:764
Popelier PLA (1999) J Phys Chem A 103:2883
O’Brien SE, Popelier PLA (1999) Can J Chem 77:28
King RD, Marchand-Geneste N, Alsberg BK (2001) Linköping electronic articles in Computer and Information Science 6
Muggleton S, De Raedt L (1994) J Logic Programming 20:629
Kersting K, De Raedt L (2002) Basic principles of learning bayesian logic programs
Debnath AK, Lopez de Compadre RL, Debnath G, Shusterman AJ, Hansch C (1991) J Med Chem 34:786
Fontaine F, Pastor M, Zamora I, Sanz F (2005) J Med Chem 48:2687
Pastor M, Cruciani G, McLay I, Pickett S, Clementi S (2000) J Med Chem 43:3233
Fontaine F, Pastor M, Sanz F (2004) J Med Chem 47:2805
Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Zakrzewski VG, Montgomery JA, Stratmann RE, Burant JC, Dapprich S, Millam JM, Daniels AD, Kudin KN, Strain MC, Farkas O, Tomasi J, Barone V, Cossi M, Cammi R, Mennucci B, Pomelli C, Adamo C, Clifford S, Ochterski J, Petersson GA, Ayala PY, Cui Q, Morokuma K, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Cioslowski J, Ortiz JV, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Gomperts R, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Gonzalez C, Challacombe M, Gill PMW, Johnson BG, Chen W, Wong MW, Andres JL, Head-Gordon M, Replogle ES, Pople JA (1998) Gaussian 98 (Revision A1). Gaussian Inc., Pittsburgh PA
Onchoke KK, Hadad CM, Dutta PK (2004) Polycycl Aromat Compd 24:37
MORPHY98 – A program written by P.L.A. Popelier with a contribution from R.G.A. Bone. UMIST, Manchester, England
Srinivasan A. ALEPH: A learning engine for proposing hypotheses. http://www.web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/aleph.pl
Page D, Srinivasan A (2003) J Mach Learn Res 4:415
Srinivasan A, King RD, Muggleton SH (1999) The role of background knowledge: using a problem from chemistry to examine the performance of an ILP program. Technical Report PRG-TR-08-99, Oxford University Computing Laboratory, Oxford
De Raedt L, Kersting K (2004) Lecture Notes in Artificial Intelligence 3244:19
McNemar Q (1947) Psychometrika 12:153
Acknowledgement
This work was supported by The Norwegian Research Council (grant no. 154265/V40).
About this article
Cite this article
Buttingsrud, B., Ryeng, E., King, R.D. et al. Representation of molecular structure using quantum topology with inductive logic programming in structure–activity relationships. J Comput Aided Mol Des 20, 361–373 (2006). https://doi.org/10.1007/s10822-006-9058-y
DOI: https://doi.org/10.1007/s10822-006-9058-y