Abstract
To develop robust machine-learning or statistical models for predicting biological activity, descriptors that capture the essence of the protein–ligand interaction are required. In the absence of structural information from X-ray or NMR experiments, deriving informative descriptors can be difficult. We have developed feature-map vectors (FMVs), a new class of descriptors based on chemical features, to address this challenge. FMVs, which are derived from the conformational models of a few actives, are low dimensional, problem specific, and highly interpretable. By using shape-based alignments and scoring with chemical features, FMVs combine information about a molecule’s shape and the pharmacophores it can match. In validation studies on five diverse targets (CDK2, 5-HT3, DHFR, thrombin, and ACE), bag classifiers built using FMVs have shown high enrichments for identifying actives. The interpretability of these descriptors has been demonstrated for CDK2 and 5-HT3, where the method automatically discovers the standard literature pharmacophore.
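The abstract reports results as enrichments but does not say how they were calculated. As a rough illustration only, the sketch below uses the common enrichment-factor definition: the hit rate among the top-scoring fraction of a ranked list divided by the hit rate in the whole set. The function name `enrichment_factor`, the `fraction` parameter, and the toy scores and labels are assumptions made for this example, not values or code from the paper.

```python
def enrichment_factor(scores, labels, fraction=0.1):
    """Enrichment of actives in the top-scoring fraction of a ranked list.

    scores   -- model scores, higher meaning "more likely active" (assumed)
    labels   -- 1 for a known active, 0 for an inactive
    fraction -- share of the ranked list to inspect (hypothetical default)
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("at least one active is required")

    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Inspect at least one compound, even for very small fractions.
    n_top = max(1, round(fraction * len(ranked)))
    top_actives = sum(label for _, label in ranked[:n_top])

    # Hit rate in the selection divided by hit rate in the full set.
    return (top_actives / n_top) / (total_actives / len(ranked))


# Toy data: ten compounds, two of which are actives.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

ef_top20 = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
ef_all = enrichment_factor(toy_scores, toy_labels, fraction=1.0)
```

An enrichment factor of 1 corresponds to random selection, so values well above 1 indicate that a classifier concentrates actives near the top of the ranking.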










Acknowledgements
The authors thank Erin Bradley (Sunesis Inc.) for providing the aligned CDK2 crystal structures and Christian Lemmen (BioSolveIT GmbH) for providing the ACE and thrombin datasets.
Cite this article
Landrum, G.A., Penzotti, J.E. & Putta, S. Feature-map vectors: a new class of informative descriptors for computational drug discovery. J Comput Aided Mol Des 20, 751–762 (2006). https://doi.org/10.1007/s10822-006-9085-8
DOI: https://doi.org/10.1007/s10822-006-9085-8