Drug Design with Machine Learning

Reference work entry, Encyclopedia of Complexity and Systems Science

Definition of the Subject

The goal of drug discovery is to identify lead chemicals that show significant activity against a selected biological target. A disease state may be the result of changes in the structure and function of cell‐signaling receptors, enzymes, hormone receptors, or other functional proteins. The drug target is a protein whose activity is modulated by its interaction with a chemical compound and that may thus control a disease. The lead compounds identified in the drug discovery step are optimized in the drug development phase, which results in a small number of chemicals that are evaluated in human clinical trials. The first priority in drug development is to increase the biological activity of a lead compound while preserving its drug‐like properties. The lead compound is expanded into a chemical library that conserves the structure responsible for the biological activity (the pharmacophore) and adds chemical groups...

Abbreviations

Bayesian classifier:

Bayes' theorem of conditional probability forms the basis of several statistical classification models used in machine learning, drug design, and chemoinformatics to classify libraries of compounds into active and inactive chemicals. A (naive) Bayesian classifier treats each structural feature or descriptor as independent of the other descriptors, and the probability that a compound is active is proportional to the ratio of active to inactive compounds that share the same structural feature or the same value for that descriptor. The final probability that a compound is active is computed as the product of all descriptor‐based probabilities. Structural descriptors that are real numbers are usually binned before they are evaluated with a Bayesian classifier.
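
A minimal Python sketch of this scheme, assuming binned descriptors; the toy data, the Laplace smoothing, and the decision threshold (score > 1 predicts "active") are illustrative choices rather than part of any published model:

    from collections import defaultdict

    def train(compounds, labels):
        # counts[d][(v, cls)] = number of class-cls compounds whose descriptor d is in bin v
        counts = defaultdict(lambda: defaultdict(int))
        class_counts = defaultdict(int)
        for x, cls in zip(compounds, labels):
            class_counts[cls] += 1
            for d, v in enumerate(x):
                counts[d][(v, cls)] += 1
        return counts, class_counts

    def active_score(x, counts, class_counts):
        # product over descriptors of Laplace-smoothed active/inactive ratios
        score = class_counts["active"] / class_counts["inactive"]
        for d, v in enumerate(x):
            p_act = (counts[d][(v, "active")] + 1) / (class_counts["active"] + 2)
            p_ina = (counts[d][(v, "inactive")] + 1) / (class_counts["inactive"] + 2)
            score *= p_act / p_ina
        return score

    X = [(0, 1, 1), (0, 1, 0), (1, 0, 0), (1, 0, 1)]  # binned descriptor vectors
    y = ["active", "active", "inactive", "inactive"]
    counts, class_counts = train(X, y)
    print(active_score((0, 1, 1), counts, class_counts))  # > 1, predicted active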

Decision tree:

A decision tree is a sequence of rules applied to selected structural descriptors. The training phase comprises the selection of the structural descriptors that are evaluated, the order in which the rules are applied, and the decision taken at each leaf. Usually, each rule evaluates a descriptor (≥ or < a threshold) and splits the objects into two or more populations. Each population is then split further with a new rule, until a stopping condition is met (for example, when all objects in the population belong to the same class). The prediction phase starts from the root node and evaluates the rules along a pathway determined by the outcome (true or false) of each previous rule. When a leaf is reached, the algorithm predicts the class of the object (classification trees) or the numerical value of a property (regression trees).
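
A small hand-built tree in Python illustrating the root-to-leaf prediction phase; the two descriptors (a lipophilicity value and a hydrogen-bond donor count) and their thresholds are hypothetical:

    # internal nodes test one descriptor against a threshold; leaves hold a class
    tree = {
        "descriptor": 0, "threshold": 4.5,       # rule: descriptor 0 >= 4.5 ?
        "ge": {"leaf": "inactive"},
        "lt": {
            "descriptor": 1, "threshold": 2.0,   # rule: descriptor 1 >= 2.0 ?
            "ge": {"leaf": "active"},
            "lt": {"leaf": "inactive"},
        },
    }

    def predict(node, x):
        # follow the pathway determined by the outcome of each rule
        while "leaf" not in node:
            branch = "ge" if x[node["descriptor"]] >= node["threshold"] else "lt"
            node = node[branch]
        return node["leaf"]

    print(predict(tree, [3.1, 2.0]))  # -> active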

k‑nearest neighbors:

k‑nearest neighbors (k‑NN) is a supervised learning algorithm that predicts the property of an object based on a local interpolation model. In classification, the class of a new object is predicted based on the majority vote of its k nearest neighbors. In regression, the property value for a new object is predicted as an average value of the property values for its k nearest neighbors.
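
Both modes fit in a few lines of Python; Euclidean distance in descriptor space is assumed here, although other metrics are equally valid:

    import math

    def knn(query, train_X, train_y, k=3, classify=True):
        # rank training objects by distance to the query
        ranked = sorted(zip(train_X, train_y), key=lambda p: math.dist(query, p[0]))
        neighbors = [y for _, y in ranked[:k]]
        if classify:                                # majority vote of the k neighbors
            return max(set(neighbors), key=neighbors.count)
        return sum(neighbors) / len(neighbors)      # average value for regression

    X = [[0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9]]
    print(knn([0.15, 0.15], X, ["active", "active", "inactive", "inactive"]))
    print(knn([0.95, 0.95], X, [1.2, 1.4, 3.6, 3.8], classify=False))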

Lazy learning:

Lazy learning is a memory‐based local learning method that defers computation until a prediction is requested for an object. The first step is to place the query object in the space of the training objects and to identify the training objects located within a set neighborhood. The predicted property of the query object is obtained by interpolating the properties of the objects situated in that neighborhood.
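
One common realization, sketched in Python under the assumption of a fixed-radius neighborhood and inverse-distance weighting (both are illustrative choices):

    import math

    def lazy_predict(query, train_X, train_y, radius=1.0):
        # all computation is deferred to query time: collect the training
        # objects within `radius` of the query and interpolate their values
        weights, values = [], []
        for x, y in zip(train_X, train_y):
            d = math.dist(query, x)
            if d <= radius:
                weights.append(1.0 / (d + 1e-9))   # closer objects weigh more
                values.append(y)
        if not weights:
            raise ValueError("no training objects in the neighborhood")
        return sum(w * v for w, v in zip(weights, values)) / sum(weights)

    X = [[0.0, 0.0], [0.5, 0.5], [3.0, 3.0]]
    print(lazy_predict([0.2, 0.2], X, [1.0, 2.0, 9.0]))  # interpolates the two nearby objects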

Machine learning:

Machine learning is an important field of artificial intelligence that includes a diversity of methods and algorithms, such as decision trees, lazy learning, k‑nearest neighbors, Bayesian methods, Gaussian processes, support vector machines, and kernel algorithms, that extract rules and functions from large datasets. Machine learning algorithms extract information from experimental data by computational and statistical methods and generate a set of rules, functions, or procedures that allows them to predict the properties of novel objects that are not included in the learning set.
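
The learn-then-predict workflow shared by all these methods, sketched with scikit-learn (an assumed dependency; any algorithm named above could replace the decision tree, and the descriptor values are toy numbers):

    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X = [[0.2, 1.1], [0.4, 0.9], [0.3, 1.0], [2.1, 0.1], [2.3, 0.3], [2.2, 0.2]]
    y = ["active", "active", "active", "inactive", "inactive", "inactive"]

    # hold out part of the data to mimic novel objects outside the learning set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)
    model = DecisionTreeClassifier().fit(X_train, y_train)  # learning phase
    print(model.predict(X_test), y_test)                    # prediction phase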

Quantitative structure‐activity relationships:

Quantitative structure‐activity relationships (QSAR) are regression models that define quantitative correlations between the chemical structure of molecules and their physical properties (boiling point, melting point, aqueous solubility), chemical properties and reactivities (chromatographic retention, reaction rate), or biological activities (cell growth inhibition, enzyme inhibition, lethal dose). The fundamental hypotheses of QSAR are that similar chemicals have similar properties and that small structural changes result in small changes in property values. The general form of a QSAR equation is \( P(i)=f(\mathbf{SD}_{i}) \), where P(i) is a physical, chemical, or biological property of compound i, \( \mathbf{SD}_{i} \) is a vector of structural descriptors of i, and f is a mathematical function such as linear regression, partial least squares, artificial neural networks, or support vector machines. A QSAR model for a property P is based on a dataset of chemical compounds with known values for the property P and a matrix of structural descriptors computed for all chemicals. The learning (training) of the QSAR model is the process of determining the optimum parameters of the regression function f. After the training phase, a QSAR model may be used to predict the property P for novel compounds that are not present in the learning set of molecules.
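
A minimal linear instance of \( P(i)=f(\mathbf{SD}_{i}) \), fitted by ordinary least squares with NumPy; the descriptor matrix and the property values are invented for illustration:

    import numpy as np

    SD = np.array([[1.0, 0.5], [2.0, 0.3], [3.0, 0.8], [4.0, 0.1]])  # descriptor matrix
    P = np.array([1.9, 2.6, 4.3, 4.4])                               # measured property

    X = np.hstack([SD, np.ones((len(SD), 1))])    # append an intercept column
    coef, *_ = np.linalg.lstsq(X, P, rcond=None)  # training: fit the parameters of f

    def predict(sd):
        # apply the trained f to a compound absent from the learning set
        return float(np.append(sd, 1.0) @ coef)

    print(predict([2.5, 0.4]))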

Support vector machines:

Support vector machines (SVM) are a class of supervised machine learning methods based on structural risk minimization and the statistical learning theory of Vapnik. SVM may be applied to data classification and regression, using selected objects (support vectors) to generate the SVM model. Nonlinear classification problems are transformed into linear ones by kernel functions that map the input space into a higher‐dimensional feature space in which a hyperplane may discriminate the classes. An SVM classification model computes a maximum‐margin hyperplane that separates the classes in the feature space, i.e., the hyperplane that maximizes its distance to the closest patterns from the two classes. An SVM regression model builds a regression tube with the property that objects situated inside the tube do not contribute to the overall error of the model. The shape of the regression tube is determined by selected objects (support vectors) situated outside the tube.
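
Both SVM modes, sketched with scikit-learn (an assumed dependency); the RBF kernel, the constant C, the tube width epsilon, and the data are illustrative choices:

    from sklearn.svm import SVC, SVR

    X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [1.0, 1.0], [0.9, 1.1], [1.2, 0.9]]
    y = [0, 0, 0, 1, 1, 1]                      # inactive / active

    clf = SVC(kernel="rbf", C=1.0).fit(X, y)    # maximum-margin classifier
    print(clf.support_)                         # indices of the support vectors

    # regression: objects predicted within +/- epsilon of their target value
    # lie inside the tube and contribute nothing to the training error
    reg = SVR(kernel="rbf", epsilon=0.1).fit(X, [1.1, 1.2, 1.3, 3.4, 3.5, 3.3])
    print(reg.predict([[0.5, 0.5]]))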

Structural descriptor:

A structural descriptor (SD) is a numerical value computed from the chemical structure of a molecule that is invariant to the numbering of the atoms in the molecule. Structural descriptors may be classified as constitutional (counts of molecular fragments such as rings, functional groups, or atom pairs), topological indices (computed from the molecular graph), geometrical (volume, surface area, charged partial surface area), quantum‐chemical (atomic charges, energies of molecular orbitals), and molecular field descriptors (such as those used in CoMFA, CoMSIA, or CoRSA).
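
A self-contained example of a topological index, the Wiener index: the sum of shortest-path distances over all atom pairs of the hydrogen-suppressed molecular graph. Summing over all pairs makes the value invariant to atom numbering, and breadth-first search suffices because bonds are unweighted edges:

    from collections import deque

    def wiener_index(adjacency):
        # adjacency maps each atom to the list of atoms bonded to it
        total = 0
        for start in adjacency:
            dist = {start: 0}
            queue = deque([start])
            while queue:                     # BFS from `start`
                u = queue.popleft()
                for v in adjacency[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            total += sum(dist.values())
        return total // 2                    # each pair was counted twice

    # n-butane as the path graph C0-C1-C2-C3; its Wiener index is 10
    butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(wiener_index(butane))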

Structure‐activity relationships:

Structure‐activity relationships (SAR) are classification models that discriminate between sets of chemicals belonging to different classes of biological activity, usually active/inactive towards a certain biological receptor. The general form of a SAR equation is \( C(i) = f(\mathbf{SD}_{i}) \), where C(i) is the activity class of compound i (active/inactive, inhibitor/non‐inhibitor, ligand/non‐ligand), \( \mathbf{SD}_{i} \) is a vector of structural descriptors of i, and f is a classification function such as k‑nearest neighbors, linear discriminant analysis, random trees, random forests, Bayesian networks, artificial neural networks, or support vector machines.
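
The classification form \( C(i) = f(\mathbf{SD}_{i}) \), sketched with linear discriminant analysis from scikit-learn (one of the classification functions listed above; the descriptor vectors and class labels are toy values):

    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    SD = [[1.2, 0.3], [1.0, 0.6], [3.1, 2.2], [2.8, 2.0]]  # descriptor vectors
    C = ["inactive", "inactive", "active", "active"]       # activity classes

    f = LinearDiscriminantAnalysis().fit(SD, C)            # train the classifier
    print(f.predict([[1.1, 0.4], [3.0, 2.1]]))             # classify novel compounds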

Bibliography

  1. Aha DW, Kibler D, Albert MK (1991) Instance‐based learning algorithms. Mach Learn 6:37–66

  2. Ajmani S, Jadhav K, Kulkarni SA (2006) Three‐dimensional QSAR using the k‑nearest neighbor method and its interpretation. J Chem Inf Model 46:24–31

  3. Andres C, Hutter MC (2006) CNS permeability of drugs predicted by a decision tree. QSAR Comb Sci 25:305–309

  4. Alpaydin E (2004) Introduction to machine learning. MIT Press, Cambridge, p 445

  5. Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11:11–73

  6. Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning for control. Artif Intell Rev 11:75–113

  7. Arimoto R, Prasad MA, Gifford EM (2005) Development of CYP3A4 inhibition models: comparisons of machine‐learning techniques and molecular descriptors. J Biomol Screen 10:197–205

  8. Balaban AT, Ivanciuc O (1999) Historical development of topological indices. In: Devillers J, Balaban AT (eds) Topological indices and related descriptors in QSAR and QSPR. Gordon & Breach Science Publishers, Amsterdam, pp 21–57

  9. Basak SC, Grunwald GD (1995) Molecular similarity and estimation of molecular properties. J Chem Inf Comput Sci 35:366–372

  10. Basak SC, Bertelsen S, Grunwald GD (1994) Application of graph theoretical parameters in quantifying molecular similarity and structure‐activity relationships. J Chem Inf Comput Sci 34:270–276

  11. Basak SC, Bertelsen S, Grunwald GD (1995) Use of graph theoretic parameters in risk assessment of chemicals. Toxicol Lett 79:239–250

  12. Bayes T (1763) An essay towards solving a problem in the doctrine of chances. Philos Trans Roy Soc London 53:370–418

  13. Bender A, Jenkins JL, Glick M, Deng Z, Nettles JH, Davies JW (2006) “Bayes affinity fingerprints” improve retrieval rates in virtual screening and define orthogonal bioactivity space: when are multitarget drugs a feasible concept? J Chem Inf Model 46:2445–2456

  14. Bender A, Scheiber J, Glick M, Davies JW, Azzaoui K, Hamon J, Urban L, Whitebread S, Jenkins JL (2007) Analysis of pharmacology data and the prediction of adverse drug reactions and off‐target effects from chemical structure. ChemMedChem 2:861–873

  15. Bishop CM (2006) Pattern recognition and machine learning. Springer, Berlin, p 740

  16. Bishop CM (1996) Neural networks for pattern recognition. Oxford University Press, Oxford, p 504

  17. Boyd DB (2007) How computational chemistry became important in the pharmaceutical industry. In: Lipkowitz KB, Cundari TR (eds) Reviews in computational chemistry, vol 23. Wiley, Weinheim, pp 401–451

  18. Bonchev D (1983) Information theoretic indices for characterization of chemical structure. Research Studies Press, Chichester

  19. Bonchev D, Rouvray DH (eds) (1991) Chemical graph theory. Introduction and fundamentals. Abacus Press/Gordon & Breach Science Publishers, New York

  20. Boser BE, Guyon IM, Vapnik VN (1992) A training algorithm for optimal margin classifiers. In: Haussler D (ed) Proc of the 5th annual ACM workshop on computational learning theory. ACM Press, Pittsburgh, pp 144–152

  21. Bottou L, Chapelle O, DeCoste D, Weston J (2007) Large‐scale kernel machines. MIT Press, Cambridge, p 416

  22. Breiman L (2001) Random forests. Mach Learn 45:5–32

  23. Briem H, Günther J (2005) Classifying “kinase inhibitor‐likeness” by using machine‐learning methods. ChemBioChem 6:558–566

  24. Cash GG (1999) Prediction of physicochemical properties from Euclidean distance methods based on electrotopological state indices. Chemosphere 39:2583–2591

  25. Chapelle O, Haffner P, Vapnik VN (1999) Support vector machines for histogram‐based image classification. IEEE Trans Neural Netw 10:1055–1064

  26. Cleary JG, Trigg LE (1995) K*: an instance‐based learner using an entropic distance measure. In: Prieditis A, Russell SJ (eds) Proc of the 12th international conference on machine learning. Morgan Kaufmann, Tahoe City, pp 108–114

  27. Cohen WW (1995) Fast effective rule induction. In: Prieditis A, Russell SJ (eds) Proc of the 12th international conference on machine learning. Morgan Kaufmann, Tahoe City, pp 115–123

  28. Cortes C, Vapnik V (1995) Support vector networks. Mach Learn 20:273–297

  29. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines. Cambridge University Press, Cambridge

  30. Deconinck E, Zhang MH, Coomans D, Vander Heyden Y (2006) Classification tree models for the prediction of blood-brain barrier passage of drugs. J Chem Inf Model 46:1410–1419

  31. Deng Z, Chuaqui C, Singh J (2006) Knowledge‐based design of target‐focused libraries using protein‐ligand interaction constraints. J Med Chem 49:490–500

  32. Doddareddy MR, Cho YS, Koh HY, Kim DH, Pae AN (2006) In silico renal clearance model using classical Volsurf approach. J Chem Inf Model 46:1312–1320

  33. Drucker H, Wu DH, Vapnik VN (1999) Support vector machines for spam categorization. IEEE Trans Neural Netw 10:1048–1054

  34. Du H, Wang J, Watzl J, Zhang X, Hu Z (2008) Classification structure‐activity relationship (CSAR) studies for prediction of genotoxicity of thiophene derivatives. Toxicol Lett 177:10–19

  35. Duda RO, Hart PE, Stork DG (2000) Pattern classification. 2nd edn. Wiley, New York

  36. Ehrman TM, Barlow DJ, Hylands PJ (2007) Virtual screening of Chinese herbs with random forest. J Chem Inf Model 47:264–278

  37. Eitrich T, Kless A, Druska C, Meyer W, Grotendorst J (2007) Classification of highly unbalanced CYP450 data of drugs using cost sensitive machine learning techniques. J Chem Inf Model 47:92–103

  38. Ekins S, Balakin KV, Savchuk N, Ivanenkov Y (2006) Insights for human ether-a-go-go-related gene potassium channel inhibition using recursive partitioning and Kohonen and Sammon mapping techniques. J Med Chem 49:5059–5071

  39. Ertl P, Roggo S, Schuffenhauer A (2008) Natural product‐likeness score and its application for prioritization of compound libraries. J Chem Inf Model 48:68–74

  40. Fatemi MH, Gharaghani S (2007) A novel QSAR model for prediction of apoptosis‐inducing activity of 4-aryl-4H‑chromenes based on support vector machine. Bioorg Med Chem 15:7746–7754

  41. Frank E, Hall M, Trigg L, Holmes G, Witten IH (2004) Data mining in bioinformatics using Weka. Bioinformatics 20:2479–2481

  42. Freund Y, Mason L (1999) The alternating decision tree learning algorithm. In: Bratko I, Dzeroski S (eds) Proc of the 16th international conference on machine learning (ICML 1999). Morgan Kaufmann, Bled, pp 124–133

  43. Gaines BR, Compton P (1995) Induction of ripple‐down rules applied to modeling large databases. J Intell Inf Syst 5:211–228

  44. Gao JB, Gunn SR, Harris CJ (2003) SVM regression through variational methods and its sequential implementation. Neurocomputing 55:151–167

  45. Gao JB, Gunn SR, Harris CJ (2003) Mean field method for the support vector machine regression. Neurocomputing 50:391–405

  46. Gepp MM, Hutter MC (2006) Determination of hERG channel blockers using a decision tree. Bioorg Med Chem 14:5325–5332

  47. Guha R, Dutta D, Jurs PC, Chen T (2006) Local lazy regression: making use of the neighborhood to improve QSAR predictions. J Chem Inf Model 46:1836–1847

  48. Gute BD, Basak SC (2001) Molecular similarity‐based estimation of properties: a comparison of three structure spaces. J Mol Graph Modell 20:95–109

  49. Gute BD, Basak SC, Mills D, Hawkins DM (2002) Tailored similarity spaces for the prediction of physicochemical properties. Internet Electron J Mol Des 1:374–387

  50. Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46:389–422

  51. Hansch C, Garg R, Kurup A, Mekapati SB (2003) Allosteric interactions and QSAR: on the role of ligand hydrophobicity. Bioorg Med Chem 11:2075–2084

  52. Hastie T, Tibshirani R, Friedman JH (2003) The elements of statistical learning. Springer, Berlin, p 552

  53. Herbrich R (2002) Learning kernel classifiers. MIT Press, Cambridge

  54. Hert J, Willett P, Wilton DJ, Acklin P, Azzaoui K, Jacoby E, Schuffenhauer A (2006) New methods for ligand‐based virtual screening: use of data fusion and machine learning to enhance the effectiveness of similarity searching. J Chem Inf Model 46:462–470

  55. Hoffman B, Cho SJ, Zheng W, Wyrick S, Nichols DE, Mailman RB, Tropsha A (1999) Quantitative structure‐activity relationship modeling of dopamine \( {\text{D}}_{1} \) antagonists using comparative molecular field analysis, genetic algorithms‐partial least‐squares, and K‑nearest neighbor methods. J Med Chem 42:3217–3226

  56. Holte RC (1993) Very simple classification rules perform well on most commonly used datasets. Mach Learn 11:63–90

  57. Hou T, Wang J, Zhang W, Xu X (2007) ADME evaluation in drug discovery. 7. Prediction of oral absorption by correlation and classification. J Chem Inf Model 47:208–218

  58. Huang T-M, Kecman V, Kopriva I (2006) Kernel based algorithms for mining huge data sets. Springer, Berlin, p 260

  59. Hudelson MG, Ketkar NS, Holder LB, Carlson TJ, Peng C-C, Waldher BJ, Jones JP (2008) High confidence predictions of drug-drug interactions: predicting affinities for cytochrome P450 2C9 with multiple computational methods. J Med Chem 51:648–654

  60. Itskowitz P, Tropsha A (2005) k‑nearest neighbors QSAR modeling as a variational problem: theory and applications. J Chem Inf Model 45:777–785

  61. Ivanciuc O (2002) Support vector machine classification of the carcinogenic activity of polycyclic aromatic hydrocarbons. Internet Electron J Mol Des 1:203–218

  62. Ivanciuc O (2002) Structure‐odor relationships for pyrazines with support vector machines. Internet Electron J Mol Des 1:269–284

  63. Ivanciuc O (2002) Support vector machine identification of the aquatic toxicity mechanism of organic compounds. Internet Electron J Mol Des 1:157–172

  64. Ivanciuc O (2003) Graph theory in chemistry. In: Gasteiger J (ed) Handbook of chemoinformatics, vol 1. Wiley, Weinheim, pp 103–138

  65. Ivanciuc O (2003) Topological indices. In: Gasteiger J (ed) Handbook of chemoinformatics, vol 3. Wiley, Weinheim, pp 981–1003

  66. Ivanciuc O (2003) Aquatic toxicity prediction for polar and nonpolar narcotic pollutants with support vector machines. Internet Electron J Mol Des 2:195–208

  67. Ivanciuc O (2004) Support vector machines prediction of the mechanism of toxic action from hydrophobicity and experimental toxicity against Pimephales promelas and Tetrahymena pyriformis. Internet Electron J Mol Des 3:802–821

  68. Ivanciuc O (2005) Support vector regression quantitative structure‐activity relationships (QSAR) for benzodiazepine receptor ligands. Internet Electron J Mol Des 4:181–193

  69. Ivanciuc O (2005) Machine learning applied to anticancer structure‐activity relationships for NCI human tumor cell lines. Internet Electron J Mol Des 4:948–958

  70. Ivanciuc O (2007) Applications of support vector machines in chemistry. In: Lipkowitz KB, Cundari TR (eds) Reviews in computational chemistry, vol 23. Wiley, Weinheim, pp 291–400

  71. John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Besnard P, Hanks S (eds) UAI '95: Proc of the 11th annual conference on uncertainty in artificial intelligence. Morgan Kaufmann, Montreal, pp 338–345

  72. Jorissen RN, Gilson MK (2005) Virtual screening of molecular databases using a support vector machine. J Chem Inf Model 45:549–561

  73. Jurs P (2003) Quantitative structure‐property relationships. In: Gasteiger J (ed) Handbook of chemoinformatics, vol 3. Wiley, Weinheim, pp 1314–1335

  74. Kier LB, Hall LH (1976) Molecular connectivity in chemistry and drug research. Academic Press, New York

  75. Kier LB, Hall LH (1986) Molecular connectivity in structure‐activity analysis. Research Studies Press, Letchworth

  76. Kier LB, Hall LH (1999) Molecular structure description. The electrotopological state. Academic Press, San Diego

  77. Klon AE, Diller DJ (2007) Library fingerprints: a novel approach to the screening of virtual libraries. J Chem Inf Model 47:1354–1365

  78. Klon AE, Glick M, Davies JW (2004) Combination of a naive Bayes classifier with consensus scoring improves enrichment of high‐throughput docking results. J Med Chem 47:4356–4359

  79. Klon AE, Glick M, Thoma M, Acklin P, Davies JW (2004) Finding more needles in the haystack: a simple and efficient method for improving high‐throughput docking results. J Med Chem 47:2743–2749

  80. Klon AE, Lowrie JF, Diller DJ (2006) Improved naïve Bayesian modeling of numerical data for absorption, distribution, metabolism and excretion (ADME) property prediction. J Chem Inf Model 46:1945–1956

  81. Kohavi R (1995) The power of decision tables. In: Lavrac N, Wrobel S (eds) ECML-95, 8th European conference on machine learning. Lecture notes in computer science, vol 912. Springer, Heraklion, pp 174–189

  82. Kohavi R (1996) Scaling up the accuracy of naive-Bayes classifiers: a decision‐tree hybrid. In: Simoudis E, Han J, Fayyad UM (eds) Proc of the 2nd international conference on knowledge discovery and data mining (KDD-96). AAAI Press, Menlo Park, pp 202–207

  83. Kononenko I, Kukar M (2007) Machine learning and data mining: introduction to principles and algorithms. Horwood, Westergate, p 454

  84. Konovalov DA, Coomans D, Deconinck E, Vander Heyden Y (2007) Benchmarking of QSAR models for blood‐brain barrier permeation. J Chem Inf Model 47:1648–1656

  85. Kumar R, Kulkarni A, Jayaraman VK, Kulkarni BD (2004) Structure‐activity relationships using locally linear embedding assisted by support vector and lazy learning regressors. Internet Electron J Mol Des 3:118–133

  86. le Cessie S, van Houwelingen JC (1992) Ridge estimators in logistic regression. Appl Statist 41:191–201

  87. Leong MK (2007) A novel approach using pharmacophore ensemble/support vector machine (PhE/SVM) for prediction of hERG liability. Chem Res Toxicol 20:217–226

  88. Lepp Z, Kinoshita T, Chuman H (2006) Screening for new antidepressant leads of multiple activities by support vector machines. J Chem Inf Model 46:158–167

  89. Li H, Yap CW, Ung CY, Xue Y, Cao ZW, Chen YZ (2005) Effect of selection of molecular descriptors on the prediction of blood‐brain barrier penetrating and nonpenetrating agents by statistical learning methods. J Chem Inf Model 45:1376–1384

  90. Li S, Fedorowicz A, Singh H, Soderholm SC (2005) Application of the random forest method in studies of local lymph node assay based skin sensitization data. J Chem Inf Model 45:952–964

  91. Li W-X, Li L, Eksterowicz J, Ling XB, Cardozo M (2007) Significance analysis and multiple pharmacophore models for differentiating P‑glycoprotein substrates. J Chem Inf Model 47:2429–2438

  92. Liao Q, Yao J, Yuan S (2007) Prediction of mutagenic toxicity by combination of recursive partitioning and support vector machines. Mol Divers 11:59–72

  93. Mangasarian OL, Musicant DR (2000) Robust linear and support vector regression. IEEE Trans Pattern Anal Mach Intell 22:950–955

  94. Mangasarian OL, Musicant DR (2002) Large scale kernel regression via linear programming. Mach Learn 46:255–269

  95. Merkwirth C, Mauser HA, Schulz-Gasch T, Roche O, Stahl M, Lengauer T (2004) Ensemble methods for classification in cheminformatics. J Chem Inf Comput Sci 44:1971–1978

  96. Mitchell TM (1997) Machine learning. McGraw-Hill, Maidenhead, p 432

  97. Müller K-R, Rätsch G, Sonnenburg S, Mika S, Grimm M, Heinrich N (2005) Classifying ‘drug‐likeness’ with kernel‐based learning methods. J Chem Inf Model 45:249–253

  98. Neugebauer A, Hartmann RW, Klein CD (2007) Prediction of protein‐protein interaction inhibitors by chemoinformatics and machine learning methods. J Med Chem 50:4665–4668

  99. Neumann D, Kohlbacher O, Merkwirth C, Lengauer T (2006) A fully computational model for predicting percutaneous drug absorption. J Chem Inf Model 46:424–429

  100. Nidhi, Glick M, Davies JW, Jenkins JL (2006) Prediction of biological targets for compounds using multiple‐category Bayesian models trained on chemogenomics databases. J Chem Inf Model 46:1124–1133

  101. Nigsch F, Bender A, van Buuren B, Tissen J, Nigsch E, Mitchell JBO (2006) Melting point prediction employing k‑nearest neighbor algorithms and genetic parameter optimization. J Chem Inf Model 46:2412–2422

  102. Oloff S, Muegge I (2007) kScore: a novel machine learning approach that is not dependent on the data structure of the training set. J Comput-Aided Mol Des 21:87–95

  103. Oloff S, Zhang S, Sukumar N, Breneman C, Tropsha A (2006) Chemometric analysis of ligand receptor complementarity: Identifying complementary ligands based on receptor information (CoLiBRI). J Chem Inf Model 46:844–851

  104. Palmer DS, O'Boyle NM, Glen RC, Mitchell JBO (2007) Random forest models to predict aqueous solubility. J Chem Inf Model 47:150–158

  105. Pelletier DJ, Gehlhaar D, Tilloy-Ellul A, Johnson TO, Greene N (2007) Evaluation of a published in silico model and construction of a novel Bayesian model for predicting phospholipidosis inducing potential. J Chem Inf Model 47:1196–1205

  106. Platt J (1999) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods – support vector learning. MIT Press, Cambridge, pp 185–208

  107. Plewczynski D, Spieser SAH, Koch U (2006) Assessing different classification methods for virtual screening. J Chem Inf Model 46:1098–1106

  108. Quinlan R (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo

  109. Ren S (2002) Classifying class I and class II compounds by hydrophobicity and hydrogen bonding descriptors. Environ Toxicol 17:415–423

  110. Ripley BD (2008) Pattern recognition and neural networks. Cambridge University Press, Cambridge, p 416

  111. Rodgers S, Glen RC, Bender A (2006) Characterizing bitterness: identification of key structural features and development of a classification model. J Chem Inf Model 46:569–576

  112. Rusinko A, Farmen MW, Lambert CG, Brown PL, Young SS (1999) Analysis of a large structure/biological activity data set using recursive partitioning. J Chem Inf Comput Sci 39:1017–1026

  113. Sakiyama Y, Yuki H, Moriya T, Hattori K, Suzuki M, Shimada K, Honma T (2008) Predicting human liver microsomal stability with machine learning techniques. J Mol Graph Modell 26:907–915

  114. Schneider N, Jäckels C, Andres C, Hutter MC (2008) Gradual in silico filtering for druglike substances. J Chem Inf Model 48:613–628

  115. Schölkopf B, Smola AJ (2002) Learning with kernels. MIT Press, Cambridge

  116. Schölkopf B, Sung KK, Burges CJC, Girosi F, Niyogi P, Poggio T, Vapnik V (1997) Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans Signal Process 45:2758–2765

  117. Schölkopf B, Burges CJC, Smola AJ (1999) Advances in kernel methods: support vector learning. MIT Press, Cambridge

  118. Schroeter TS, Schwaighofer A, Mika S, ter Laak A, Suelzle D, Ganzer U, Heinrich N, Müller K-R (2007) Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules. J Comput-Aided Mol Des 21:485–498

  119. Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge

  120. Shen M, LeTiran A, Xiao Y, Golbraikh A, Kohn H, Tropsha A (2002) Quantitative structure‐activity relationship analysis of functionalized amino acid anticonvulsant agents using k‑nearest neighbor and simulated annealing PLS methods. J Med Chem 45:2811–2823

  121. Shen M, Xiao Y, Golbraikh A, Gombar VK, Tropsha A (2003) Development and validation of k‑nearest‐neighbor QSPR models of metabolic stability of drug candidates. J Med Chem 46:3013–3020

  122. Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14:199–222

  123. Sommer S, Kramer S (2007) Three data mining techniques to improve lazy structure‐activity relationships for noncongeneric compounds. J Chem Inf Model 47:2035–2043

  124. Sorich MJ, McKinnon RA, Miners JO, Smith PA (2006) The importance of local chemical structure for chemical metabolism by human uridine 5'‑diphosphate‐glucuronosyltransferase. J Chem Inf Model 46:2692–2697

  125. Sun H (2005) A naive Bayes classifier for prediction of multidrug resistance reversal activity on the basis of atom typing. J Med Chem 48:4031–4039

  126. Suykens JAK (2001) Support vector machines: a nonlinear modelling and control perspective. Eur J Control 7:311–327

  127. Suykens JAK, Van Gestel T, De Brabanter J, De Moor B, Vandewalle J (2002) Least squares support vector machines. World Scientific, Singapore

  128. Svetnik V, Liaw A, Tong C, Culberson JC, Sheridan RP, Feuston BP (2003) Random forest: a classification and regression tool for compound classification and QSAR modeling. J Chem Inf Comput Sci 43:1947–1958

  129. Svetnik V, Wang T, Tong C, Liaw A, Sheridan RP, Song Q (2005) Boosting: an ensemble learning tool for compound classification and QSAR modeling. J Chem Inf Model 45:786–799

  130. Swamidass SJ, Chen J, Phung P, Ralaivola L, Baldi P (2005) Kernels for small molecules and the prediction of mutagenicity, toxicity and anti‐cancer activity. Bioinformatics 21[S1]:i359–i368

  131. Terfloth L, Bienfait B, Gasteiger J (2007) Ligand‐based models for the isoform specificity of cytochrome P450 3A4, 2D6, and 2C9 substrates. J Chem Inf Model 47:1688–1701

  132. Tobita M, Nishikawa T, Nagashima R (2005) A discriminant model constructed by the support vector machine method for HERG potassium channel inhibitors. Bioorg Med Chem Lett 15:2886–2890

  133. Todeschini R, Consonni V (2003) Descriptors from molecular geometry. In: Gasteiger J (ed) Handbook of chemoinformatics, vol 3. Wiley, Weinheim, pp 1004–1033

  134. Tong W, Hong H, Fang H, Xie Q, Perkins R (2003) Decision forest: Combining the predictions of multiple independent decision tree models. J Chem Inf Comput Sci 43:525–531

  135. Tong W, Xie Q, Hong H, Shi L, Fang H, Perkins R (2004) Assessment of prediction confidence and domain extrapolation of two structure‐activity relationship models for predicting estrogen receptor binding activity. Environ Health Perspect 112:1249–1254

  136. Trinajstić N (1992) Chemical graph theory. CRC Press, Boca Raton

  137. Urrestarazu Ramos E, Vaes WHJ, Verhaar HJM, Hermens JLM (1998) Quantitative structure‐activity relationships for the aquatic toxicity of polar and nonpolar narcotic pollutants. J Chem Inf Comput Sci 38:845–852

  138. Vapnik VN (1979) Estimation of dependencies based on empirical data. Nauka, Moscow

  139. Vapnik VN (1995) The nature of statistical learning theory. Springer, New York

  140. Vapnik VN (1998) Statistical learning theory. Wiley, New York

  141. Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10:988–999

  142. Vapnik V, Chapelle O (2000) Bounds on error expectation for support vector machines. Neural Comput 12:2013–2036

  143. Vapnik VN, Chervonenkis AY (1974) Theory of pattern recognition. Nauka, Moscow

  144. Vapnik V, Lerner A (1963) Pattern recognition using generalized portrait method. Automat Remote Control 24:774–780

  145. Varnek A, Kireeva N, Tetko IV, Baskin II, Solov'ev VP (2007) Exhaustive QSPR studies of a large diverse set of ionic liquids: how accurately can we predict melting points? J Chem Inf Model 47:1111–1122

  146. Vogt M, Bajorath J (2008) Bayesian similarity searching in high‐dimensional descriptor spaces combined with Kullback–Leibler descriptor divergence analysis. J Chem Inf Model 48:247–255

  147. von Korff M, Sander T (2006) Toxicity‐indicating structural patterns. J Chem Inf Model 46:536–544

  148. Votano JR, Parham M, Hall LM, Hall LH, Kier LB, Oloff S, Tropsha A (2006) QSAR modeling of human serum protein binding with several modeling techniques utilizing structure‐information representation. J Med Chem 49:7169–7181

  149. Wang J, Du H, Yao X, Hu Z (2007) Using classification structure pharmacokinetic relationship (SCPR) method to predict drug bioavailability based on grid‐search support vector machine. Anal Chim Acta 601:156–163

  150. Watson P (2008) Naïve Bayes classification using 2D pharmacophore feature triplet vectors. J Chem Inf Model 48:166–178

  151. Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco, p 525

  152. Xiao Z, Xiao Y-D, Feng J, Golbraikh A, Tropsha A, Lee K-H (2002) Antitumor agents. 213. Modeling of epipodophyllotoxin derivatives using variable selection k‑nearest neighbor QSAR method. J Med Chem 45:2294–2309

  153. Xue Y, Li ZR, Yap CW, Sun LZ, Chen X, Chen YZ (2004) Effect of molecular descriptor feature selection in support vector machine classification of pharmacokinetic and toxicological properties of chemical agents. J Chem Inf Comput Sci 44:1630–1638

  154. Yamashita F, Hara H, Ito T, Hashida M (2008) Novel hierarchical classification and visualization method for multiobjective optimization of drug properties: application to structure‐activity relationship analysis of cytochrome P450 metabolism. J Chem Inf Model 48:364–369

  155. Yap CW, Chen YZ (2005) Prediction of cytochrome P450 3A4, 2D6, and 2C9 inhibitors and substrates by using support vector machines. J Chem Inf Model 45:982–992

  156. Yap CW, Cai CZ, Xue Y, Chen YZ (2004) Prediction of torsade‐causing potential of drugs by support vector machine approach. Toxicol Sci 79:170–177

  157. Yu G-X, Park B-H, Chandramohan P, Munavalli R, Geist A, Samatova NF (2005) In silico discovery of enzyme‐substrate specificity‐determining residue clusters. J Mol Biol 352:1105–1117

  158. Yue P, Li Z, Moult J (2005) Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473

  159. Zhang S, Golbraikh A, Oloff S, Kohn H, Tropsha A (2006) A novel automated lazy learning QSAR (ALL-QSAR) approach: method development, applications, and virtual screening of chemical databases using validated ALL-QSAR models. J Chem Inf Model 46:1984–1995

  160. Zhang S, Golbraikh A, Tropsha A (2006) Development of quantitative structure‐binding affinity relationship models based on novel geometrical chemical descriptors of the protein‐ligand interfaces. J Med Chem 49:2713–2724

  161. Zheng WF, Tropsha A (2000) Novel variable selection quantitative structure‐property relationship approach based on the k‑nearest‐neighbor principle. J Chem Inf Comput Sci 40:185–194


© 2009 Springer-Verlag

Ivanciuc, O. (2009). Drug Design with Machine Learning. In: Meyers, R. (eds) Encyclopedia of Complexity and Systems Science. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30440-3_135
