QSAR modeling based on the bias/variance compromise: a harmonious

Journal of Computer-Aided Molecular Design

Abstract

Modeling quantitative structure–activity relationships (QSAR) is considered with an emphasis on prediction. An abundance of methods is available for developing such models. Using a harmonious approach that balances the bias and variance of predictions, the best calibration models are identified relative to the bias and variance criteria used. The criteria used to judge model adequacy are the root mean square errors of calibration (RMSEC) and validation (RMSEV), the respective R² values, and the norm of the regression vector. QSAR data from the literature are used to demonstrate the concepts. For these data sets and the criteria used, the results suggest that models obtained by ridge regression (RR) are more harmonious and parsimonious than models obtained by partial least squares (PLS) and principal component regression (PCR) when the data are mean-centered. The most harmonious RR models have the best bias/variance tradeoff, reflected by the smallest RMSEC, RMSEV, and regression vector norms and the largest calibration and validation R² values. The most parsimonious RR models have the smallest effective rank.
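The criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes NumPy, synthetic data in place of the literature QSAR sets, RMSE computed as the square root of the mean squared residual (no degrees-of-freedom correction), and effective rank taken as the sum of the ridge filter factors s²/(s² + λ). The function names (`fit_ridge`, `summarize`, and so on) are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Ridge regression on mean-centered data via the SVD (lam assumed > 0)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    # Regression vector: b = V diag(s / (s^2 + lam)) U^T (y - mean(y))
    b = Vt.T @ ((s / (s**2 + lam)) * (U.T @ (y - y_mean)))
    # Assumed definition of effective rank: sum of the filter factors
    eff_rank = float(np.sum(s**2 / (s**2 + lam)))
    return b, x_mean, y_mean, eff_rank

def predict(X, b, x_mean, y_mean):
    """Predict with the centering learned from the calibration set."""
    return (X - x_mean) @ b + y_mean

def rmse(y, y_hat):
    """Root mean square error (no degrees-of-freedom correction)."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r_squared(y, y_hat):
    """Coefficient of determination relative to the mean of y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def summarize(X_cal, y_cal, X_val, y_val, lam):
    """Collect the bias and variance criteria for one ridge parameter."""
    b, x_mean, y_mean, eff_rank = fit_ridge(X_cal, y_cal, lam)
    cal_hat = predict(X_cal, b, x_mean, y_mean)
    val_hat = predict(X_val, b, x_mean, y_mean)
    return {
        "lam": lam,
        "RMSEC": rmse(y_cal, cal_hat),
        "RMSEV": rmse(y_val, val_hat),
        "R2_cal": r_squared(y_cal, cal_hat),
        "R2_val": r_squared(y_val, val_hat),
        "norm_b": float(np.linalg.norm(b)),
        "eff_rank": eff_rank,
    }

# Synthetic stand-in for a QSAR data set (illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=40)
X_cal, y_cal, X_val, y_val = X[:30], y[:30], X[30:], y[30:]

for lam in (0.01, 1.0, 10.0, 100.0):
    row = summarize(X_cal, y_cal, X_val, y_val, lam)
    print({k: round(v, 3) for k, v in row.items()})
```

Tabulating RMSEC and RMSEV against the regression vector norm across ridge parameters is one way to inspect the bias/variance tradeoff: as the ridge parameter grows, the norm and the effective rank shrink while RMSEC rises.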



Author information

Corresponding author

Correspondence to John H. Kalivas.

About this article

Cite this article

Kalivas, J.H., Forrester, J.B. & Seipel, H.A. QSAR modeling based on the bias/variance compromise: a harmonious. J Comput Aided Mol Des 18, 537–547 (2004). https://doi.org/10.1007/s10822-004-4063-5

  • DOI: https://doi.org/10.1007/s10822-004-4063-5
