Bootstrap and Cross-Validation to Assess Complexity of Data-Driven Regression Models

  • Conference paper
Medical Data Analysis (ISMDA 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1933)

Abstract

The number of potential variables included in a regression model is often too large, and a more parsimonious model may be preferable. Selection strategies are widely used, but there are few analytical results about their properties. To investigate problems such as replication stability, model complexity, and selection bias, we use bootstrap and cross-validation methods. For stepwise strategies, we discuss the importance of the predefined selection level. The methods are illustrated by investigating prognostic factors for the survival time of patients with malignant glioma in the framework of a Cox regression model.
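The idea of using the bootstrap to examine replication stability can be sketched as follows. This is a minimal illustration, not the authors' procedure: the selection rule is a simple univariate correlation screen standing in for stepwise selection in a Cox model, the data are simulated, and all function names (`select_variables`, `bootstrap_inclusion_frequencies`, `simulate`) and the selection levels tried are assumptions made for the example. The sketch repeats the selection in each bootstrap sample and reports how often each variable is chosen, for several predefined selection levels.

```python
import math
import random
from statistics import NormalDist


def correlation(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_variables(X, y, alpha):
    """Hypothetical stand-in selection rule (NOT stepwise Cox selection):
    keep column j if a Fisher z-test of zero correlation with y is
    significant at the predefined selection level alpha."""
    n = len(y)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    selected = set()
    for j in range(len(X[0])):
        r = correlation([row[j] for row in X], y)
        r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
        z = math.atanh(r) * math.sqrt(n - 3)
        if abs(z) > z_crit:
            selected.add(j)
    return selected


def bootstrap_inclusion_frequencies(X, y, alpha, n_boot, seed=0):
    """Fraction of bootstrap samples in which each variable is selected."""
    rng = random.Random(seed)
    n = len(y)
    counts = [0] * len(X[0])
    for _ in range(n_boot):
        # Resample observations with replacement, then rerun the selection.
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        for j in select_variables(Xb, yb, alpha):
            counts[j] += 1
    return [c / n_boot for c in counts]


def simulate(n=200, seed=1):
    """Simulated data (an assumption for illustration): one strong effect,
    one weak effect, and two pure noise variables."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    y = [1.0 * row[0] + 0.2 * row[1] + rng.gauss(0, 1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    for alpha in (0.01, 0.05, 0.157):  # illustrative selection levels
        freqs = bootstrap_inclusion_frequencies(X, y, alpha, n_boot=200)
        print(alpha, [round(f, 2) for f in freqs])
```

Because the same seed is used for every selection level, the bootstrap samples are identical across levels, so a more liberal level can only raise each variable's inclusion frequency. In a real analysis the screening rule would be replaced by the actual selection strategy under study, for example stepwise selection in a Cox model fitted with a survival analysis library.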




Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sauerbrei, W., Schumacher, M. (2000). Bootstrap and Cross-Validation to Assess Complexity of Data-Driven Regression Models. In: Brause, R.W., Hanisch, E. (eds) Medical Data Analysis. ISMDA 2000. Lecture Notes in Computer Science, vol 1933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39949-6_29

  • DOI: https://doi.org/10.1007/3-540-39949-6_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41089-8

  • Online ISBN: 978-3-540-39949-0

  • eBook Packages: Springer Book Archive
