
The leave-worst-k-out criterion for cross validation

Original Paper · Optimization Letters

Abstract

Cross validation is widely used to assess the performance of prediction models on unseen data. Leave-k-out and m-fold are among the most popular cross validation criteria, and they have complementary strengths and limitations. Leave-k-out (with leave-1-out being the most common special case) is exhaustive and more reliable but computationally prohibitive when \(k > 2\), whereas m-fold is much more tractable at the cost of uncertain performance due to non-exhaustive random sampling. We propose a new cross validation criterion, leave-worst-k-out, which attempts to combine the strengths and avoid the limitations of leave-k-out and m-fold. The leave-worst-k-out criterion is defined as the largest validation error over the \(C_n^k\) possible ways to partition the n data points into a subset of \(n-k\) points for training a prediction model and the remaining k points for validation. In contrast, the leave-k-out criterion takes the average of the \(C_n^k\) validation errors from the aforementioned partitions, and m-fold samples m random (but non-independent) such validation errors. We prove that, for the special case of the multiple linear regression model under the \({\mathcal {L}}_1\) norm, the leave-worst-k-out criterion can be computed by solving a mixed integer linear program. We also present a random sampling algorithm for approximately computing the criterion for general prediction models under general norms. Results of two computational experiments suggested that the leave-worst-k-out criterion clearly outperformed leave-k-out and m-fold in assessing the generalizability of prediction models; moreover, leave-worst-k-out can be approximately computed using the random sampling algorithm almost as efficiently as leave-1-out and m-fold, and the effectiveness of the approximated criterion may be as high as, or even higher than, that of the exactly computed criterion.
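The three criteria differ only in how they aggregate the validation errors of the \(C_n^k\) possible train/validation partitions: leave-k-out averages them all, m-fold looks at m non-independent ones, and leave-worst-k-out takes the maximum. The sketch below is a minimal illustration of the random-sampling approximation described in the abstract, not the author's implementation or the MILP formulation from the paper; the function names (`leave_worst_k_out_sampled`, `leave_k_out_exact`), the use of scikit-learn's `LinearRegression`, and mean absolute error as the \({\mathcal {L}}_1\) validation error are all assumptions made for the example.

```python
# Hedged sketch: approximate leave-worst-k-out (LWKO) by sampling random
# k-subsets and recording the worst validation error seen; compare with an
# exact (and only small-n feasible) leave-k-out average.
import numpy as np
from itertools import combinations
from sklearn.linear_model import LinearRegression


def validation_error(X, y, train_idx, val_idx):
    """Fit on the training subset and return the L1 (mean absolute) validation error."""
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    return float(np.mean(np.abs(model.predict(X[val_idx]) - y[val_idx])))


def leave_worst_k_out_sampled(X, y, k, n_samples=1000, seed=None):
    """Approximate LWKO: largest validation error over randomly sampled k-subsets."""
    rng = np.random.default_rng(seed)
    n = len(y)
    worst = -np.inf
    for _ in range(n_samples):
        val_idx = rng.choice(n, size=k, replace=False)        # k points left out
        train_idx = np.setdiff1d(np.arange(n), val_idx)       # remaining n - k points
        worst = max(worst, validation_error(X, y, train_idx, val_idx))
    return worst


def leave_k_out_exact(X, y, k):
    """Exact leave-k-out: average validation error over all C(n, k) partitions."""
    n = len(y)
    errors = [
        validation_error(X, y, np.setdiff1d(np.arange(n), np.array(val_idx)), np.array(val_idx))
        for val_idx in combinations(range(n), k)
    ]
    return float(np.mean(errors))
```

Replacing the running maximum with an average of the sampled errors would instead give a Monte Carlo estimate of leave-k-out, while enumerating all \(C_n^k\) subsets, as in `leave_k_out_exact`, is only practical for small n and k.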

Acknowledgements

This work was partially supported by the National Science Foundation under the LEAP HI and GOALI programs (Grant Number 1830478) and the EAGER program (Grant Number 1842097), and by the Plant Sciences Institute at Iowa State University. This manuscript was greatly improved thanks to constructive and insightful feedback from the Associate Editor and an anonymous reviewer. The author is grateful to Dr. Qing Li and Lijie Liu for suggesting the CoEPrA data source, and to Dr. Guiping Hu and Dr. Dan Nettleton for inspiring conversations about the proposed LWKO criterion.

Author information

Corresponding author

Correspondence to Lizhi Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, L. The leave-worst-k-out criterion for cross validation. Optim Lett 17, 545–560 (2023). https://doi.org/10.1007/s11590-022-01894-6
