
Model Selection for Time Series Forecasting: An Empirical Analysis of Multiple Estimators

Neural Processing Letters

Abstract

Evaluating predictive models is a crucial task in predictive analytics. This process is especially challenging with time series data because observations are not independent. Several studies have analyzed how different performance estimation methods compare with each other in approximating the true loss incurred by a given forecasting model. However, these studies do not address how the estimators behave for model selection: the ability to select the best solution among a set of alternatives. This paper addresses this issue. The goal of this work is to compare a set of estimation methods for model selection in time series forecasting tasks. This objective is split into two main questions: (i) how often does a given estimation method select the best possible model; and (ii) what is the performance loss when the best model is not selected? Experiments were carried out on a case study comprising 3111 time series. The accuracy of the estimators in selecting the best solution is low, despite being significantly better than random selection. Moreover, the overall forecasting performance loss associated with the model selection process ranges from 0.28 to 0.58%. Yet, no considerable differences between the approaches were found. Finally, the sample size of the time series is an important factor in the relative performance of the estimators.
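The two questions above can be made concrete with a small sketch. The helper below is hypothetical (names and data are invented, not from the paper): given, for each series, the losses an estimator *predicts* for a set of candidate models and the *true* test losses, it computes (i) the fraction of series where the estimator picks the true best model, and (ii) the average percentage loss incurred when it does not.

```python
# Hypothetical sketch of the paper's two evaluation questions
# (function and model names are invented for illustration).

def selection_metrics(estimated, true):
    """estimated/true: per-series dicts mapping model name -> loss."""
    hits, losses = 0, []
    for est, tru in zip(estimated, true):
        picked = min(est, key=est.get)  # model the estimator selects
        best = min(tru, key=tru.get)    # model with the lowest true loss
        hits += picked == best
        # percentage loss of the selected model relative to the true best
        losses.append(100 * (tru[picked] - tru[best]) / tru[best])
    return hits / len(true), sum(losses) / len(losses)

# two toy series, three candidate models each
estimated = [{"arima": 1.0, "ets": 1.2, "rf": 0.9},
             {"arima": 0.8, "ets": 0.7, "rf": 1.1}]
true_loss = [{"arima": 1.1, "ets": 1.0, "rf": 1.3},
             {"arima": 0.9, "ets": 0.8, "rf": 1.2}]

acc, avg_loss = selection_metrics(estimated, true_loss)
```

Here the estimator picks the true best model on one of the two series, and the wrong pick on the other series costs 30% extra loss, giving an average of 15%.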



Data Availability

All experiments and data are publicly available (cf. footnote 1).

Notes

  1. https://github.com/vcerqueira/model_selection_forecasting.

  2. https://github.com/Mcompetitions/M5-methods.

  3. https://robjhyndman.com/hyndsight/tscv/.

  4. The model_selection module from the scikit-learn Python library designates this method as TimeSeriesSplit.
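The splitting scheme named in footnote 4 can be sketched in a few lines. The helper below is a hypothetical re-implementation of the idea behind scikit-learn's `TimeSeriesSplit` (not the library code): each split trains on an initial segment of the series and tests on the block that immediately follows it, so test observations never precede training observations.

```python
# Hypothetical re-implementation sketch of the TimeSeriesSplit idea:
# growing training windows, each followed by an out-of-sample test block.

def time_series_splits(n, n_splits):
    """Yield (train_indices, test_indices) with growing training windows."""
    test_size = n // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = n - (n_splits - k + 1) * test_size
        yield (list(range(train_end)),
               list(range(train_end, train_end + test_size)))

# a toy series of 12 observations, split 3 times
for train, test in time_series_splits(12, 3):
    print(train, "->", test)
```

For 12 observations and 3 splits, the training window grows from 3 to 9 points while each test block holds the next 3, preserving temporal order throughout.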


Funding

The work of L. Torgo was undertaken, in part, thanks to funding from the Canada Research Chairs program; the work of Carlos Soares was partially funded by projects ConnectedHealth (no. 46858), supported by Competitiveness and Internationalisation Operational Programme (POCI) and Lisbon Regional Operational Programme (LISBOA 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), by the project Safe Cities - Inovação para Construir Cidades Seguras, with the reference POCI-01-0247-FEDER-041435, co-funded by the European Regional Development Fund (ERDF), through the Operational Programme for Competitiveness and Internationalization (COMPETE 2020), under the PORTUGAL 2020 Partnership Agreement, by project NextGenAI - Center for Responsible AI (2022-C05i0102-02), supported by IAPMEI, and also by FCT plurianual funding for 2020–2023 of LIACC (UIDB/00027/2020_UIDP/00027/202

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to writing and research.

Corresponding author

Correspondence to Vitor Cerqueira.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Consent to participate

Not applicable

Consent for publication

Not applicable

Ethics approval

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cerqueira, V., Torgo, L. & Soares, C. Model Selection for Time Series Forecasting: An Empirical Analysis of Multiple Estimators. Neural Process Lett 55, 10073–10091 (2023). https://doi.org/10.1007/s11063-023-11239-8
