A case study comparing machine learning with statistical methods for time series forecasting: size matters

Journal of Intelligent Information Systems

Abstract

Time series forecasting is one of the most active research topics. Machine learning methods have been increasingly adopted to solve these predictive tasks. However, a recent work presented evidence that these approaches systematically show lower predictive performance than simple statistical methods. In this work, we counter these results. We show that they are only valid under an extremely low sample size. Using a learning curve method, our results suggest that machine learning methods improve their relative predictive performance as the sample size grows. The R code to reproduce all of our experiments is available at https://github.com/vcerqueira/MLforForecasting.
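
The learning curve idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' R implementation and does not reproduce their experiments. It assumes a synthetic AR(1) series as stand-in data, a least-squares autoregressive fit as a stand-in for a learned model, and a naive last-value forecast as a stand-in for a simple benchmark. The function names (simulate_ar1, fit_ar1, learning_curve) and all parameter values are hypothetical. The sketch refits the model on progressively larger training windows and scores both forecasters on the same held-out test period.

```python
import random


def simulate_ar1(n, phi=0.6, seed=0):
    """Synthetic AR(1) series: stand-in data, not data from the paper."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def fit_ar1(train):
    """Least-squares slope of y[t] on y[t-1], without intercept."""
    num = sum(prev * cur for prev, cur in zip(train[:-1], train[1:]))
    den = sum(prev * prev for prev in train[:-1])
    return num / den if den > 0 else 0.0


def mae(errors):
    """Mean absolute error of a list of forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


def learning_curve(series, test_size, train_sizes):
    """Return (n, model_mae, naive_mae) for each training size n.

    The last `test_size` points are held out. For each n, the model is
    fitted on the n points immediately preceding the test period and
    evaluated on one-step-ahead forecasts over the test period.
    """
    split = len(series) - test_size
    # (previous value, actual value) pairs for one-step-ahead forecasts
    test_pairs = list(zip(series[split - 1:-1], series[split:]))
    # Naive benchmark: forecast equals the previous observation
    naive_mae = mae([actual - prev for prev, actual in test_pairs])
    curve = []
    for n in train_sizes:
        train = series[split - n:split]
        phi_hat = fit_ar1(train)
        model_mae = mae([actual - phi_hat * prev for prev, actual in test_pairs])
        curve.append((n, model_mae, naive_mae))
    return curve


# Illustrative run with hypothetical sizes
curve = learning_curve(simulate_ar1(1400), test_size=200,
                       train_sizes=[10, 50, 200, 1000])
for n, model_mae, naive_mae in curve:
    print(f"n={n:5d}  model MAE={model_mae:.3f}  naive MAE={naive_mae:.3f}")
```

Plotting the model error against n alongside the constant benchmark error gives a learning curve. The sample size at which the two lines cross, if they cross at all, depends on the data and the methods being compared.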

Data Availability

All experiments and data are publicly available (cf. abstract).

Funding

The work of L. Torgo was undertaken, in part, thanks to funding from the Canada Research Chairs program.

Author information

Contributions

All authors contributed to writing and research.

Corresponding author

Correspondence to Vitor Cerqueira.

Ethics declarations

Conflict of Interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cerqueira, V., Torgo, L. & Soares, C. A case study comparing machine learning with statistical methods for time series forecasting: size matters. J Intell Inf Syst 59, 415–433 (2022). https://doi.org/10.1007/s10844-022-00713-9

  • DOI: https://doi.org/10.1007/s10844-022-00713-9
