Abstract
Time series forecasting is one of the most active research topics, and machine learning methods are increasingly adopted for these predictive tasks. However, a recent work presented evidence that these approaches systematically show lower predictive performance than simple statistical methods. In this work, we counter these results and show that they are valid only under an extremely small sample size. Using a learning curve method, we find results suggesting that machine learning methods improve their relative predictive performance as the sample size grows. The R code to reproduce all of our experiments is available at https://github.com/vcerqueira/MLforForecasting.
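As a rough illustration of what a learning curve analysis looks like, the sketch below scores a naive last-value forecast and a simple nearest-neighbour learner on a fixed test window while the training window grows. It is not the authors' experimental setup (their R code is in the repository linked above). The logistic-map toy series, the k-nearest-neighbour learner, and all function names and parameter values are assumptions chosen only to keep the example self-contained.

```python
# Toy learning-curve comparison; every name and value here is illustrative.

def logistic_map(n, x0=0.2, r=3.9):
    """Generate a deterministic, chaotic toy series (not data from the paper)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def embed(series, p):
    """Build (lag vector, next value) pairs using the p previous observations."""
    return [(series[i - p:i], series[i]) for i in range(p, len(series))]


def knn_forecast(train_pairs, lags, k=3):
    """Predict the next value as the mean target of the k closest lag vectors."""
    def sq_dist(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], lags))

    nearest = sorted(train_pairs, key=sq_dist)[:k]
    return sum(target for _, target in nearest) / len(nearest)


def learning_curve(series, train_sizes, test_size=100, p=2, k=3):
    """Return (train size, naive MAE, k-NN MAE) for each training size."""
    split = len(series) - test_size
    # Test targets are series[split:]; their lags may reach back before split.
    test_pairs = embed(series[split - p:], p)
    # Naive baseline: forecast the last observed value (independent of n).
    naive_mae = sum(abs(t - lags[-1]) for lags, t in test_pairs) / len(test_pairs)

    curve = []
    for n in train_sizes:
        # Train only on the n observations immediately before the test window.
        train_pairs = embed(series[split - n:split], p)
        knn_mae = sum(
            abs(t - knn_forecast(train_pairs, lags, k)) for lags, t in test_pairs
        ) / len(test_pairs)
        curve.append((n, naive_mae, knn_mae))
    return curve


series = logistic_map(1200)
curve = learning_curve(series, train_sizes=[20, 100, 400, 1000])
for n, naive_mae, knn_mae in curve:
    print(f"n={n:5d}  naive MAE={naive_mae:.4f}  k-NN MAE={knn_mae:.4f}")
```

The test window is held fixed so that the errors at different training sizes are directly comparable. A noise-free synthetic series like this one favours the learner far more than real data would, so the printed numbers say nothing about the paper's empirical findings.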
Data Availability
All experiments and data are publicly available (see the abstract).
Funding
The work of L. Torgo was undertaken, in part, thanks to funding from the Canada Research Chairs program.
Author information
Contributions
All authors contributed to writing and research.
Ethics declarations
Conflict of Interests
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Cerqueira, V., Torgo, L. & Soares, C. A case study comparing machine learning with statistical methods for time series forecasting: size matters. J Intell Inf Syst 59, 415–433 (2022). https://doi.org/10.1007/s10844-022-00713-9
DOI: https://doi.org/10.1007/s10844-022-00713-9