Comparison of statistical and machine learning methods for daily SKU demand forecasting

  • Original Paper
  • Operational Research

Abstract

Daily SKU demand forecasting is a challenging task, as it usually involves predicting irregular series characterized by intermittency and erraticness. This is particularly true when forecasting at low cross-sectional levels, such as the store or warehouse level, or when dealing with slow-moving items. Yet accurate forecasts are necessary to support inventory holding and replenishment decisions. The task is typically addressed with well-established statistical methods, such as Croston’s method and its variants. More recently, Machine Learning (ML) methods have been proposed as an alternative to statistical ones, but their superiority remains in question. This paper sheds some light on the matter by comparing the forecasting performance of various ML methods, trained both in a series-by-series and in a cross-learning fashion, with that of statistical methods on a large set of real daily SKU demand data. Our results indicate that some ML methods do provide better forecasts in terms of both accuracy and bias. Cross-learning across multiple SKUs also proved beneficial for some of the ML methods.
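
For readers unfamiliar with the statistical baseline named above, the following is a minimal sketch of Croston's method, not the implementation used in the paper. It smooths the non-zero demand sizes and the intervals between them separately, updating both only in periods where demand occurs, and forecasts the demand rate as their ratio. The function name `croston_forecast`, the default smoothing parameter `alpha = 0.1`, and the initialization from the first observed demand are assumptions made for illustration. The optional `sba` flag applies the Syntetos-Boylan bias correction factor (1 - alpha/2), one of the commonly used variants.

```python
from typing import Optional, Sequence


def croston_forecast(demand: Sequence[float], alpha: float = 0.1, sba: bool = False) -> float:
    """Return a one-step-ahead demand-rate forecast for an intermittent series.

    Illustrative sketch: a single smoothing parameter is used for both sizes and
    intervals, and both are initialized from the first non-zero demand (assumptions).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")

    size: Optional[float] = None      # smoothed non-zero demand size
    interval: Optional[float] = None  # smoothed number of periods between demands
    periods_since = 1                 # periods elapsed since the last non-zero demand

    for y in demand:
        if y > 0:
            if size is None:
                # Initialize from the first observed demand (assumed scheme).
                size = float(y)
                interval = float(periods_since)
            else:
                # Exponential smoothing, applied only when demand occurs.
                size += alpha * (y - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1

    if size is None:
        # No demand was ever observed, so forecast zero.
        return 0.0

    forecast = size / interval
    if sba:
        # Syntetos-Boylan Approximation: deflate the ratio to reduce bias.
        forecast *= 1.0 - alpha / 2.0
    return forecast
```

For example, a series with a demand of 3 units every third day, `[0, 0, 3, 0, 0, 3, 0, 0, 3]`, yields a forecast of 1 unit per day, and 0.95 with the SBA correction at `alpha = 0.1`.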

Author information

Corresponding author

Correspondence to Evangelos Spiliotis.

About this article

Cite this article

Spiliotis, E., Makridakis, S., Semenoglou, AA. et al. Comparison of statistical and machine learning methods for daily SKU demand forecasting. Oper Res Int J 22, 3037–3061 (2022). https://doi.org/10.1007/s12351-020-00605-2

  • DOI: https://doi.org/10.1007/s12351-020-00605-2
