
Online sales prediction via trend alignment-based multitask recurrent neural networks

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

As business trends constantly evolve, the timely prediction of sales volume offers valuable information for companies to achieve a healthy balance between supply and demand. In practice, sales prediction is formulated as a time series prediction problem that aims to predict future sales volume for different products from observations of various influential factors (e.g., brand, season, discount) and the corresponding historical sales records. To perform accurate sales prediction under the offline setting, we draw insights from the encoder–decoder recurrent neural network (RNN) structure and propose a novel framework named TADA (Chen et al., in: ICDM, 2018), which carries out trend alignment with dual-attention, multitask RNNs for sales prediction. However, sales data accumulates quickly and is updated regularly, making it difficult for a trained model to maintain prediction accuracy on new data. In this light, we further extend the model into TADA\(^+\), which is enhanced by an online learning module based on our innovative similarity-based reservoir. To construct the data reservoir for model retraining, unlike most existing random sampling-based reservoirs, our similarity-based reservoir selects the data samples from which it is "hard" for the model to mine apparent dynamic patterns. Experimental results on two real-world datasets comprehensively demonstrate the superiority of TADA and TADA\(^+\) in both online and offline sales prediction tasks over other state-of-the-art competitors.
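The abstract describes the similarity-based reservoir only at a high level. As a rough illustration of the general idea, the sketch below maintains a fixed-size pool that retains the "hardest" samples, scored here by a per-sample difficulty value such as the model's prediction error, in contrast to uniform random reservoir sampling. The class name, the hardness score, and the eviction rule are assumptions for illustration, not the paper's actual construction.

```python
import heapq

class HardnessReservoir:
    """Fixed-capacity reservoir that retains the samples with the
    largest hardness scores, evicting the easiest retained sample
    when a harder one arrives. Illustrative sketch only: the paper's
    reservoir measures hardness via trend similarity, not shown here.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []      # min-heap of (hardness, tiebreak, sample)
        self._counter = 0    # tiebreaker so samples are never compared directly

    def offer(self, sample, hardness):
        """Consider one incoming sample with its difficulty score."""
        entry = (hardness, self._counter, sample)
        self._counter += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif hardness > self._heap[0][0]:
            # Evict the currently "easiest" retained sample.
            heapq.heapreplace(self._heap, entry)

    def samples(self):
        """Return the retained samples for model retraining."""
        return [s for _, _, s in self._heap]
```

For example, offering five samples with hardness scores 0.1, 0.9, 0.5, 0.8 and 0.05 to a capacity-3 reservoir retains the three with scores 0.9, 0.5 and 0.8; a random reservoir would instead keep each of the five with equal probability.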


Notes

  1. https://www.kaggle.com/c/favorita-grocery-sales-forecasting/data.

  2. https://www.onestopwarehouse.com.au.

  3. https://developers.googleblog.com/2017/11/introducing-tensorflow-feature-columns.html.

References

  1. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv:1409.0473

  2. Bao W, Yue J, Rao Y (2017) A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLoS ONE 12(7):e0180944

  3. Bengio S, Vinyals O, Jaitly N, Shazeer N (2015) Scheduled sampling for sequence prediction with recurrent neural networks. In: NIPS, pp 1171–1179

  4. Box GE, Jenkins GM, Reinsel GC, Ljung GM (2015) Time series analysis: forecasting and control. Wiley, Hoboken

  5. Caballero Barajas KL, Akella R (2015) Dynamically modeling patient’s health state from electronic medical records: a time series approach. In: SIGKDD, pp 69–78

  6. Carbonneau R, Laframboise K, Vahidov R (2008) Application of machine learning techniques for supply chain demand forecasting. Eur J Oper Res 184(3):1140–1154

  7. Chen C, Yin H, Yao J, Cui B (2013) Terec: a temporal recommender system over tweet stream. VLDB Endow 6(12):1254–1257

  8. Chen H, Yin H, Chen T, Nguyen QVH, Peng WC, Li X (2019) Exploiting centrality information with graph convolutions for network representation learning. In: ICDE, pp 590–601

  9. Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: SIGKDD. ACM, pp 785–794

  10. Chen T, Yin H, Chen H, Wu L, Wang H, Zhou X, Li X (2018) Tada: trend alignment with dual-attention multi-task recurrent neural networks for sales prediction. In: ICDM, pp 49–58

  11. Chen T, Yin H, Chen H, Yan R, Nguyen QVH, Li X (2019) Air: attentional intention-aware recommender systems. In: ICDE, pp 304–315

  12. Chiu B, Keogh E, Lonardi S (2003) Probabilistic discovery of time series motifs. In: SIGKDD. ACM, pp 493–498

  13. Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder–decoder approaches. arXiv:1409.1259

  14. Diaz-Aviles E, Drumond L, Schmidt-Thieme L, Nejdl W (2012) Real-time top-n recommendation in social streams. In: RecSys, pp 59–66

  15. Dong D, Wu H, He W, Yu D, Wang H (2015) Multi-task learning for multiple language translation. In: ACL, pp 1723–1732

  16. Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232

  17. Graves A, Jaitly N (2014) Towards end-to-end speech recognition with recurrent neural networks. In: ICML, pp 1764–1772

  18. Guo L, Yin H, Wang Q, Chen T, Zhou A, Hung NQV (2019) Streaming session-based recommendation. In: SIGKDD, pp 1569–1577

  19. Hamilton JD (1994) Time series analysis, vol 2. Princeton University Press, Princeton

  20. Heigold G, Vanhoucke V, Senior A, Nguyen P, Ranzato M, Devin M, Dean J (2013) Multilingual acoustic models using distributed deep neural networks. In: ICASSP, pp 8619–8623

  21. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  22. Huang JT, Li J, Yu D, Deng L, Gong Y (2013) Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers. In: ICASSP, pp 7304–7308

  23. Hung NQV, Duong CT, Tam NT, Weidlich M, Aberer K, Yin H, Zhou X (2017) Argument discovery via crowdsourcing. VLDB J 26(4):511–535

  24. Idé T, Kato S (2009) Travel-time prediction using Gaussian process regression: a trajectory-based approach. In: SDM, pp 1185–1196

  25. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980

  26. Kochak A, Sharma S (2015) Demand forecasting using neural network for supply chain management. Int J Mech Eng Robot Res 4(1):96–104

  27. Kullback S (1997) Information theory and statistics. Courier Corporation, New York

  28. Lai G, Chang WC, Yang Y, Liu H (2018) Modeling long- and short-term temporal patterns with deep neural networks. In: SIGIR, pp 95–104

  29. Lawhern VJ, Solon AJ, Waytowich NR, Gordon SM, Hung CP, Lance BJ (2018) Eegnet: a compact convolutional neural network for eeg-based brain-computer interfaces. J Neural Eng 15(5):056013

  30. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436

  31. Liu X, Gao J, He X, Deng L, Duh K, Wang YY (2015) Representation learning using multi-task deep neural networks for semantic classification and information retrieval. In: NAACL, pp 912–921

  32. Luong MT, Le QV, Sutskever I, Vinyals O, Kaiser L (2015) Multi-task sequence to sequence learning. arXiv:1511.06114

  33. Mehrotra R, Awadallah AH, Shokouhi M, Yilmaz E, Zitouni I, El Kholy A, Khabsa M (2017) Deep sequential models for task satisfaction prediction. In: CIKM. ACM, pp 737–746

  34. Nguyen TT, Duong CT, Weidlich M, Yin H, Nguyen QVH (2017) Retaining data from streams of social platforms with minimal regret. In: IJCAI, pp 2850–2856

  35. Papadimitriou S, Sun J, Faloutsos C (2005) Streaming pattern discovery in multiple time-series. In: VLDB, pp 697–708

  36. Parikh AP, Täckström O, Das D, Uszkoreit J (2016) A decomposable attention model for natural language inference. In: EMNLP, pp 2249–2255

  37. Qin Y, Song D, Cheng H, Cheng W, Jiang G, Cottrell G (2017) A dual-stage attention-based recurrent neural network for time series prediction. In: IJCAI, pp 2627–2633

  38. Ristanoski G, Liu W, Bailey J (2013) Time series forecasting using distribution enhanced linear regression. In: PAKDD, pp 484–495

  39. Rousseeuw PJ, Leroy AM (2005) Robust regression and outlier detection, vol 589. Wiley, Hoboken

  40. Shokouhi M (2011) Detecting seasonal queries by time-series analysis. In: SIGIR. ACM, pp 1171–1172

  41. Sordoni A, Bengio Y, Vahabi H, Lioma C, Grue Simonsen J, Nie JY (2015) A hierarchical recurrent encoder–decoder for generative context-aware query suggestion. In: CIKM. ACM, pp 553–562

  42. Sun K, Qian T, Yin H, Chen T, Chen Y, Chen L (2019) What can history tell us? Identifying relevant sessions for next-item recommendation. In: CIKM

  43. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: NIPS, pp 3104–3112

  44. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: NIPS, pp 5998–6008

  45. Wang Q, Yin H, Hu Z, Lian D, Wang H, Huang Z (2018) Neural memory streaming recommender networks with adversarial training. In: SIGKDD, pp 2467–2475

  46. Wang W, Yin H, Huang Z, Wang Q, Du X, Nguyen QVH (2018) Streaming ranking based recommender systems. In: SIGIR, pp 525–534

  47. Wang Y, Yin H, Chen H, Wo T, Xu J, Zheng K (2019) Origin-destination matrix prediction via graph convolution: a new perspective of passenger demand modeling. In: SIGKDD, pp 1227–1235

  48. Wilson A, Adams R (2013) Gaussian process kernels for pattern discovery and extrapolation. In: ICML, pp 1067–1075

  49. Wu Y, Hernández-Lobato JM, Ghahramani Z (2013) Dynamic covariance models for multivariate financial time series. In: ICML, pp 558–566

  50. Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: neural image caption generation with visual attention. In: ICML, pp 2048–2057

  51. Yan W, Qiu H, Xue Y (2009) Gaussian process for long-term time-series forecasting. In: IJCNN. IEEE, pp 3420–3427

  52. Yao D, Zhang C, Huang J, Bi J (2017) Serm: a recurrent model for next location prediction in semantic trajectories. In: CIKM, pp 2411–2414

  53. Yin H, Chen H, Sun X, Wang H, Wang Y, Nguyen QVH (2017) Sptf: a scalable probabilistic tensor factorization model for semantic-aware behavior prediction. In: ICDM, pp 585–594

  54. Yin H, Cui B, Zhou X, Wang W, Huang Z, Sadiq S (2016) Joint modeling of user check-in behaviors for real-time point-of-interest recommendation. TOIS 35(2):11

  55. Zhang S, Yin H, Wang Q, Chen T, Chen H, Nguyen QVH (2019) Inferring substitutable products with deep network embedding. In: IJCAI-19, pp 4306–4312

  56. Zhang Y, Yang Q (2017) A survey on multi-task learning. arXiv:1707.08114

  57. Zheng X, Han J, Sun A (2018) A survey of location prediction on twitter. TKDE 30(9):1652–1671

  58. Zhou J, Tung AK (2015) Smiler: a semi-lazy time series prediction system for sensors. In: SIGMOD. ACM, pp 1871–1886

Acknowledgements

This work is supported by the Australian Research Council (Grant Nos. DP190101985 and DP170103954), The University of Queensland (Grant No. 613134) and the Natural Science Foundation of China (Grant No. 6167250).

Author information

Correspondence to Hongzhi Yin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, T., Yin, H., Chen, H. et al. Online sales prediction via trend alignment-based multitask recurrent neural networks. Knowl Inf Syst 62, 2139–2167 (2020). https://doi.org/10.1007/s10115-019-01404-8

