Abstract
While business trends evolve constantly, timely prediction of sales volume offers valuable information for companies striving to balance supply and demand. In practice, sales prediction is formulated as a time series prediction problem that aims to forecast future sales volumes for different products, given observations of various influential factors (e.g., brand, season, discount) and the corresponding historical sales records. To perform accurate sales prediction in the offline setting, we draw insights from the encoder–decoder recurrent neural network (RNN) structure and propose a novel framework named TADA (Chen et al., in: ICDM, 2018), which carries out trend alignment with dual-attention, multitask RNNs for sales prediction. However, sales data accumulates quickly and is updated regularly, making it difficult for a trained model to maintain its prediction accuracy on new data. In this light, we further extend the model into TADA\(^+\), which is enhanced by an online learning module built on our novel similarity-based reservoir. To construct the data reservoir for model retraining, unlike most existing random sampling-based reservoirs, our similarity-based reservoir selects the data samples from which it is "hard" for the model to mine apparent dynamic patterns. Experimental results on two real-world datasets comprehensively show the superiority of TADA and TADA\(^+\) over other state-of-the-art competitors in both offline and online sales prediction tasks.
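To make the reservoir idea concrete, the following is a minimal sketch of a "hardness"-prioritized reservoir, assuming hardness is approximated by the model's prediction error on each sample; the class name, interface, and this scoring choice are illustrative assumptions, not the exact similarity criterion used by TADA\(^+\).

```python
import heapq


class SimilarityReservoir:
    """Fixed-size reservoir that retains the samples a model finds
    'hardest' (here: largest prediction error), instead of sampling
    uniformly at random. A simplified illustration of the idea."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []       # min-heap of (hardness, counter, sample)
        self._counter = 0     # tie-breaker so samples are never compared

    def offer(self, sample, hardness):
        """Admit `sample` if there is room, or if it is harder than the
        easiest sample currently stored (which it then evicts)."""
        entry = (hardness, self._counter, sample)
        self._counter += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif hardness > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)

    def samples(self):
        """Return the retained samples for model retraining."""
        return [s for _, _, s in self._heap]
```

In an online loop, each incoming observation would be scored (e.g., by the current model's prediction error) and offered to the reservoir, and the model would periodically be retrained on `samples()` rather than on a uniform random subset of the stream.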
References
Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv:1409.0473
Bao W, Yue J, Rao Y (2017) A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLoS ONE 12(7):e0180944
Bengio S, Vinyals O, Jaitly N, Shazeer N (2015) Scheduled sampling for sequence prediction with recurrent neural networks. In: NIPS, pp 1171–1179
Box GE, Jenkins GM, Reinsel GC, Ljung GM (2015) Time series analysis: forecasting and control. Wiley, Hoboken
Caballero Barajas KL, Akella R (2015) Dynamically modeling patient’s health state from electronic medical records: a time series approach. In: SIGKDD, pp 69–78
Carbonneau R, Laframboise K, Vahidov R (2008) Application of machine learning techniques for supply chain demand forecasting. Eur J Oper Res 184(3):1140–1154
Chen C, Yin H, Yao J, Cui B (2013) Terec: a temporal recommender system over tweet stream. VLDB Endow 6(12):1254–1257
Chen H, Yin H, Chen T, Nguyen QVH, Peng WC, Li X (2019) Exploiting centrality information with graph convolutions for network representation learning. In: ICDE, pp 590–601
Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: SIGKDD. ACM, pp 785–794
Chen T, Yin H, Chen H, Wu L, Wang H, Zhou X, Li X (2018) Tada: trend alignment with dual-attention multi-task recurrent neural networks for sales prediction. In: ICDM, pp 49–58
Chen T, Yin H, Chen H, Yan R, Nguyen QVH, Li X (2019) Air: attentional intention-aware recommender systems. In: ICDE, pp 304–315
Chiu B, Keogh E, Lonardi S (2003) Probabilistic discovery of time series motifs. In: SIGKDD. ACM, pp 493–498
Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder–decoder approaches. arXiv:1409.1259
Diaz-Aviles E, Drumond L, Schmidt-Thieme L, Nejdl W (2012) Real-time top-n recommendation in social streams. In: RecSys, pp 59–66
Dong D, Wu H, He W, Yu D, Wang H (2015) Multi-task learning for multiple language translation. In: ACL, pp 1723–1732
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232
Graves A, Jaitly N (2014) Towards end-to-end speech recognition with recurrent neural networks. In: ICML, pp 1764–1772
Guo L, Yin H, Wang Q, Chen T, Zhou A, Hung NQV (2019) Streaming session-based recommendation. In: SIGKDD, pp 1569–1577
Hamilton JD (1994) Time series analysis, vol 2. Princeton University Press, Princeton
Heigold G, Vanhoucke V, Senior A, Nguyen P, Ranzato M, Devin M, Dean J (2013) Multilingual acoustic models using distributed deep neural networks. In: ICASSP, pp 8619–8623
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Huang JT, Li J, Yu D, Deng L, Gong Y (2013) Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers. In: ICASSP, pp 7304–7308
Hung NQV, Duong CT, Tam NT, Weidlich M, Aberer K, Yin H, Zhou X (2017) Argument discovery via crowdsourcing. VLDB J 26(4):511–535
Idé T, Kato S (2009) Travel-time prediction using Gaussian process regression: a trajectory-based approach. In: SDM. SIAM, pp 1185–1196
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Kochak A, Sharma S (2015) Demand forecasting using neural network for supply chain management. Int J Mech Eng Robot Res 4(1):96–104
Kullback S (1997) Information theory and statistics. Courier Corporation, New York
Lai G, Chang WC, Yang Y, Liu H (2018) Modeling long-and short-term temporal patterns with deep neural networks. In: SIGIR, pp 95–104
Lawhern VJ, Solon AJ, Waytowich NR, Gordon SM, Hung CP, Lance BJ (2018) Eegnet: a compact convolutional neural network for eeg-based brain-computer interfaces. J Neural Eng 15(5):056013
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436
Liu X, Gao J, He X, Deng L, Duh K, Wang YY (2015) Representation learning using multi-task deep neural networks for semantic classification and information retrieval. In: NAACL, pp 912–921
Luong MT, Le QV, Sutskever I, Vinyals O, Kaiser L (2015) Multi-task sequence to sequence learning. arXiv:1511.06114
Mehrotra R, Awadallah AH, Shokouhi M, Yilmaz E, Zitouni I, El Kholy A, Khabsa M (2017) Deep sequential models for task satisfaction prediction. In: CIKM. ACM, pp 737–746
Nguyen TT, Duong CT, Weidlich M, Yin H, Nguyen QVH (2017) Retaining data from streams of social platforms with minimal regret. In: IJCAI, pp 2850–2856
Papadimitriou S, Sun J, Faloutsos C (2005) Streaming pattern discovery in multiple time-series. In: VLDB, pp 697–708
Parikh AP, Täckström O, Das D, Uszkoreit J (2016) A decomposable attention model for natural language inference. In: EMNLP, pp 2249–2255
Qin Y, Song D, Cheng H, Cheng W, Jiang G, Cottrell G (2017) A dual-stage attention-based recurrent neural network for time series prediction. In: IJCAI, pp 2627–2633
Ristanoski G, Liu W, Bailey J (2013) Time series forecasting using distribution enhanced linear regression. In: PAKDD, pp 484–495
Rousseeuw PJ, Leroy AM (2005) Robust regression and outlier detection, vol 589. Wiley, Hoboken
Shokouhi M (2011) Detecting seasonal queries by time-series analysis. In: SIGIR. ACM, pp 1171–1172
Sordoni A, Bengio Y, Vahabi H, Lioma C, Grue Simonsen J, Nie JY (2015) A hierarchical recurrent encoder–decoder for generative context-aware query suggestion. In: CIKM. ACM, pp 553–562
Sun K, Qian T, Yin H, Chen T, Chen Y, Chen L (2019) What can history tell us? Identifying relevant sessions for next-item recommendation. In: CIKM
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: NIPS, pp 3104–3112
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: NIPS, pp 5998–6008
Wang Q, Yin H, Hu Z, Lian D, Wang H, Huang Z (2018) Neural memory streaming recommender networks with adversarial training. In: SIGKDD, pp 2467–2475
Wang W, Yin H, Huang Z, Wang Q, Du X, Nguyen QVH (2018) Streaming ranking based recommender systems. In: SIGIR, pp 525–534
Wang Y, Yin H, Chen H, Wo T, Xu J, Zheng K (2019) Origin-destination matrix prediction via graph convolution: a new perspective of passenger demand modeling. In: SIGKDD, pp 1227–1235
Wilson A, Adams R (2013) Gaussian process kernels for pattern discovery and extrapolation. In: ICML, pp 1067–1075
Wu Y, Hernández-Lobato JM, Ghahramani Z (2013) Dynamic covariance models for multivariate financial time series. In: ICML, pp 558–566
Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: neural image caption generation with visual attention. In: ICML, pp 2048–2057
Yan W, Qiu H, Xue Y (2009) Gaussian process for long-term time-series forecasting. In: IJCNN. IEEE, pp 3420–3427
Yao D, Zhang C, Huang J, Bi J (2017) Serm: a recurrent model for next location prediction in semantic trajectories. In: CIKM, pp 2411–2414
Yin H, Chen H, Sun X, Wang H, Wang Y, Nguyen QVH (2017) Sptf: a scalable probabilistic tensor factorization model for semantic-aware behavior prediction. In: ICDM, pp 585–594
Yin H, Cui B, Zhou X, Wang W, Huang Z, Sadiq S (2016) Joint modeling of user check-in behaviors for real-time point-of-interest recommendation. TOIS 35(2):11
Zhang S, Yin H, Wang Q, Chen T, Chen H, Nguyen QVH (2019) Inferring substitutable products with deep network embedding. In: IJCAI-19, pp 4306–4312
Zhang Y, Yang Q (2017) A survey on multi-task learning. arXiv:1707.08114
Zheng X, Han J, Sun A (2018) A survey of location prediction on twitter. TKDE 30(9):1652–1671
Zhou J, Tung AK (2015) Smiler: a semi-lazy time series prediction system for sensors. In: SIGMOD. ACM, pp 1871–1886
Acknowledgements
This work is supported by Australian Research Council (Grant Nos. DP190101985, DP170103954), The University of Queensland (Grant No. 613134) and Natural Science Foundation of China (Grant No. 6167250).
Cite this article
Chen, T., Yin, H., Chen, H. et al. Online sales prediction via trend alignment-based multitask recurrent neural networks. Knowl Inf Syst 62, 2139–2167 (2020). https://doi.org/10.1007/s10115-019-01404-8