Abstract
Stock movement prediction is one of the most challenging problems in time series analysis because of the stochastic nature of financial markets. In recent years, a plethora of statistical methods and machine learning algorithms have been proposed for stock movement prediction, and deep learning models in particular are increasingly applied to the task. The success of deep learning models relies on the assumption that massive training data are available. However, this assumption is often impractical for stock movement prediction: a large number of stocks do not have enough historical data, especially those of companies that underwent an initial public offering in recent years. In these situations, the accuracy of deep learning models in predicting stock movement can degrade. To address this problem, we propose novel instance-based deep transfer learning models with an attention mechanism. In the experiments, we compare our proposed methods with state-of-the-art prediction models. Experimental results on three public datasets reveal that our proposed methods significantly improve the performance of deep learning models when limited training data are available.




Acknowledgements
This research was funded by University of Macau (File no. MYRG2019-00136-FST).
Appendices
Appendix A: ACL Dataset
Appendix B: KDD Dataset
Cite this article
He, QQ., Siu, S.W.I. & Si, YW. Instance-based deep transfer learning with attention for stock movement prediction. Appl Intell 53, 6887–6908 (2023). https://doi.org/10.1007/s10489-022-03755-2
DOI: https://doi.org/10.1007/s10489-022-03755-2