Abstract
Stock trading is an active research topic in economics. Over the past decades, many researchers have used machine learning methods to predict either the short-term price of stocks or their long-term trend. However, the risk of stock trading can be better reduced only by considering the two together. This paper models stock trading as an incomplete-information game and proposes a deep reinforcement learning framework for training trading agents. To make good use of the temporal relations in stock data, we select the state-of-the-art Temporal Convolutional Network and Transformer as the policy networks, and use TRPO and PPO for policy optimization. We propose a reward function that integrates short-term price prediction and long-term trend prediction with controllable risk to compute the utility of the agent's actions, which allows the agent to learn low-risk trading strategies. Trading experiments on the Standard & Poor's 500 ETF (S&P 500 index) validate the proposed method: the learned strategies outperform the S&P 500 ETF baseline strategy on economic indicators (maximum drawdown, Sharpe ratio, and return curve).
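For readers unfamiliar with the maximum drawdown and Sharpe ratio named in the abstract, the following is a minimal sketch of their standard textbook definitions. It is not the authors' evaluation code: the function names, the zero risk-free rate, and the 252-trading-day annualization factor are assumptions made for illustration.

```python
import math


def simple_returns(equity):
    """Per-period simple returns of an equity curve (list of portfolio values)."""
    return [b / a - 1.0 for a, b in zip(equity, equity[1:])]


def max_drawdown(equity):
    """Largest peak-to-trough decline, expressed as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # running maximum so far
        worst = max(worst, (peak - value) / peak)  # decline from that maximum
    return worst


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns.

    Assumptions (not from the paper): risk_free is a per-period rate,
    the sample standard deviation is used, and 252 periods make a year.
    Raises ZeroDivisionError if all returns are identical.
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical equity curve, for illustration only.
    curve = [100.0, 120.0, 90.0, 110.0, 80.0]
    print("max drawdown:", max_drawdown(curve))
    print("Sharpe ratio:", sharpe_ratio(simple_returns(curve)))
```

A lower maximum drawdown and a higher Sharpe ratio both indicate a more favorable risk profile, which is the sense in which the abstract compares the learned strategies with the baseline.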
Supported by the National Key Research and Development Project under Grant 2018AAA01008-02.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhou, Q., Qu, T., Han, Y., Duan, F. (2023). Deep Reinforcement Learning with Comprehensive Reward for Stock Trading. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1794. Springer, Singapore. https://doi.org/10.1007/978-981-99-1648-1_44
DOI: https://doi.org/10.1007/978-981-99-1648-1_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1647-4
Online ISBN: 978-981-99-1648-1
eBook Packages: Computer Science (R0)