Abstract
Financial portfolio management is the continual reallocation of capital across financial assets, with the goal of maximizing profit under a given level of risk. Since AlphaGo defeated professional human players, deep reinforcement learning (DRL) algorithms have been widely applied in many fields, including quantitative trading. Multi-agent systems are a relatively new research branch of DRL, and in most cases they outperform a single agent. In this paper, we propose a novel multi-agent deep reinforcement learning algorithm with trend consistency regularization (TC-MARL) to find the optimal portfolio. We divide the trend of the stocks in a portfolio into two categories and train two different agents to learn the optimal trading strategy under each trend regime. First, we build a trend consistency (TC) factor that recognizes whether the trends of the stocks in a portfolio are consistent: when the trends are consistent, the factor is defined as 1; when they are inconsistent, it is defined as \(-1\). Based on this factor, a novel regularization term related to the portfolio weights, named TC regularization, is added to the reward function, with the TC factor value serving as the sign of the regularization term. In this way, two agents with different reward functions are constructed, sharing the same policy model and value model. The proposed TC-MARL algorithm then dynamically switches between the two trained agents according to the market status to find the optimal portfolio strategy. Extensive experimental results on the Chinese stock market demonstrate the effectiveness of the proposed algorithm.
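As a rough illustration of the mechanism described in the abstract, the sketch below computes a TC factor that is 1 when all stocks in the portfolio trend in the same direction over a look-back window and \(-1\) otherwise, and then adds a weight-dependent regularization term whose sign is given by the TC factor. The function names, the look-back rule, the coefficient `lam`, and the squared-weight "concentration" regularizer are illustrative assumptions for this sketch only; the paper defines its own form of the weight-related regularization.

```python
import numpy as np

def tc_factor(prices: np.ndarray) -> int:
    """Illustrative TC factor: +1 if every stock in the look-back window
    trends in the same direction (all rising or all falling), else -1.
    `prices` has shape (T, n_assets)."""
    returns = prices[-1] / prices[0] - 1.0          # trend over the window
    if np.all(returns > 0) or np.all(returns < 0):  # consistent trend across the portfolio
        return 1
    return -1

def tc_regularized_reward(base_reward: float,
                          weights: np.ndarray,
                          tc: int,
                          lam: float = 0.1) -> float:
    """Reward plus a hypothetical weight-based regularizer; the TC factor
    sets the sign of the regularization term, as in the paper's idea."""
    concentration = np.sum(weights ** 2)            # illustrative weight-dependent term
    return base_reward + tc * lam * concentration

# Toy usage: three assets, five time steps of closing prices
prices = np.array([[10.0, 20.0, 30.0],
                   [10.2, 20.5, 30.3],
                   [10.4, 20.9, 30.9],
                   [10.5, 21.2, 31.2],
                   [10.8, 21.6, 31.8]])
weights = np.array([0.5, 0.3, 0.2])
tc = tc_factor(prices)                              # +1 here: all three stocks rose
print(tc, tc_regularized_reward(0.02, weights, tc))
```

In this toy run the two agents would simply correspond to fixing the sign `tc` at \(+1\) or \(-1\) in the reward, which is how the sketch mirrors the two reward functions described above.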





Data availability
All data are publicly available. Users can download the related data from the JoinQuant website (https://www.joinquant.com) or other financial databases.
Acknowledgements
This work was supported in part by the Ministry of Education of Humanities and Social Science Project of China (No. 22XJCZH004), in part by the National Natural Science Foundation of China (Nos. 12201497, 61976174), in part by the Scientific Research Project of Shaanxi Provincial Department of Education (Nos. 22JK0186, 21JK0379), and in part by the Fundamental Research Funds for the Central Universities (No. D5000220060).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human or animal participants
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: The summary of notations
Here, we list all necessary notations used in this paper, as shown in Table 6.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ma, C., Zhang, J., Li, Z. et al. Multi-agent deep reinforcement learning algorithm with trend consistency regularization for portfolio management. Neural Comput & Applic 35, 6589–6601 (2023). https://doi.org/10.1007/s00521-022-08011-9