Skip to main content

Advertisement

Log in

Multi-agent deep reinforcement learning algorithm with trend consistency regularization for portfolio management

  • Original Article
  • Published:
Neural Computing and Applications Aims and scope Submit manuscript

Abstract

Financial portfolio management is reallocating the asset into financial products, whose goal is to maximize the profit under a certain risk. Since AlphaGo debated human professional players, deep reinforcement learning (DRL) algorithm has been widely used in various fields, including quantitative trading. The multi-agent system is a relatively new research branch in DRL, and its performance is better than that of a single agent in most cases. In this paper, we propose a novel multi-agent deep reinforcement learning algorithm with trend consistency regularization (TC-MARL) to find the optimal portfolio. Here, we divide the trend of stocks of one portfolio into two categories and train two different agents to learn the optimal trading strategy under these two stock trends. First, we build a trend consistency (TC) factor to recognize the consistency of several stocks from one portfolio. When the trend of these stocks is consistent, the factor is defined as 1; the trend is inconsistent, the factor is defined as \(-\) 1. Based on it, a novel regularization related to the weights is proposed and added to the reward function, named TC regularization. And the TC factor value is used as the sign of the regularization term. In this way, two agents with different reward functions are constructed, which have the same policy model and value model. Afterward, the proposed TC-MARL algorithm will dynamically switch between the two trained agents to find the optimal portfolio strategy according to the market status. Extensive experimental results on the Chinese Stock Market show the effectiveness of the proposed algorithm.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

Explore related subjects

Discover the latest articles and news from researchers in related subjects, suggested using machine learning.

Data availibility

All data are public. Users can download the related data from the JoinQuant website (https://www.joinquant.com) or other finance databases.

References

  1. Markowitz HM (1959) Portfolio selection: efficient diversification of investments. John Wiley, New York

    Google Scholar 

  2. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M et al (2016) Mastering the game of go with deep neural networks and tree search. Nature 529(7587):484–489

    Article  Google Scholar 

  3. Furuta R, Inoue N, Yamasaki T (2019) Pixelrl: fully convolutional network with reinforcement learning for image processing. IEEE Trans Multimed 22(7):1704–1719

    Article  Google Scholar 

  4. Gamrian S, Goldberg Y (2019) Transfer learning for related reinforcement learning tasks via image-to-image translation. In: International Conference on Machine Learning, pp 2063–2072 . PMLR

  5. Pan B, Yang Y, Zhao Z, Zhuang Y, Cai D, He X (2018) Discourse marker augmented network with reinforcement learning for natural language inference. In: Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: long papers), pp 989–999

  6. Zhong V, Xiong C, Socher R (2017) Seq2sql: generating structured queries from natural language using reinforcement learning. arXiv preprint arXiv:1709.00103

  7. Shi H, Lin Z, Zhang S, Li X, Hwang K-S (2018) An adaptive decision-making method with fuzzy Bayesian reinforcement learning for robot soccer. Inf Sci 436:268–281

    Article  MathSciNet  Google Scholar 

  8. Johannink T, Bahl S, Nair A, Luo J, Kumar A, Loskyll M, Ojea JA, Solowjow E, Levine S (2019) Residual reinforcement learning for robot control. In: 2019 international conference on robotics and automation (ICRA), pp 6023–6029. IEEE

  9. Ma C, Li Z, Lin D, Zhang J (2020) Parallel multi-environment shaping algorithm for complex multi-step task. Neurocomputing 402:323–335

    Article  Google Scholar 

  10. Zha D, Lai K.-H, Huang S, Cao Y, Reddy K, Vargas J, Nguyen A, Wei R, Guo J, Hu X (2020) RLCard: a platform for reinforcement learning in card games. In: IJCAI, pp 5264–5266

  11. Liu Y, Liu Q, Zhao H, Pan Z, Liu C (2020) Adaptive quantitative trading: an imitative deep reinforcement learning approach. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 2128–2135

  12. Lucarelli G, Borrotti M (2020) A deep Q-learning portfolio management framework for the cryptocurrency market. Neural Comput Appl 32(23):17229–17244

    Article  Google Scholar 

  13. Moody J, Wu L, Liao Y, Saffell M (1998) Performance functions and reinforcement learning for trading systems and portfolios. J Forecast 17(5–6):441–470

    Article  Google Scholar 

  14. Gao X, Chan L (2000) An algorithm for trading and portfolio management using q-learning and sharpe ratio maximization. In: Proceedings of the international conference on neural information processing, pp 832–837

  15. Almahdi S, Yang SY (2017) An adaptive portfolio trading system: a risk-return portfolio optimization using recurrent reinforcement learning with expected maximum drawdown. Expert Syst Appl 87:267–279

    Article  Google Scholar 

  16. Jiang Z, Liang J (2017) Cryptocurrency portfolio management with deep reinforcement learning. In: 2017 intelligent systems conference (IntelliSys), pp 905–913 . IEEE

  17. Jiang Z, Xu D, Liang J (2017) A deep reinforcement learning framework for the financial portfolio management problem. arXiv preprint arXiv:1706.10059

  18. Liang Z, Chen H, Zhu J, Jiang K, Li Y (2018) Adversarial deep reinforcement learning in portfolio management. arXiv preprint arXiv:1808.09940

  19. Almahdi S, Yang SY (2019) A constrained portfolio trading system using particle swarm algorithm and recurrent reinforcement learning. Expert Syst Appl 130:145–156

    Article  Google Scholar 

  20. Koratamaddi P, Wadhwani K, Gupta M, Sanjeevi DSG (2021) A multi-agent reinforcement learning approach for stock portfolio allocation. In: 8th ACM IKDD CODS and 26th COMAD, pp 410–410

  21. Lee J, Kim R, Yi SW, Kang J (2020) Maps: multi-agent reinforcement learning-based portfolio management system. In: 29th international joint conference on artificial intelligence, IJCAI 2020, pp 4520–4526. International joint conferences on artificial intelligence

  22. Lussange J, Lazarevich I, Bourgeois-Gironde S, Palminteri S, Gutkin B (2021) Modelling stock markets by multi-agent reinforcement learning. Comput Econ 57(1):113–147

    Article  Google Scholar 

  23. Huang Z, Tanaka F(2022) MSPM: A modularized and scalable multi-agent reinforcement learning-based system for financial portfolio management. Plos one 17(2): e0263689

    Article  Google Scholar 

  24. JoinQuant. https://www.joinquant.com

  25. Huang D, Zhou J, Li B, Hoi SC, Zhou S (2016) Robust median reversion strategy for online portfolio selection. IEEE Trans Knowl Data Eng 28(9):2480–2493

    Article  Google Scholar 

  26. Li B, Hoi SC, Sahoo D, Liu Z-Y (2015) Moving average reversion strategy for on-line portfolio selection. Artif Intell 222:104–123

    Article  MathSciNet  Google Scholar 

  27. Li B, Zhao P, Hoi SC, Gopalkrishnan V (2012) PAMR: passive aggressive mean reversion strategy for portfolio selection. Mach Learn 87(2):221–258

    Article  MathSciNet  MATH  Google Scholar 

  28. Shi S, Li J, Li G, Pan P (2019) A multi-scale temporal feature aggregation convolutional neural network for portfolio management. In: Proceedings of the 28th ACM international conference on information and knowledge management, pp 1613–1622

  29. Lim QYE, Cao Q, Quek C (2022) Dynamic portfolio rebalancing through reinforcement learning. Neural Comput Appl 34(9):7125–7139

    Article  Google Scholar 

  30. Bansal G, Nushi B, Kamar E, Lasecki WS, Weld DS, Horvitz E (2019) Beyond accuracy: the role of mental models in human-AI team performance. In: Proceedings of the AAAI conference on human computation and crowdsourcing, vol 7, pp 2–11

Download references

Acknowledgements

This work was supported in part by the Ministry of Education of Humanities and Social Science Project of China (No. 22XJCZH004), in part by the National Natural Science Foundation of China (Nos. 12201497, 61976174), in part by the Scientific Research Project of Shaanxi Provincial Department of Education (Nos. 22JK0186, 21JK0379), and in part by the Fundamental Ressearch Funds for the Central Universities (No. D5000220060).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Cong Ma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human or animal participants

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: The summary of notations

Appendix A: The summary of notations

Here, we list all necessary notations used in this paper, as shown in Table 6.

Table 6 The list of key notations

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ma, C., Zhang, J., Li, Z. et al. Multi-agent deep reinforcement learning algorithm with trend consistency regularization for portfolio management. Neural Comput & Applic 35, 6589–6601 (2023). https://doi.org/10.1007/s00521-022-08011-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00521-022-08011-9

Keywords