Continual portfolio selection in dynamic environments via incremental reinforcement learning

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Portfolio selection, an important topic in the finance community, has attracted increasing attention from artificial intelligence practitioners. Recently, the reinforcement learning (RL) paradigm, with its self-learning and model-free properties, has emerged as a promising candidate for solving complex portfolio selection tasks. Traditional research on RL-based portfolio selection focuses on batch-mode stationary problems, where all market data is assumed to be available for a one-time training process. However, real-world financial markets are often dynamic: streaming data increments carrying new patterns keep emerging continually. In this paper, we address the continual portfolio selection problem in such dynamic environments. We propose an incremental RL approach with a two-step solution for efficiently adjusting the existing portfolio policy to a new one when the market changes as a new data increment arrives. The first step, policy relaxation, forces the agent to execute a relaxed policy, encouraging sufficient exploration of the new market. The second step, importance weighting, emphasizes learning samples that carry more new information, stimulating the existing portfolio policy to adapt more rapidly to the new market. Evaluation results on real-world portfolio tasks verify the effectiveness and superiority of our method for continual portfolio selection in dynamic environments.
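
To make the two-step solution concrete, the sketch below shows one way the pieces could fit together in Python/NumPy. It is a minimal illustration, not the paper's implementation: the high-temperature softmax standing in for policy relaxation, the distance-based novelty scores standing in for importance weighting, and all names and parameters (`PortfolioAgent`, `tau_relax`, the learning rate) are assumptions made for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z, tau=1.0):
    """Temperature-scaled softmax; larger tau gives flatter, more exploratory weights."""
    z = np.asarray(z, dtype=float) / tau
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()


class PortfolioAgent:
    """Linear softmax policy mapping market features to long-only portfolio weights."""

    def __init__(self, n_assets, n_features, lr=0.05):
        self.theta = np.zeros((n_assets, n_features))
        self.lr = lr

    def weights(self, x, tau=1.0):
        return softmax(self.theta @ x, tau)

    def adapt(self, increment, old_increment=None, tau_relax=3.0, n_epochs=20):
        """Adjust the existing policy to a new data increment in two steps."""
        X, returns = increment  # X: (T, d) features; returns: (T, n) per-asset returns

        # Step 2 prep (importance weighting): score each new sample by its
        # distance from the previous increment's mean feature vector, so
        # samples carrying more new information get larger learning weights.
        if old_increment is not None:
            novelty = np.linalg.norm(X - old_increment[0].mean(axis=0), axis=1)
            iw = novelty * len(X) / novelty.sum()  # normalize to mean weight 1
        else:
            iw = np.ones(len(X))

        for _ in range(n_epochs):
            for t in range(len(X)):
                # Step 1 (policy relaxation): act under a flattened,
                # high-temperature policy to explore the new market.
                w = self.weights(X[t], tau=tau_relax)
                # Gradient ascent on the one-step portfolio return w @ returns[t],
                # propagated through the softmax Jacobian and scaled by the
                # sample's importance weight.
                J = np.diag(w) - np.outer(w, w)
                grad = np.outer(J @ returns[t], X[t]) / tau_relax
                self.theta += self.lr * iw[t] * grad


# Toy usage: an initial market increment, then a shifted "new market" increment.
T, n_assets, d = 200, 4, 6
old = (rng.normal(0.0, 1.0, (T, d)), rng.normal(0.001, 0.01, (T, n_assets)))
new = (rng.normal(1.5, 1.0, (T, d)), rng.normal(0.002, 0.01, (T, n_assets)))

agent = PortfolioAgent(n_assets, d)
agent.adapt(old)                     # train on the initial increment
agent.adapt(new, old_increment=old)  # continually adapt to the new increment
print(agent.weights(new[0][-1]))     # portfolio weights for the latest state
```

On this toy data, the relaxed high-temperature policy spreads weight across assets while the agent explores the shifted regime, and the novelty-based weights concentrate updates on the samples that differ most from the old increment's statistics.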

Notes

  1. https://www.kaggle.com/camnugent/sandp500.

  2. Poloniex’s official API: https://poloniex.com/support/api/.


Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant no. 62006111, and the Natural Science Foundation of Jiangsu Province of China under Grant BK20200330.

Author information

Corresponding author

Correspondence to Zhi Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, S., Wang, B., Li, H. et al. Continual portfolio selection in dynamic environments via incremental reinforcement learning. Int. J. Mach. Learn. & Cyber. 14, 269–279 (2023). https://doi.org/10.1007/s13042-022-01639-y
