
Continuous-Action Reinforcement Learning for Portfolio Allocation of a Life Insurance Company

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12978)

Abstract

The asset management of an insurance company is more complex than traditional portfolio management due to the presence of obligations that the insurance company must fulfill toward its clients. These obligations, commonly referred to as liabilities, are payments whose magnitude and occurrence depend both on the insurance contracts signed with the clients and on the portfolio's performance.

In particular, while clients must be refunded in case of adverse events, such as car accidents or death, they also contribute to a common financial portfolio so as to earn annual returns. Customer withdrawals may increase whenever these returns are too low; moreover, in the presence of an annual minimum guarantee, the company must make up the difference whenever returns fall short of it. Hence, in this context, no investment strategy can ignore the interdependency between financial assets and liabilities.
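
To make the guarantee mechanism concrete, here is a small worked illustration; the notation (portfolio return r_t, guaranteed rate g) is ours and does not appear in the paper.

```latex
% Illustrative only: notation (r_t, g) is ours, not taken from the paper.
% The client is credited at least the guaranteed rate, and the company
% absorbs the shortfall whenever the portfolio underperforms the guarantee.
r_t^{\mathrm{client}} = \max(r_t,\, g),
\qquad
\mathrm{shortfall}_t = \max(g - r_t,\, 0)
% Example: with g = 1\% and r_t = 0.2\%, the company tops up 0.8\%.
```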

To deal with this problem, we present a stochastic model that combines portfolio returns with the liabilities generated by the insurance products offered by the company. Furthermore, we propose a risk-adjusted optimization problem to maximize the capital of the company over a pre-determined time horizon.
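
The abstract does not state the objective in closed form; as an illustration only, a standard mean-variance instance of such a risk-adjusted capital-maximization criterion over a horizon T could read:

```latex
% An assumption for illustration: one common risk-adjusted objective of
% this kind, not necessarily the exact criterion optimized in the paper.
% C_T^\pi is the company's capital at horizon T under allocation policy \pi,
% and \lambda > 0 trades expected capital against its variability.
\max_{\pi}\;\; \mathbb{E}\!\left[C_T^{\pi}\right] \;-\; \lambda\,\mathrm{Var}\!\left[C_T^{\pi}\right]
```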

Since traditional financial tools are inadequate for such a setting, we develop the model as a Markov Decision Process. In this way, we can use Reinforcement Learning algorithms to solve the underlying optimization problem. Finally, we provide experiments that show how the optimal asset allocation can be found by training an agent with the algorithm Deep Deterministic Policy Gradient.
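
To ground the pipeline the abstract describes, below is a minimal self-contained PyTorch sketch: a DDPG agent (actor, critic, replay buffer, soft target updates) trained on a toy asset-liability environment with a minimum-guarantee liability. The environment dynamics, reward shaping, network sizes, and hyperparameters are all illustrative assumptions, not the authors' actual model.

```python
# A minimal, self-contained sketch of the setup described in the abstract:
# a DDPG agent (Lillicrap et al., 2015) allocating a portfolio in a toy
# asset-liability environment. Everything here (dynamics, reward,
# hyperparameters) is an illustrative assumption, not the paper's model.
import copy, random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

N_ASSETS, HORIZON = 4, 10
GAMMA, TAU, LAMBDA = 0.99, 0.005, 2.0

class ToyALMEnv:
    """Wealth compounds with risky returns; whenever the portfolio return
    falls below the guaranteed rate g, the insurer pays the shortfall."""
    def __init__(self, g=0.01, seed=0):
        self.g, self.rng = g, np.random.default_rng(seed)
    def reset(self):
        self.t, self.wealth = 0, 1.0
        return self._obs()
    def _obs(self):
        return np.array([self.wealth, self.t / HORIZON], dtype=np.float32)
    def step(self, w):
        r = self.rng.normal(0.02, 0.05, N_ASSETS)       # one-period asset returns
        ret = float(w @ r)
        shortfall = max(self.g - ret, 0.0)              # guarantee top-up paid by firm
        self.wealth *= 1.0 + ret - shortfall
        self.t += 1
        reward = ret - shortfall - LAMBDA * ret ** 2    # crude risk adjustment
        return self._obs(), reward, self.t >= HORIZON

def mlp(din, dout):
    return nn.Sequential(nn.Linear(din, 64), nn.ReLU(), nn.Linear(64, dout))

actor = nn.Sequential(mlp(2, N_ASSETS), nn.Softmax(dim=-1))  # long-only weights
critic = mlp(2 + N_ASSETS, 1)
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
buffer = deque(maxlen=50_000)

def act(s, noise=0.1):
    with torch.no_grad():
        w = actor(torch.as_tensor(s)).numpy()
    w = np.clip(w + noise * np.random.randn(N_ASSETS), 1e-6, None)
    return w / w.sum()                                  # renormalize after noise

env = ToyALMEnv()
for episode in range(200):
    s, done = env.reset(), False
    while not done:
        a = act(s)
        s2, rwd, done = env.step(a)
        buffer.append((s, a, rwd, s2, float(done)))
        s = s2
        if len(buffer) < 256:
            continue
        S, A, R, S2, D = map(
            lambda x: torch.as_tensor(np.array(x), dtype=torch.float32),
            zip(*random.sample(buffer, 128)))
        with torch.no_grad():                           # Bellman target
            q2 = critic_t(torch.cat([S2, actor_t(S2)], 1)).squeeze(1)
            y = R + GAMMA * (1 - D) * q2
        q = critic(torch.cat([S, A], 1)).squeeze(1)
        loss_c = nn.functional.mse_loss(q, y)
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()
        loss_a = -critic(torch.cat([S, actor(S)], 1)).mean()
        opt_a.zero_grad(); loss_a.backward(); opt_a.step()
        for net, tgt in ((actor, actor_t), (critic, critic_t)):
            for p, pt in zip(net.parameters(), tgt.parameters()):
                pt.data.mul_(1 - TAU).add_(TAU * p.data)  # soft target update
print("final-episode wealth:", env.wealth)
```

The softmax output layer is one simple way to encode the continuous allocation constraint: it keeps the action on the portfolio simplex (non-negative weights summing to one), so every action the agent emits is a valid long-only allocation.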


Notes

  1. The other three assets are omitted as they go to zero very quickly.


Acknowledgements

The research was conducted under a cooperative agreement between ISI Foundation, Intesa Sanpaolo Innovation Center, and Intesa Sanpaolo Vita. The authors would like to thank Lauretta Filangieri, Antonino Galatà, Giuseppe Loforese, Pietro Materozzi and Luigi Ruggerone for their useful comments.

Author information

Correspondence to Alan Perotti.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Abrate, C. et al. (2021). Continuous-Action Reinforcement Learning for Portfolio Allocation of a Life Insurance Company. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_15

  • DOI: https://doi.org/10.1007/978-3-030-86514-6_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86513-9

  • Online ISBN: 978-3-030-86514-6

  • eBook Packages: Computer Science (R0)
