
A Survey on Reinforcement Learning and Deep Reinforcement Learning for Recommender Systems

  • Conference paper
  • In: Deep Learning Theory and Applications (DeLTA 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1875)


Abstract

Recommender systems play an increasingly prominent role in daily life. By suggesting and personalizing items, they help address the problem of information overload. Traditional recommender systems, suited to relatively simple prediction problems, include collaborative filtering, content-based filtering, and hybrid techniques. Newer approaches, in particular reinforcement learning, allow harder problems to be tackled by modeling recommendation as a Markov decision process, and recent advances in the field make it feasible to apply reinforcement learning even when the environment and state space are very large. This survey reviews the development of traditional and reinforcement learning-based methods, their evaluation, their challenges, and suggested directions for future research, followed by a discussion of reinforcement learning recommender systems.
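
To make the Markov decision process framing mentioned above concrete, the following is a minimal, illustrative sketch (not taken from the paper) of a recommender cast as an MDP and trained with tabular Q-learning. The catalogue size, the state encoding (the last item the user consumed), the reward signal, and the simulate_click user model are all simplified assumptions chosen for exposition.

```python
import random
from collections import defaultdict

# Minimal sketch: recommendation as an MDP solved with tabular Q-learning.
# State  = the last item the user consumed (a simplification).
# Action = the next item to recommend.
# Reward = 1 if the simulated user "clicks", else 0 (hypothetical user model).

N_ITEMS = 10
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate

def simulate_click(last_item, recommended_item):
    # Hypothetical user model: users click items "close" to the one they just consumed.
    return 1.0 if last_item != recommended_item and abs(last_item - recommended_item) <= 2 else 0.0

Q = defaultdict(float)  # Q[(state, action)] -> estimated long-term value

def choose_action(state):
    # Epsilon-greedy policy over all candidate items.
    if random.random() < EPSILON:
        return random.randrange(N_ITEMS)
    return max(range(N_ITEMS), key=lambda a: Q[(state, a)])

for episode in range(5000):
    state = random.randrange(N_ITEMS)      # user session starts from a random item
    for _ in range(20):                    # a short interaction session
        action = choose_action(state)
        reward = simulate_click(state, action)
        next_state = action if reward > 0 else state
        # Standard Q-learning update toward reward plus discounted future value.
        best_next = max(Q[(next_state, a)] for a in range(N_ITEMS))
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

# After training, the greedy policy recommends the highest-valued item for each state.
print({s: max(range(N_ITEMS), key=lambda a: Q[(s, a)]) for s in range(N_ITEMS)})
```

The deep reinforcement learning methods discussed in the survey replace the explicit Q table with a neural network approximator, which is what makes very large state and action spaces tractable.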



Author information

Corresponding author: Mehrdad Rezaei.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Rezaei, M., Tabrizi, N. (2023). A Survey on Reinforcement Learning and Deep Reinforcement Learning for Recommender Systems. In: Conte, D., Fred, A., Gusikhin, O., Sansone, C. (eds) Deep Learning Theory and Applications. DeLTA 2023. Communications in Computer and Information Science, vol 1875. Springer, Cham. https://doi.org/10.1007/978-3-031-39059-3_26


  • DOI: https://doi.org/10.1007/978-3-031-39059-3_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39058-6

  • Online ISBN: 978-3-031-39059-3

  • eBook Packages: Computer Science, Computer Science (R0)
