User Cold-start Problem in Multi-armed Bandits: When the First Recommendations Guide the User’s Experience

Published: 27 January 2023

Abstract

Recommender Systems play a crucial role in several entertainment scenarios by making personalised recommendations and guiding the user’s entire journey from the first interaction. Recent works have modelled this task as a Contextual Bandit, a sequential decision model that either explores items not yet tried (or not tried enough) or exploits the best options learned so far. However, we observe that current algorithms fall back on naive non-personalised approaches in a new user’s first interactions, offering random or most popular items. Through experiments in three domains, we identify a negative impact of these first choices: our study indicates that bandit performance is directly related to the choices made in the first trials. We therefore propose a new approach that balances exploration and exploitation in the first interactions to address these drawbacks. The approach draws on Active Learning theory to gather more information about new users and improve their long-term experience. The idea is to explore the potential information gain of items that are also likely to please the user’s taste. The method, named Warm-Starting Contextual Bandits, statistically outperforms 10 benchmarks from the literature in the long run.
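
The contrast the abstract draws for a new user's first trial can be illustrated with a small sketch. This is not the Warm-Starting Contextual Bandits method, whose details the abstract does not give. It only compares a most-popular baseline with a hypothetical popularity-times-entropy heuristic, a common active-learning style proxy in which popularity stands in for "likely to please" and rating entropy stands in for "information gain". The toy data and every function name below are assumptions made for illustration.

```python
import math
from collections import Counter


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical rating distribution."""
    counts = Counter(ratings)
    total = len(ratings)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def first_item_scores(item_ratings):
    """Hypothetical first-trial score: log(popularity) * rating entropy.

    Popularity is the number of ratings an item has received; items
    with no ratings are skipped to avoid log(0).
    """
    return {
        item: math.log(len(ratings)) * rating_entropy(ratings)
        for item, ratings in item_ratings.items()
        if ratings
    }


def pick_most_popular(item_ratings):
    """Non-personalised baseline: the item with the most ratings."""
    return max(item_ratings, key=lambda item: len(item_ratings[item]))


def pick_informative(item_ratings):
    """Item that maximises the popularity-times-entropy score."""
    scores = first_item_scores(item_ratings)
    return max(scores, key=scores.get)


# Toy rating logs from existing users (assumed data, not from the paper).
item_ratings = {
    "A": [5, 5, 5, 5, 5, 5],  # most popular, but every user agrees
    "B": [1, 5, 2, 4, 3],     # popular and divisive
    "C": [3, 4],              # rarely rated
}

baseline_choice = pick_most_popular(item_ratings)
informative_choice = pick_informative(item_ratings)
```

On this toy data the baseline selects item "A", whose uniform ratings reveal little about a new user, while the heuristic selects item "B", which is nearly as popular but splits opinion and so says more about the user's taste once rated.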

Cited By

  • (2024) Cold-start recommendation by personalized embedding region elicitation. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, 2766-2786. DOI: 10.5555/3702676.3702808. Online publication date: 15-Jul-2024.
  • (2024) Query Exploration Based on Knowledge Reasoning. Advanced Data Mining and Applications, 366-382. DOI: 10.1007/978-981-96-0814-0_24. Online publication date: 3-Dec-2024.
  • (2023) A Complete Framework for Offline and Counterfactual Evaluations of Interactive Recommendation Systems. Proceedings of the 29th Brazilian Symposium on Multimedia and the Web, 193-197. DOI: 10.1145/3617023.3617049. Online publication date: 23-Oct-2023.
  • (2023) Transparently Serving the Public: Enhancing Public Service Media Values through Exploration. Proceedings of the 17th ACM Conference on Recommender Systems, 1045-1048. DOI: 10.1145/3604915.3610243. Online publication date: 14-Sep-2023.

    Published In

    ACM Transactions on Recommender Systems, Volume 1, Issue 1
    March 2023
    163 pages
    EISSN: 2770-6699
    DOI: 10.1145/3581755

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 January 2023
    Online AM: 26 October 2022
    Accepted: 24 July 2022
    Revised: 16 July 2022
    Received: 02 May 2022
    Published in TORS Volume 1, Issue 1

    Author Tags

    1. Recommender Systems
    2. multi-armed bandits
    3. user cold-start

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • CNPq
    • CAPES
    • Fapemig
    • AWS
    • INWEB

    Article Metrics

    • Downloads (Last 12 months): 292
    • Downloads (Last 6 weeks): 21
    Reflects downloads up to 13 Feb 2025
