DOI: 10.1145/3485447.3512152
research-article

Knowledge-aware Conversational Preference Elicitation with Bandit Feedback

Published: 25 April 2022

Abstract

Conversational recommender systems (CRSs) have recently been proposed to mitigate the cold-start problem that traditional recommender systems suffer from. By introducing conversational key-terms, existing conversational recommenders can reduce the need for extensive exploration and elicit user preferences faster and more accurately. However, these recommenders rely heavily on the availability and quality of the key-terms, and their performance may degrade significantly when the key-terms are incomplete or poorly labeled. This usually happens when new items are continually incorporated into the system, because acquiring well-labeled key-terms requires costly human effort. In addition, existing CRS methods treat the feedback on different conversational key-terms separately, without considering the underlying relations between the key-terms. As a result, learning is sample-inefficient, especially when there is a large number of candidate conversational key-terms.
In this paper, we propose a knowledge-aware conversational preference elicitation framework and a bandit-based algorithm, GraphConUCB. To elicit preferences efficiently for items with incompletely labeled key-terms, our algorithm leverages the underlying relations between the key-terms, guided by the knowledge graph. Being knowledge-aware, the algorithm propagates user preferences via a pseudo graph feedback module, which also accelerates exploration in the large action space of key-terms and improves conversational sample efficiency. To select the most informative key-terms in the graph for conversations, we further devise a graph-based optimal design module that leverages the graph structure. We provide a theoretical analysis of the regret upper bound for GraphConUCB. Extensive experiments show that our algorithm effectively handles items with incompletely labeled key-terms and improves significantly over state-of-the-art baselines.
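
The idea of propagating feedback along key-term relations can be illustrated with a much simpler setting than the one the paper studies. The sketch below is not GraphConUCB: it is a plain UCB1-style selector in which feedback on one key-term is also credited, at reduced weight, to its graph neighbors. The class name, the `neighbors` dictionary, the `pseudo_weight` parameter, and the example key-terms are all assumptions made for illustration.

```python
import math


class GraphPseudoFeedbackUCB:
    """UCB1-style key-term selector with down-weighted neighbor updates.

    Illustrative only; every name here is hypothetical and the update
    rule is an assumption, not the method from the paper.
    """

    def __init__(self, neighbors, pseudo_weight=0.5):
        self.neighbors = neighbors          # dict: key-term -> related key-terms
        self.pseudo_weight = pseudo_weight  # weight given to propagated feedback
        self.counts = {k: 0.0 for k in neighbors}  # effective observation counts
        self.sums = {k: 0.0 for k in neighbors}    # weighted reward sums
        self.t = 0                          # number of questions asked so far

    def select(self):
        # First ask about any key-term with no evidence, direct or propagated.
        for k in self.neighbors:
            if self.counts[k] == 0:
                return k

        def ucb(k):
            mean = self.sums[k] / self.counts[k]
            bonus = math.sqrt(2.0 * math.log(self.t + 1) / self.counts[k])
            return mean + bonus

        return max(self.neighbors, key=ucb)

    def update(self, term, reward):
        # Real feedback for the asked key-term, pseudo feedback for neighbors.
        self.t += 1
        self._observe(term, reward, 1.0)
        for nb in self.neighbors[term]:
            self._observe(nb, reward, self.pseudo_weight)

    def _observe(self, term, reward, weight):
        self.counts[term] += weight
        self.sums[term] += weight * reward


# Hypothetical key-term graph: "jazz" and "blues" are related, "metal" is isolated.
graph = {"jazz": ["blues"], "blues": ["jazz"], "metal": []}
bandit = GraphPseudoFeedbackUCB(graph)
asked = bandit.select()
bandit.update(asked, reward=1.0)  # the user responds positively
```

Because a neighbor receives partial evidence without being asked about, fewer conversational rounds are spent on cold key-terms, which is the intuition behind the sample-efficiency claim in the abstract.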

Published In

WWW '22: Proceedings of the ACM Web Conference 2022
April 2022
3764 pages
ISBN: 9781450390965
DOI: 10.1145/3485447

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. conversational recommender
2. online learning

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

WWW '22: The ACM Web Conference 2022
April 25 - 29, 2022
Virtual Event, Lyon, France

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
