Chain-of-Choice Hierarchical Policy Learning for Conversational Recommendation

Fan, Wei; Zhang, Weijia; Wang, Weiqi; Song, Yangqiu; Liu, Hao

doi:10.1007/978-981-97-5569-1_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14854)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Conversational Recommender Systems (CRS) elicit user preferences through multi-round interactive dialogues, ultimately navigating towards precise and satisfactory recommendations. However, contemporary CRS are limited to asking binary or multi-choice questions based on a single attribute type (e.g., color) per round, which causes excessive rounds of interaction and diminishes the user experience. To address this, we propose a more realistic and efficient conversational recommendation problem setting, called Multi-Type-Attribute Multi-round Conversational Recommendation (MTAMCR), which enables CRS to ask multi-choice questions covering multiple types of attributes in each round, thereby improving interactive efficiency. Moreover, by formulating MTAMCR as a hierarchical reinforcement learning task, we propose a Chain-of-Choice Hierarchical Policy Learning (CoCHPL) framework to enhance both the questioning efficiency and recommendation effectiveness in MTAMCR. Specifically, a long-term policy over options (i.e., ask or recommend) determines the action type, while two short-term intra-option policies sequentially generate the chain of attributes or items through multi-step reasoning and selection, optimizing the diversity and interdependence of the attributes asked about. Finally, extensive experiments on four benchmarks demonstrate the superior performance of CoCHPL over prevailing state-of-the-art methods.
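
The two-level decision structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the learned policies are replaced by hand-written toy scoring functions, and every name here (`select_option`, `generate_chain`, `act`, the `type:value` attribute encoding, and the layout of the state dictionary) is a hypothetical choice made for illustration only. It shows only the control flow: a policy over options picks "ask" or "recommend", then an intra-option policy builds a chain one choice at a time, conditioning each step on the choices already made.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical scorer types; the paper's learned networks are NOT reproduced here.
OptionScorer = Callable[[dict], Dict[str, float]]
ChoiceScorer = Callable[[dict, List[str], str], float]


def select_option(state: dict, option_scorer: OptionScorer) -> str:
    """Policy over options: return the highest-scoring action type."""
    scores = option_scorer(state)
    return max(scores, key=scores.get)


def generate_chain(state: dict, candidates: List[str],
                   choice_scorer: ChoiceScorer, max_len: int) -> List[str]:
    """Intra-option policy: greedily extend the chain one choice at a time.

    Each candidate is scored given the chain built so far, so later
    choices can depend on earlier ones.
    """
    chain: List[str] = []
    remaining = list(candidates)
    while remaining and len(chain) < max_len:
        best = max(remaining, key=lambda c: choice_scorer(state, chain, c))
        chain.append(best)
        remaining.remove(best)
    return chain


def act(state: dict, option_scorer: OptionScorer, attr_scorer: ChoiceScorer,
        item_scorer: ChoiceScorer, max_len: int = 2) -> Tuple[str, List[str]]:
    """One conversational turn: choose the option, then generate its chain."""
    option = select_option(state, option_scorer)
    if option == "ask":
        return option, generate_chain(state, state["attributes"], attr_scorer, max_len)
    return option, generate_chain(state, state["items"], item_scorer, max_len)


# --- Toy scorers (assumptions for illustration only) ---
def toy_option_scorer(state: dict) -> Dict[str, float]:
    # Keep asking while more than three candidate items remain.
    many = len(state["items"]) > 3
    return {"ask": 1.0 if many else 0.0, "recommend": 0.0 if many else 1.0}


def toy_attr_scorer(state: dict, chain: List[str], candidate: str) -> float:
    # Attributes are encoded as "type:value"; penalize a type already in the
    # chain so that one question spans several attribute types.
    used_types = {c.split(":")[0] for c in chain}
    penalty = 1.0 if candidate.split(":")[0] in used_types else 0.0
    return state["relevance"][candidate] - penalty


def toy_item_scorer(state: dict, chain: List[str], candidate: str) -> float:
    return state["item_scores"][candidate]


demo_state = {
    "attributes": ["color:red", "color:blue", "brand:acme"],
    "items": ["i1", "i2", "i3", "i4"],
    "relevance": {"color:red": 0.9, "color:blue": 0.8, "brand:acme": 0.5},
    "item_scores": {"i1": 0.1, "i2": 0.7, "i3": 0.4, "i4": 0.2},
}

print(act(demo_state, toy_option_scorer, toy_attr_scorer, toy_item_scorer))
```

With the toy state above, the sketch chooses to ask and selects `color:red` followed by `brand:acme`: the second step skips `color:blue` because its type is already covered by the chain. Greedy selection stands in for the learned intra-option policies here purely to keep the example short.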

Acknowledgement

This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 62102110 and 92370204, the Guangzhou Basic and Applied Basic Research Program under Grant No. 2024A04J3279, and the Education Bureau of Guangzhou Municipality.

Author information

Authors and Affiliations

The Hong Kong University of Science and Technology, Hong Kong, China
Wei Fan, Weiqi Wang & Yangqiu Song
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Weijia Zhang & Hao Liu

Corresponding author

Correspondence to Hao Liu.

Editor information

Editors and Affiliations

Osaka University, Suita, Osaka, Japan
Makoto Onizuka
KAIST, Daejeon, Korea (Republic of)
Jae-Gil Lee
Beihang University, Beijing, China
Yongxin Tong
Osaka University, Osaka, Japan
Chuan Xiao
Nagoya University, Nagoya, Japan
Yoshiharu Ishikawa
University of Grenoble Alpes, Saint-Martin d’Hères, France
Sihem Amer-Yahia
University of Michigan, Ann Arbor, MI, USA
H. V. Jagadish
Nagoya University, Nagoya, Japan
Kejing Lu

About this paper

Cite this paper

Fan, W., Zhang, W., Wang, W., Song, Y., Liu, H. (2024). Chain-of-Choice Hierarchical Policy Learning for Conversational Recommendation. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14854. Springer, Singapore. https://doi.org/10.1007/978-981-97-5569-1_8

DOI: https://doi.org/10.1007/978-981-97-5569-1_8
Published: 13 December 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5568-4
Online ISBN: 978-981-97-5569-1
eBook Packages: Computer Science, Computer Science (R0)
