DOI: 10.1145/3488560.3498526
research-article

Learning Relevant Questions for Conversational Product Search using Deep Reinforcement Learning

Published: 15 February 2022

ABSTRACT

We propose RelQuest, a conversational product search model based on reinforcement learning that generates questions from product descriptions in each round of the conversation, directly maximizing any desired metric (i.e., the ultimate goal of the conversation), objective, or even an arbitrary user satisfaction signal. By enabling systems to ask questions about user needs, conversational product search has gained increasing attention in recent years. Asking the right questions through conversations helps the system collect valuable feedback to create better user experiences and ultimately increase sales. Existing conversational product search methods, however, assume that a set of pre-defined candidate questions is available to ask for each product. Moreover, they make strong assumptions to estimate the value of questions in each round of the conversation, which is not trivial because the true value is unknown. Experiments on real-world user purchasing data show the effectiveness of RelQuest at generating questions that maximize standard evaluation measures such as NDCG.
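
The abstract names NDCG as one measure the model can maximize. As a rough illustration only, the sketch below computes NDCG@k for a ranked list of relevance labels, using the linear-gain form with a log2 rank discount. This page does not give RelQuest's reward definition, so the function names `dcg` and `ndcg`, the choice of linear gain, and the idea of using the score as a per-round reward signal are assumptions made here, not the authors' implementation.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    # rank is 0-based, so the item at position 1 is discounted by log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances: Sequence[float], k: int) -> float:
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    # Hypothetical convention: return 0.0 when no item in the list is relevant.
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# The purchased product ranked first versus ranked third (labels are illustrative).
score_top = ndcg([1, 0, 0], k=3)
score_third = ndcg([0, 0, 1], k=3)
```

Under these assumptions, a ranking that moves the purchased product closer to the top after a question is answered yields a higher score, which is the kind of signal a reinforcement learning agent could be trained to maximize.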

Supplemental Material

WSDM22-fp849.mp4 (mp4, 58.6 MB)

Published in

WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
February 2022
1690 pages
ISBN: 9781450391320
DOI: 10.1145/3488560

Copyright © 2022 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 498 of 2,863 submissions, 17%
