DOI: 10.1145/3583780.3615481
research-article

Graph Learning for Exploratory Query Suggestions in an Instant Search System

Published: 21 October 2023

ABSTRACT

Search systems in online content platforms are typically biased toward a minority of highly consumed items, reflecting the most common user behavior of navigating toward content that is already familiar and popular. Query suggestions are a powerful tool to support query formulation and to encourage exploratory search and content discovery. However, classic approaches to query suggestion typically rely either on semantic similarity, which lacks diversity and does not reflect user search behavior, or on a collaborative similarity measure mined from search logs, which suffers from data sparsity and is biased by highly popular queries. In this work, we argue that the task of query suggestion can be modelled as a link prediction task on a heterogeneous graph of queries and documents, enabling Graph Learning methods to generate query suggestions that encompass both semantic and collaborative information. We perform an offline evaluation on an internal Spotify dataset of search logs and on two public datasets, showing that node2vec leads to an accurate and diversified set of results, especially on the large-scale real-world data. We then describe the implementation in an instant search scenario and discuss a set of additional challenges tied to the specific production environment. Finally, we report the results of a large-scale A/B test involving millions of users and show that node2vec query suggestions lead to an increase in online metrics such as coverage (+1.42% shown search results pages with suggestions) and engagement (+1.21% clicks), with a particularly notable boost in the number of clicks on exploratory search queries (+9.37%).
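
To make the graph framing concrete, the following is a minimal sketch and not the system described in the paper. It builds a toy query-document graph from a hypothetical click log, samples uniform random walks (the walk scheme that node2vec generalizes with its return and in-out parameters), and ranks candidate queries by how often they co-occur with the source query inside a small window of a walk. The co-occurrence count is a simple stand-in for the embedding similarity that a trained node2vec model would provide. All names and data here (CLICKS, build_graph, random_walks, suggest, the "q:" and "doc:" prefixes) are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy click log: (query, clicked document) pairs.
CLICKS = [
    ("jazz", "doc:miles_davis"),
    ("jazz", "doc:john_coltrane"),
    ("bebop", "doc:john_coltrane"),
    ("bebop", "doc:charlie_parker"),
    ("metal", "doc:metallica"),
    ("thrash", "doc:metallica"),
]


def build_graph(clicks):
    """Undirected heterogeneous graph linking query nodes to document nodes."""
    graph = defaultdict(set)
    for query, doc in clicks:
        q = "q:" + query  # prefix marks the node type
        graph[q].add(doc)
        graph[doc].add(q)
    return graph


def random_walks(graph, walks_per_node=50, walk_length=6, seed=0):
    """Uniform random walks from every node (assumed simplification of node2vec)."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_length - 1):
                # Sort neighbours so the walk is reproducible for a fixed seed.
                walk.append(rng.choice(sorted(graph[walk[-1]])))
            walks.append(walk)
    return walks


def suggest(graph, query, k=2, window=2, **walk_kwargs):
    """Rank other queries by windowed co-occurrence with `query` in the walks."""
    source = "q:" + query
    counts = Counter()
    for walk in random_walks(graph, **walk_kwargs):
        for i, node in enumerate(walk):
            if node != source:
                continue
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for other in walk[lo:hi]:
                if other.startswith("q:") and other != source:
                    counts[other] += 1
    return [q[2:] for q, _ in counts.most_common(k)]


graph = build_graph(CLICKS)
print(suggest(graph, "jazz"))
```

In this toy graph, "jazz" and "bebop" are linked only through a shared clicked document, so a suggestion between them reflects collaborative signal rather than textual overlap. A fuller sketch would replace the co-occurrence counts with skip-gram embeddings trained on the walks and retrieve suggestions by nearest-neighbour search.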

  • Published in

    CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
    October 2023
    5508 pages
    ISBN:9798400701245
    DOI:10.1145/3583780

    Copyright © 2023 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
