DOI: 10.1145/3539813.3545136
Research Article

PRE: A Precision-Recall-Effort Optimization Framework for Query Simulation

Published: 25 August 2022

ABSTRACT

We study how to develop an interpretable query simulation framework that can potentially explain the process a real user might have used to formulate a query, and we propose a novel interpretable optimization framework (PRE) for simulating query formulation and reformulation uniformly based on a user's knowledge state. The framework's three high-level objectives are to maximize the precision and recall of the anticipated retrieval results while minimizing the user's effort. We propose probabilistic models of how a user might estimate precision and recall for a candidate query and derive multiple specific query formulation algorithms. Evaluation results show that the major assumptions made in the PRE framework appear to be reasonable, matching the observed empirical result patterns. PRE provides specific hypotheses about a user's query formulation process that can be further examined via user studies, enables the simulation of meaningful variations of users without requiring extra training data, and serves as a roadmap for the systematic exploration and derivation of new interpretable query simulation methods.
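
The abstract describes PRE at a high level: a simulated user chooses among candidate queries so as to maximize the estimated precision and recall of the anticipated results while minimizing effort. As a rough illustration of that kind of trade-off (not the paper's actual models), the following Python sketch assumes a toy "knowledge state" of per-term beliefs, estimates precision and recall for conjunctive candidate queries under a naive independence assumption, and selects the query maximizing a weighted precision-recall-effort utility. All names, weights, and probability values here are hypothetical.

```python
# Illustrative sketch only: the belief representation, independence assumptions,
# and utility weights below are assumptions for illustration, not the PRE models.
from itertools import combinations
from typing import Tuple

# Assumed "knowledge state": the user's beliefs about how likely each known term
# is to occur in relevant vs. non-relevant documents, plus a prior on relevance.
p_term_given_rel = {"query": 0.9, "simulation": 0.8, "user": 0.6, "framework": 0.4}
p_term_given_nonrel = {"query": 0.3, "simulation": 0.05, "user": 0.4, "framework": 0.2}
prior_rel = 0.01

def estimate_recall(query: Tuple[str, ...]) -> float:
    """Estimated fraction of relevant documents matching every query term (independence assumed)."""
    r = 1.0
    for t in query:
        r *= p_term_given_rel[t]
    return r

def estimate_precision(query: Tuple[str, ...]) -> float:
    """Bayes estimate of P(relevant | document matches all query terms)."""
    hit_rel = estimate_recall(query)
    hit_non = 1.0
    for t in query:
        hit_non *= p_term_given_nonrel[t]
    num = prior_rel * hit_rel
    return num / (num + (1.0 - prior_rel) * hit_non)

def effort(query: Tuple[str, ...]) -> float:
    """Crude effort proxy: the number of terms the user has to type."""
    return float(len(query))

def best_query(vocab, alpha=1.0, beta=1.0, gamma=0.05):
    """Enumerate candidate queries from the known vocabulary and pick the utility argmax."""
    candidates = [q for k in range(1, len(vocab) + 1) for q in combinations(vocab, k)]
    return max(
        candidates,
        key=lambda q: alpha * estimate_precision(q)
                      + beta * estimate_recall(q)
                      - gamma * effort(q),
    )

if __name__ == "__main__":
    print(best_query(list(p_term_given_rel)))
```

Varying the hypothetical weights alpha, beta, and gamma is one way such a sketch could mimic different simulated user types (precision-oriented, recall-oriented, or effort-averse), in the spirit of the "meaningful variations of users" mentioned in the abstract.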


        Published in

        ICTIR '22: Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval
        August 2022, 289 pages
        ISBN: 9781450394123
        DOI: 10.1145/3539813

        Copyright © 2022 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 25 August 2022

        Qualifiers

        • research-article

        Acceptance Rates

        ICTIR '22 paper acceptance rate: 32 of 80 submissions, 40%. Overall acceptance rate: 209 of 482 submissions, 43%.
