ABSTRACT
We study how to develop an interpretable query simulation framework that can potentially explain the process a real user might have used to formulate a query. We propose PRE, a novel interpretable optimization framework that simulates query formulation and reformulation uniformly, based on a user's knowledge state. Its three high-level objectives are to maximize the precision of the anticipated retrieval results, maximize their recall, and minimize user effort. We propose probabilistic models of how a user might estimate precision and recall for a candidate query, and derive multiple specific query formulation algorithms. Evaluation results show that the major assumptions made in the PRE framework appear reasonable, matching the observed empirical result patterns. PRE provides specific hypotheses about a user's query formulation process that can be further examined via user studies, enables simulation of meaningful variations of users without requiring extra training data, and serves as a roadmap for the systematic exploration and derivation of new interpretable query simulation methods.
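To make the three objectives concrete, the following is a minimal sketch of what scoring candidate queries by estimated precision, estimated recall, and effort could look like. It is not the paper's algorithm: the knowledge-state representation (per-term probabilities), the conjunctive-match and term-independence assumptions, the linear scoring function, the weights, and all function names are hypothetical choices made for illustration only.

```python
from itertools import combinations
from math import prod
from typing import Dict, FrozenSet, Tuple

# Hypothetical knowledge state: term -> (P(term | relevant doc), P(term | any doc)).
Knowledge = Dict[str, Tuple[float, float]]


def estimate_recall(query: FrozenSet[str], knowledge: Knowledge) -> float:
    """Assumed recall: share of relevant docs containing every query term,
    treating terms as independent."""
    return prod(knowledge[t][0] for t in query)


def estimate_precision(query: FrozenSet[str], knowledge: Knowledge,
                       prior_rel: float) -> float:
    """Assumed precision: P(relevant | doc matches all terms) via Bayes' rule,
    capped at 1 because the independence assumption can overshoot."""
    p_match_given_rel = prod(knowledge[t][0] for t in query)
    p_match = prod(knowledge[t][1] for t in query)
    if p_match == 0.0:
        return 0.0
    return min(1.0, p_match_given_rel * prior_rel / p_match)


def score(query: FrozenSet[str], knowledge: Knowledge, prior_rel: float,
          w_p: float = 1.0, w_r: float = 1.0, w_e: float = 0.05) -> float:
    """Illustrative objective: reward precision and recall, penalize effort,
    with effort approximated by the number of query terms."""
    return (w_p * estimate_precision(query, knowledge, prior_rel)
            + w_r * estimate_recall(query, knowledge)
            - w_e * len(query))


def best_query(knowledge: Knowledge, prior_rel: float, max_len: int = 3,
               **weights: float) -> FrozenSet[str]:
    """Exhaustively score every term subset up to max_len and return the best."""
    candidates = (frozenset(c)
                  for n in range(1, max_len + 1)
                  for c in combinations(sorted(knowledge), n))
    return max(candidates, key=lambda q: score(q, knowledge, prior_rel, **weights))


if __name__ == "__main__":
    # Made-up numbers for a user who knows three terms about their topic.
    knowledge = {"jaguar": (0.9, 0.01), "speed": (0.5, 0.05), "the": (0.95, 0.9)}
    print(sorted(best_query(knowledge, prior_rel=0.001)))
    # A recall-only user: same knowledge state, different weights.
    print(sorted(best_query(knowledge, prior_rel=0.001, w_p=0.0)))
```

Changing only the weights yields a different simulated user from the same knowledge state, which is one way to read the abstract's claim about simulating user variations without extra training data. In this toy setup, the balanced user picks the two discriminative terms, while the recall-only user falls back to a single very common term.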
Index Terms
- PRE: A Precision-Recall-Effort Optimization Framework for Query Simulation