DOI: 10.1145/3477495.3531953
Short paper

Constructing Better Evaluation Metrics by Incorporating the Anchoring Effect into the User Model

Published: 07 July 2022

Abstract

The user models underlying existing evaluation metrics assume that users are rational decision-makers who pursue maximised utility. However, studies in behavioural economics show that people are not always rational when making decisions. In particular, previous studies have shown that the anchoring effect can influence the relevance judgement of a document. In this paper, we challenge the rational-user assumption and introduce the anchoring effect into user models. We first propose a framework for query-level evaluation metrics that incorporates the anchoring effect into the user model, in which the magnitude of the anchoring effect depends on the quality of the previously examined document. We then apply the framework to several query-level evaluation metrics and compare them with their vanilla versions as baselines on a publicly available search dataset. Our Anchoring-aware Metrics (AMs) outperformed their baselines in terms of correlation with user satisfaction. This result suggests that incorporating the anchoring effect into the user models of existing evaluation metrics lets us better predict users' query-level satisfaction feedback. To the best of our knowledge, we are the first to introduce the anchoring effect into information retrieval evaluation metrics. Our findings offer a behavioural-economics perspective for better understanding user behaviour and satisfaction in search interaction.
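
The abstract gives the core idea at a high level: the perceived gain of each document is adjusted by the quality of the document examined just before it. The paper's exact formulation is not reproduced on this page, so the sketch below is only a minimal illustration of that idea applied to a DCG-style metric; the function name anchoring_aware_dcg and the strength parameter beta are hypothetical, not the paper's notation.

    import math

    def anchoring_aware_dcg(relevances, beta=0.5):
        """Illustrative anchoring-aware DCG sketch (not the paper's exact formula).

        The perceived gain of the document at rank i is anchored on the
        quality of the document at rank i-1: after a strong document, the
        current one looks worse than its true grade, and vice versa. The
        hypothetical parameter beta controls the magnitude of the anchoring
        effect.
        """
        score, prev = 0.0, None
        for rank, rel in enumerate(relevances, start=1):
            if prev is None:
                perceived = float(rel)                 # first result has no anchor
            else:
                perceived = rel + beta * (rel - prev)  # shift gain relative to the anchor
            perceived = max(perceived, 0.0)            # perceived gain stays non-negative
            score += perceived / math.log2(rank + 1)   # standard DCG position discount
            prev = rel
        return score

    # With beta = 0 the anchoring term vanishes and vanilla DCG is recovered;
    # with beta = 0.5 a strong first document (grade 3) depresses the
    # perceived gain of the weaker second one (grade 1).
    print(anchoring_aware_dcg([3, 1, 2], beta=0.0))  # ~4.63 (vanilla DCG)
    print(anchoring_aware_dcg([3, 1, 2], beta=0.5))  # ~4.25 (anchored)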

Supplementary Material

MP4 File (SIGIR22-sp2255.mp4)
Presentation video.


Cited By

  • (2025) Unraveling the anchoring effect of seller's show on buyer's show to enhance review helpfulness prediction: A multi-granularity attention network model with multimodal information. Electronic Commerce Research and Applications 70, 101484. https://doi.org/10.1016/j.elerap.2025.101484
  • (2024) Decoy Effect in Search Interaction: Understanding User Behavior and Measuring System Vulnerability. ACM Transactions on Information Systems 43, 2, 1-58. https://doi.org/10.1145/3708884
  • (2024) AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment. In Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 54-63. https://doi.org/10.1145/3673791.3698420
  • (2024) Cognitively Biased Users Interacting with Algorithmically Biased Results in Whole-Session Search on Debated Topics. In Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval, 227-237. https://doi.org/10.1145/3664190.3672520
  • (2024) What Matters in a Measure? A Perspective from Large-Scale Search Evaluation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 282-292. https://doi.org/10.1145/3626772.3657845
  • (2024) Understanding users' dynamic perceptions of search gain and cost in sessions: An expectation confirmation model. Journal of the Association for Information Science and Technology. https://doi.org/10.1002/asi.24935
  • (2023) Behavioural economics theories in information-seeking behaviour research: A systematic review. Journal of Librarianship and Information Science. https://doi.org/10.1177/09610006231219246
  • (2023) A Reference-Dependent Model for Web Search Evaluation. In Proceedings of the ACM Web Conference 2023, 3396-3405. https://doi.org/10.1145/3543507.3583551
  • (2023) Investigating the role of in-situ user expectations in Web search. Information Processing & Management 60, 3, 103300. https://doi.org/10.1016/j.ipm.2023.103300
  • (2023) Implications and New Directions for IR Research and Practices. In A Behavioral Economics Approach to Interactive Information Retrieval, 181-201. https://doi.org/10.1007/978-3-031-23229-9_7


    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022, 3569 pages
    ISBN: 9781450387323
    DOI: 10.1145/3477495

    Publisher

    Association for Computing Machinery, New York, NY, United States



    Author Tags

    1. anchoring effect
    2. cognitive bias
    3. evaluation metrics
    4. information retrieval
    5. user behaviour


    Acceptance Rates

    Overall Acceptance Rate: 792 of 3,983 submissions (20%)

