Research article
DOI: 10.1145/3336191.3371799

Metrics, User Models, and Satisfaction

Published: 22 January 2020

Abstract

User satisfaction is an important factor when evaluating search systems, and hence a good metric should give rise to scores that have a strong positive correlation with user satisfaction ratings. A metric should also correspond to a plausible user model, and hence provide a tangible manifestation of how users interact with search rankings. Recent work has focused on metrics whose user models accurately portray the behavior of search engine users. Here we investigate whether those same metrics then also correlate with user satisfaction. We carry out experiments using various classes of metrics, and confirm through the lens of the C/W/L framework that the metrics with user models that reflect typical behavior also tend to be the metrics that correlate well with user satisfaction ratings.
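
For readers unfamiliar with the C/W/L framework mentioned above: it characterizes a metric by a continuation probability C(i), the chance that a user who has inspected rank i goes on to inspect rank i+1. The induced weight W(i), proportional to C(1) x ... x C(i-1) and normalized to sum to one, is the expected share of attention spent at rank i, and the metric's score is the expected rate of gain, the sum over i of W(i) x r(i). The following is a minimal Python sketch of that computation, not code from the paper; the constant-continuation instantiation shown corresponds to rank-biased precision with persistence p, and the gain vector is hypothetical.

    # Minimal C/W/L-style scorer: C maps a 1-based rank i to the
    # probability of continuing on to rank i + 1.
    def cwl_score(gains, C):
        # W(i) is proportional to C(1) * ... * C(i-1); build the
        # unnormalised weights, then normalise them to sum to 1.
        raw, w = [], 1.0
        for i in range(1, len(gains) + 1):
            raw.append(w)
            w *= C(i)
        total = sum(raw)
        # Expected rate of gain: sum_i W(i) * r(i).
        return sum((wi / total) * g for wi, g in zip(raw, gains))

    # A constant continuation probability C(i) = p recovers rank-biased
    # precision with persistence p (weights (1 - p) * p**(i - 1) over an
    # unbounded ranking; here they are renormalised over the top 5).
    p = 0.8
    gains = [1.0, 0.0, 0.5, 1.0, 0.0]  # hypothetical graded relevance
    print(round(cwl_score(gains, lambda i: p), 4))

Different choices of C(i) yield different metrics within the same framework, which is what lets the paper compare classes of metrics on a common footing.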



Published In

WSDM '20: Proceedings of the 13th International Conference on Web Search and Data Mining
January 2020
950 pages
ISBN:9781450368223
DOI:10.1145/3336191


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. effectiveness metric
  2. evaluation
  3. session
  4. user model
  5. web search



Conference

WSDM '20

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%



Article Metrics

  • Downloads (last 12 months): 60
  • Downloads (last 6 weeks): 5
Reflects downloads up to 15 Feb 2025


Cited By

  • (2024) Investigating Users' Search Behavior and Outcome with ChatGPT in Learning-oriented Search Tasks. Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 103-113. DOI: 10.1145/3673791.3698406. Online publication date: 8-Dec-2024.
  • (2024) What Matters in a Measure? A Perspective from Large-Scale Search Evaluation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 282-292. DOI: 10.1145/3626772.3657845. Online publication date: 10-Jul-2024.
  • (2024) Individual Persistence Adaptation for User-Centric Evaluation of User Satisfaction in Recommender Systems. IEEE Access, 12:23626-23635. DOI: 10.1109/ACCESS.2024.3360693. Online publication date: 2024.
  • (2024) An Intrinsic Framework of Information Retrieval Evaluation Measures. Intelligent Systems and Applications, pages 692-713. DOI: 10.1007/978-3-031-47721-8_47. Online publication date: 10-Jan-2024.
  • (2023) A Reference-Dependent Model for Web Search Evaluation. Proceedings of the ACM Web Conference 2023, pages 3396-3405. DOI: 10.1145/3543507.3583551. Online publication date: 30-Apr-2023.
  • (2023) Constructing and meta-evaluating state-aware evaluation metrics for interactive search systems. Information Retrieval, 26(1-2). DOI: 10.1007/s10791-023-09426-1. Online publication date: 31-Oct-2023.
  • (2022) When Measurement Misleads. ACM SIGIR Forum, 56(1):1-20. DOI: 10.1145/3582524.3582540. Online publication date: 1-Jun-2022.
  • (2022) Users: Can't Work With Them, Can't Work Without Them? Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, page 1. DOI: 10.1145/3477495.3532787. Online publication date: 6-Jul-2022.
  • (2022) Constructing Better Evaluation Metrics by Incorporating the Anchoring Effect into the User Model. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2709-2714. DOI: 10.1145/3477495.3531953. Online publication date: 6-Jul-2022.
  • (2022) Batch Evaluation Metrics in Information Retrieval: Measures, Scales, and Meaning. IEEE Access, 10:105564-105577. DOI: 10.1109/ACCESS.2022.3211668. Online publication date: 2022.
