Research Article

Effects of position and number of relevant documents retrieved on users' evaluations of system performance

Published: 10 June 2010

Abstract

Information retrieval research has demonstrated that system performance does not always correlate positively with user performance, and that users often assign positive evaluation scores to search systems even when they are unable to complete tasks successfully. This research investigated the relationship between objective measures of system performance and users' perceptions of that performance. Subjects evaluated the performance of four search systems whose search results were manipulated systematically to produce different orderings and numbers of relevant documents. Three laboratory studies were conducted with a total of eighty-one subjects. The first two studies examined how the ordering of five relevant and five nonrelevant documents within a ten-result list affected subjects' evaluations. The third study examined how varying the number of relevant documents within a ten-result list affected subjects' evaluations. Results demonstrate linear relationships between subjects' evaluations and both the position of relevant documents in the results list and the total number of relevant documents retrieved. Of the two, the number of relevant documents retrieved was the stronger predictor of subjects' evaluation ratings and led subjects to use a greater range of evaluation scores.
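The abstract does not reproduce the authors' actual experimental materials, but the manipulation it describes is easy to picture with a short sketch. The Python below is illustrative only: the particular position patterns, and the choice to place relevant documents at the top of the list in the count conditions, are assumptions, not the paper's design. It generates ten-slot result lists with relevant documents in different orderings and in different quantities.

```python
# Illustrative sketch of manipulated ten-item result lists of the kind
# the abstract describes. The specific orderings and the top-loaded
# placement in the count conditions are assumptions for illustration,
# not the paper's actual conditions.

LIST_LENGTH = 10  # every manipulated list contains ten results

def make_list(relevant_positions):
    """Mark each of the ten ranks R (relevant) or N (nonrelevant)."""
    relevant = set(relevant_positions)
    return ["R" if rank in relevant else "N" for rank in range(LIST_LENGTH)]

# Studies 1-2: five relevant and five nonrelevant documents, ordered
# differently across conditions (example patterns, assumed):
orderings = {
    "relevant first": make_list(range(0, 5)),      # R R R R R N N N N N
    "relevant last":  make_list(range(5, 10)),     # N N N N N R R R R R
    "alternating":    make_list(range(0, 10, 2)),  # R N R N R N R N R N
}

# Study 3: the number of relevant documents varies from 0 to 10,
# with relevant documents placed at the top (assumed placement):
counts = {k: make_list(range(k)) for k in range(LIST_LENGTH + 1)}

for name, results in orderings.items():
    print(f"{name:>15}: {' '.join(results)}")
print(f"{'3 relevant':>15}: {' '.join(counts[3])}")
```

Comparing subjects' ratings across the ordering conditions isolates the effect of position; comparing ratings across the count conditions isolates the effect of the total number of relevant documents, which is the comparison the abstract reports as the stronger predictor.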



    Published In

    ACM Transactions on Information Systems, Volume 28, Issue 2
    May 2010, 165 pages
    ISSN: 1046-8188
    EISSN: 1558-2868
    DOI: 10.1145/1740592

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Received: 01 June 2007
    Revised: 01 January 2009
    Accepted: 01 June 2009
    Published: 10 June 2010 (TOIS Volume 28, Issue 2)


    Author Tags

    1. search performance
    2. precision
    3. presentation of search results
    4. ranking
    5. satisfaction
    6. user evaluation of performance

    Qualifiers

    • Research-article
    • Research
    • Refereed

