DOI: 10.1145/1835449.1835549

Human performance and retrieval precision revisited

Published: 19 July 2010

Abstract

Several studies have found that the Cranfield approach to evaluation can report significant performance differences between retrieval systems for which little to no performance difference is found for humans completing tasks with these systems. We revisit the relationship between precision and performance by measuring human performance on tightly controlled search tasks and with user interfaces offering limited interaction. We find that human performance and retrieval precision are strongly related. We also find that users change their relevance judging behavior based on the precision of the results. This change in behavior coupled with the well-known lack of perfect inter-assessor agreement can reduce the measured performance gains predicted by increased precision.
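
For readers unfamiliar with the metric at the center of the study, the following is a minimal illustrative sketch of the standard precision-at-k computation, the kind of Cranfield-style measure whose relationship to human task performance the paper examines. The document IDs and relevance judgments are hypothetical and the code is not taken from the paper.

```python
def precision_at_k(ranked_doc_ids, relevant_doc_ids, k):
    """Fraction of the top-k ranked documents that are judged relevant."""
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_doc_ids)
    return hits / k

# Hypothetical example: a ranked result list and a set of judged-relevant documents.
ranking = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d3", "d1", "d4"}
print(precision_at_k(ranking, relevant, 5))  # 0.6
```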




    Published In

    SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    July 2010
    944 pages
    ISBN:9781450301534
    DOI:10.1145/1835449

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Cranfield
    2. evaluation metrics
    3. human performance
    4. interaction
    5. precision
    6. user studies

    Qualifiers

    • Research-article

    Conference

    SIGIR '10

    Acceptance Rates

    SIGIR '10 Paper Acceptance Rate: 87 of 520 submissions, 17%
    Overall Acceptance Rate: 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 18
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 20 Jan 2025


    Cited By

    • (2024) Unbiased Validation of Technology-Assisted Review for eDiscovery. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2677-2681. DOI: 10.1145/3626772.3657903. Online publication date: 10-Jul-2024.
    • (2022) Interactive IR User Study Design, Evaluation, and Reporting. Online publication date: 10-Mar-2022.
    • (2020) Computer-Assisted Relevance Assessment: A Case Study of Updating Systematic Medical Reviews. Applied Sciences, 10(8):2845. DOI: 10.3390/app10082845. Online publication date: 20-Apr-2020.
    • (2020) Metrics, User Models, and Satisfaction. Proceedings of the 13th International Conference on Web Search and Data Mining, pp. 654-662. DOI: 10.1145/3336191.3371799. Online publication date: 20-Jan-2020.
    • (2019) Interactive IR User Study Design, Evaluation, and Reporting. Synthesis Lectures on Information Concepts, Retrieval, and Services, 11(2):i-75. DOI: 10.2200/S00923ED1V01Y201905ICR067. Online publication date: 3-Jun-2019.
    • (2019) Evaluating sentence-level relevance feedback for high-recall information retrieval. Information Retrieval Journal. DOI: 10.1007/s10791-019-09361-0. Online publication date: 13-Aug-2019.
    • (2018) Effective User Interaction for High-Recall Retrieval. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 187-196. DOI: 10.1145/3269206.3271796. Online publication date: 17-Oct-2018.
    • (2017) Adaptive Persistence for Search Effectiveness Measures. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 747-756. DOI: 10.1145/3132847.3133033. Online publication date: 6-Nov-2017.
    • (2017) Online In-Situ Interleaved Evaluation of Real-Time Push Notification Systems. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 415-424. DOI: 10.1145/3077136.3080808. Online publication date: 7-Aug-2017.
    • (2017) Building Cost-Benefit Models of Information Interactions. Proceedings of the 2017 Conference on Human Information Interaction and Retrieval, pp. 425-428. DOI: 10.1145/3020165.3022162. Online publication date: 7-Mar-2017.
