DOI: 10.1145/2854946.2854993
Short paper

Are Secondary Assessors Uncertain When They Disagree About Relevance Judgements?

Published: 13 March 2016

ABSTRACT

The collection of relevance judgements by assessors is important for many information retrieval (IR) tasks. In addition to the construction of test collections, relevance judging is critical to e-discovery and other applications in which many assessors are hired to judge relevance. It is well known that assessors may differ in their judgements for a given document. One possible cause of a judgement difference is that an assessor may be uncertain in their judgement and thus may in effect be guessing the document's relevance. If assessors are aware of their uncertainty and can self-report their level of certainty, then uncertain relevance judgements can be targeted for adjudication by additional assessors. In this paper, we conducted a user study with 48 participants to test our hypothesis that assessors will be uncertain about their relevance judgements when they are likely to disagree with each other. We found that assessors judge low consensus documents, i.e., documents known to produce assessor disagreement, with almost as much certainty as high consensus documents. In particular, assessor self-reported uncertainty is predictive of disagreement only for high consensus documents and not for low consensus documents.


Published in
CHIIR '16: Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval
March 2016, 400 pages
ISBN: 9781450337519
DOI: 10.1145/2854946
Copyright © 2016 ACM


Publisher
Association for Computing Machinery, New York, NY, United States




Acceptance Rates
CHIIR '16 paper acceptance rate: 23 of 58 submissions (40%). Overall acceptance rate: 55 of 163 submissions (34%).
