DOI: 10.1145/1571941.1572019
Research article

Towards methods for the collective gathering and quality control of relevance assessments

Published: 19 July 2009

ABSTRACT

Growing interest in online collections of digital books and video content motivates the development and optimization of adequate retrieval systems. However, traditional methods for collecting relevance assessments to tune system performance are challenged by the nature of digital items in such collections, where assessors are faced with a considerable effort to review and assess content by extensive reading, browsing, and within-document searching. The extra strain is caused by the length and cohesion of the digital item and the dispersion of topics within it. We propose a method for the collective gathering of relevance assessments using a social game model to instigate participants' engagement. The game provides incentives for assessors to follow a predefined review procedure and makes provisions for the quality control of the collected relevance judgments. We discuss the approach in detail, and present the results of a pilot study conducted on a book corpus to validate the approach. Our analysis reveals intricate relationships between the affordances of the system, the incentives of the social game, and the behavior of the assessors. We show that the proposed game design achieves two designated goals: the incentive structure motivates endurance in assessors and the review process encourages truthful assessment.
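The abstract refers to quality control of the collected judgments but does not describe the mechanism on this page. The sketch below is purely illustrative: it shows one common pattern for crowdsourced relevance assessment, majority-vote aggregation combined with screening assessors against a small set of seeded "gold" judgments. The function names, threshold, and data layout are assumptions for the sake of the example, not the procedure proposed in the paper.

```python
# Illustrative sketch only: majority-vote aggregation of relevance judgments
# with quality control via seeded gold items. All names are hypothetical and
# this is not necessarily the mechanism used in the paper's social game.
from collections import Counter, defaultdict


def assessor_accuracy(judgments, gold):
    """Fraction of an assessor's judgments that agree with the gold labels."""
    scored = [(item, label) for item, label in judgments.items() if item in gold]
    if not scored:
        return 0.0
    return sum(label == gold[item] for item, label in scored) / len(scored)


def aggregate(all_judgments, gold, min_accuracy=0.7):
    """Majority vote over judgments from assessors who pass the gold check.

    all_judgments: {assessor_id: {item_id: relevance_label}}
    gold: {item_id: known_relevance_label} seeded into the assessment task
    """
    votes = defaultdict(Counter)
    for assessor, judgments in all_judgments.items():
        if assessor_accuracy(judgments, gold) < min_accuracy:
            continue  # drop assessors who fail the quality-control check
        for item, label in judgments.items():
            if item not in gold:  # gold items are used for screening only
                votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}


if __name__ == "__main__":
    gold = {"page-3": "relevant"}
    judgments = {
        "a1": {"page-1": "relevant", "page-3": "relevant"},
        "a2": {"page-1": "irrelevant", "page-3": "irrelevant"},  # fails gold check
        "a3": {"page-1": "relevant", "page-3": "relevant"},
    }
    print(aggregate(judgments, gold))  # {'page-1': 'relevant'}
```

Seeding known-relevance items and filtering assessors on agreement is a widely used safeguard in game-with-a-purpose and crowdsourcing settings; the paper's own review procedure and incentive design may differ substantially.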


Published in

SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
July 2009
896 pages
ISBN: 9781605584836
DOI: 10.1145/1571941

Copyright © 2009 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
