DOI: 10.1145/2065003.2065020
poster

Optimizing the cost of information retrieval test collections

Published: 28 October 2011

Abstract

We consider the problem of optimally allocating limited resources to construct relevance judgements for a test collection that facilitates reliable evaluation of retrieval systems. We assume that there is a large set of test queries, for each of which a large number of documents need to be judged, though the available budget only permits judging a subset of them. A candidate solution to this problem has to deal with at least three challenges. (i) Given a fixed budget, it has to efficiently select a subset of query-document pairs for acquiring relevance judgements. (ii) With the collected relevance judgements, it has to not only accurately evaluate the set of systems that participate in the construction of the test collection but also reliably assess the performance of new, as yet unseen systems. (iii) Finally, it has to properly deal with uncertainty due to (a) the presence of unjudged documents in a ranked list, (b) the presence of queries with no relevance judgements, and (c) errors made by human assessors when labelling documents. In this thesis we propose an optimisation framework that accommodates appropriate solutions for each of the three challenges. Our approach is intended to benefit the construction of IR test collections by research institutes, e.g. NIST, or by commercial search engines, e.g. Google and Bing, where large-scale document collections and extensive query logs are available but economic constraints prohibit gathering comprehensive relevance judgements.
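
The abstract does not spell out the optimisation framework itself (the author tags point to convex optimisation), so the following is only a minimal illustrative sketch of the budget-constrained selection step in challenge (i): greedily picking query-document pairs by an estimated benefit score until the judging budget is spent. Every name here (select_pairs_greedy, expected_benefit, the toy scores) is a hypothetical assumption and is not taken from the thesis.

# Illustrative sketch only: a greedy baseline for budget-constrained selection
# of query-document pairs to judge. The thesis proposes a convex optimisation
# framework whose details are not given in the abstract; every name and the
# benefit scores below are hypothetical assumptions.

from typing import Callable, Iterable

def select_pairs_greedy(
    candidates: Iterable[tuple[str, str]],          # (query_id, doc_id) pairs
    expected_benefit: Callable[[str, str], float],  # hypothetical utility estimate
    cost_per_judgement: float,
    budget: float,
) -> list[tuple[str, str]]:
    """Pick query-document pairs in decreasing order of estimated benefit
    until the judging budget is exhausted."""
    ranked = sorted(candidates, key=lambda qd: expected_benefit(*qd), reverse=True)
    selected, spent = [], 0.0
    for query_id, doc_id in ranked:
        if spent + cost_per_judgement > budget:
            break
        selected.append((query_id, doc_id))
        spent += cost_per_judgement
    return selected

if __name__ == "__main__":
    # Toy example: four candidate pairs, budget for three judgements.
    pairs = [("q1", "d1"), ("q1", "d2"), ("q2", "d3"), ("q2", "d4")]
    toy_benefit = {("q1", "d1"): 0.9, ("q1", "d2"): 0.4,
                   ("q2", "d3"): 0.7, ("q2", "d4"): 0.2}
    chosen = select_pairs_greedy(pairs, lambda q, d: toy_benefit[(q, d)],
                                 cost_per_judgement=1.0, budget=3.0)
    print(chosen)  # [('q1', 'd1'), ('q2', 'd3'), ('q1', 'd2')]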


Cited By

  • (2016) "A Short Survey on Online and Offline Methods for Search Quality Evaluation," Information Retrieval, pp. 38-87. DOI: 10.1007/978-3-319-41718-9_3. Online publication date: 26-Jul-2016.
  • (2015) "New Research Directions in Knowledge Discovery and Allied Spheres," ACM SIGKDD Explorations Newsletter, 16(2), pp. 46-49. DOI: 10.1145/2783702.2783708. Online publication date: 21-May-2015.
  • (2011) "PIKM 2011," Proceedings of the 20th ACM international conference on Information and knowledge management, pp. 2633-2634. DOI: 10.1145/2063576.2064049. Online publication date: 24-Oct-2011.



      Published In

      PIKM '11: Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management
      October 2011
      100 pages
ISBN: 9781450309530
DOI: 10.1145/2065003


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 28 October 2011


      Author Tags

      1. convex optimisation
      2. evaluation
      3. resource allocation
      4. test collection

      Qualifiers

      • Poster

      Conference

      CIKM '11

      Acceptance Rates

      Overall Acceptance Rate 25 of 62 submissions, 40%


