CIKM '20 · short paper · DOI: 10.1145/3340531.3412121

Building Test Collections using Bandit Techniques: A Reproducibility Study

Published: 19 October 2020

ABSTRACT

The high cost of constructing test collections has led many researchers to develop intelligent document selection methods that find relevant documents with fewer judgments than the standard pooling method requires. In this paper, we conduct a comprehensive set of experiments evaluating six bandit-based document selection methods in terms of evaluation reliability, fairness, and reusability of the resultant test collections. In our experiments, the best-performing method varies across test collections, showing the importance of using diverse test collections for an accurate performance analysis. Our experiments with six test collections also show that Move-To-Front is the most robust of the methods we investigate.
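To make the adjudication idea concrete, below is a minimal sketch of Move-To-Front document selection in the spirit of Cormack et al. (1998): runs take turns contributing their top unjudged document, a run keeps priority while its documents are judged relevant, and is demoted to the back of the queue on a non-relevant judgment. The function name, the dictionary-based run/qrels representation, and the oracle `qrels` lookup (standing in for a human assessor) are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of Move-To-Front (MTF) adjudication: not the paper's code,
# just an illustration of the selection policy under a judgment budget.
from collections import deque


def move_to_front(runs, qrels, budget):
    """Select up to `budget` documents to judge.

    runs   -- dict: run_id -> ranked list of doc ids (insertion order
              sets the initial priority queue)
    qrels  -- dict: doc_id -> bool (oracle relevance; an assumption
              standing in for a human assessor, missing docs = False)
    Returns the list of (doc_id, relevant) judgments made, in order.
    """
    queue = deque(runs.keys())           # run priority order
    position = {r: 0 for r in runs}      # next unjudged rank per run
    judged, judgments = set(), []

    while queue and len(judgments) < budget:
        run = queue[0]
        # skip documents this run shares with already-judged pool entries
        while position[run] < len(runs[run]) and runs[run][position[run]] in judged:
            position[run] += 1
        if position[run] >= len(runs[run]):   # run exhausted: drop it
            queue.popleft()
            continue
        doc = runs[run][position[run]]
        position[run] += 1
        judged.add(doc)
        relevant = qrels.get(doc, False)
        judgments.append((doc, relevant))
        if not relevant:                      # demote run to back of queue
            queue.popleft()
            queue.append(run)
    return judgments
```

The key contrast with standard depth-k pooling is visible here: instead of judging every run's top k documents, the budget is spent preferentially on runs that keep surfacing relevant documents.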

Supplemental Material: 3340531.3412121.mp4 (15.6 MB)


Published in: CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, October 2020, 3,619 pages. ISBN: 9781450368599. DOI: 10.1145/3340531.

      Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 1,861 of 8,427 submissions, 22%
