DOI: 10.1145/2600428.2609534
SIGIR '14 poster

The effect of expanding relevance judgements with duplicates

Published: 03 July 2014

Abstract

We examine the effects of expanding a judged set of sentences with their exact duplicates from a corpus. Including new sentences that are exact duplicates of the previously judged sentences may allow for better estimation of performance metrics and enhance the reusability of a test collection. We perform experiments in the context of the Temporal Summarization Track at TREC 2013. We find that adding duplicate sentences to the judged set does not significantly affect relative system performance. However, we do find statistically significant changes in the performance of nearly half the systems that participated in the Track. We therefore recommend adding exact duplicate sentences to the set of relevance judgements to obtain a more accurate estimate of system performance.
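The expansion step the abstract describes can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the data layout (sentence-ID dictionaries), the labels, and the choice of hashing exact text to detect duplicates are all assumptions.

```python
from hashlib import sha256

def expand_judgements(judged, corpus):
    """Propagate relevance labels from judged sentences to their
    exact duplicates elsewhere in the corpus.

    judged: dict mapping sentence_id -> (text, label)
    corpus: dict mapping sentence_id -> text
    Returns a dict sentence_id -> label covering the judged
    sentences plus every exact duplicate found in the corpus.
    """
    # Index the judged sentences by a hash of their exact text.
    label_by_hash = {}
    for sid, (text, label) in judged.items():
        label_by_hash[sha256(text.encode()).hexdigest()] = label

    # Start from the original judgements, then scan the corpus
    # for unjudged sentences whose text matches a judged one.
    expanded = {sid: label for sid, (_, label) in judged.items()}
    for sid, text in corpus.items():
        if sid in expanded:
            continue
        h = sha256(text.encode()).hexdigest()
        if h in label_by_hash:
            expanded[sid] = label_by_hash[h]  # inherit the duplicate's label
    return expanded
```

For example, if sentence `s2` in the corpus is character-for-character identical to a judged relevant sentence `s1`, `s2` inherits `s1`'s label, while non-duplicate sentences remain unjudged.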




    Published In

    SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
    July 2014
    1330 pages
    ISBN:9781450322577
    DOI:10.1145/2600428


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. duplicate detection
    2. evaluation
    3. pooling

    Qualifiers

    • Poster

    Conference

    SIGIR '14

    Acceptance Rates

    SIGIR '14 paper acceptance rate: 82 of 387 submissions (21%)
    Overall acceptance rate: 792 of 3,983 submissions (20%)

    Article Metrics

    • Downloads (Last 12 months)1
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 02 Mar 2025


    Cited By

    • (2019) On enhancing the robustness of timeline summarization test collections. Information Processing and Management 56(5):1815-1836. DOI: 10.1016/j.ipm.2019.02.006
    • (2017) EveTAR: building a large-scale multi-task test collection over Arabic tweets. Information Retrieval Journal 21(4):307-336. DOI: 10.1007/s10791-017-9325-7
    • (2016) Optimizing Nugget Annotations with Active Learning. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, 2359-2364. DOI: 10.1145/2983323.2983694
    • (2016) A Study of Realtime Summarization Metrics. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, 2125-2130. DOI: 10.1145/2983323.2983653
    • (2015) Online News Tracking for Ad-Hoc Information Needs. Proceedings of the 2015 International Conference on the Theory of Information Retrieval, 221-230. DOI: 10.1145/2808194.2809474
    • (2015) Evaluating Streams of Evolving News Events. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 675-684. DOI: 10.1145/2766462.2767751
    • (2015) Entity-Centric Stream Filtering and Ranking: Filtering and Unfilterable Documents. Advances in Information Retrieval, 303-314. DOI: 10.1007/978-3-319-16354-3_33
