DOI: 10.1145/2983323.2983694
research-article

Optimizing Nugget Annotations with Active Learning

Published: 24 October 2016

ABSTRACT

Nugget-based evaluations, such as those deployed in the TREC Temporal Summarization and Question Answering tracks, require human assessors to determine whether a nugget is present in a given piece of text. This process, known as nugget annotation, is labor-intensive. In this paper, we present two active learning techniques that prioritize the sequence in which candidate nugget/sentence pairs are presented to an assessor, based on the likelihood that the sentence contains a nugget. Our approach builds on the recognition that nugget annotation is similar to high-recall retrieval, and we adapt proven existing solutions. Simulation experiments with four existing TREC test collections show that our techniques yield far more matches for a given level of effort than baselines that are typically deployed in previous nugget-based evaluations.
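
The prioritization idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function (term overlap plus weights learned from sentences already judged as matches) is a hypothetical stand-in for a trained classifier, and the names `score`, `prioritized_annotation`, and `assess`, as well as the toy sentences, are assumptions made for illustration only. The loop shows the general pattern of presenting the most likely nugget/sentence pair first, recording the assessor's judgment, and re-ranking the remaining pairs.

```python
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    stripped = (t.strip(".,;:!?").lower() for t in text.split())
    return [t for t in stripped if t]


def score(nugget, sentence, learned):
    """Hypothetical match likelihood: overlap with the nugget text plus
    weights learned from sentences already judged to contain the nugget."""
    sent = set(tokens(sentence))
    overlap = len(sent & set(tokens(nugget)))
    bonus = sum(learned[nugget][t] for t in sent)
    return overlap + 0.5 * bonus


def prioritized_annotation(pairs, assess, budget):
    """Repeatedly present the highest-scoring unjudged pair to the assessor,
    record the judgment, and update the model before choosing the next pair."""
    learned = {nugget: Counter() for nugget, _ in pairs}
    remaining = list(pairs)
    judged = []
    while remaining and len(judged) < budget:
        best = max(remaining, key=lambda p: score(p[0], p[1], learned))
        remaining.remove(best)
        label = assess(*best)  # human judgment; simulated below
        judged.append((best, label))
        if label:
            # Terms from confirmed matches raise the score of similar sentences.
            learned[best[0]].update(set(tokens(best[1])))
    return judged


# Toy data (invented for this sketch, not taken from any TREC collection).
NUGGET = "bridge collapsed in the storm"
SENTENCES = [
    "Officials held a press conference today.",
    "The bridge collapsed during the storm.",
    "The span gave way during heavy rain.",
    "Traffic was normal on Monday.",
]
MATCHES = {SENTENCES[1], SENTENCES[2]}  # simulated assessor judgments

pairs = [(NUGGET, s) for s in SENTENCES]
result = prioritized_annotation(pairs, lambda n, s: s in MATCHES, budget=2)
found = sum(1 for _, label in result if label)
print(found, "matches found in", len(result), "judgments")
```

With a budget of two judgments, presenting the toy pairs in their original order would reach only one of the two matching sentences, whereas the prioritized loop reaches the second match after learning from the first.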

    • Published in

      CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
      October 2016
      2566 pages
      ISBN:9781450340731
      DOI:10.1145/2983323

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      CIKM '16 Paper Acceptance Rate: 160 of 701 submissions, 23%. Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%.
